Hi, I have finished working on refactoring TsFile storage on HDFS. Here is my 
PR [1].

TsFile storage on different file systems is implemented with the factory 
pattern. When IoTDB starts, `FSFactoryProducer` produces different factories 
according to the user configuration: `LocalFSFactory`, `LocalFSInputFactory` and 
`LocalFSOutputFactory` for the local file system, and `HDFSFactory`, 
`HDFSInputFactory` and `HDFSOutputFactory` for HDFS.
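
To illustrate the idea, here is a minimal sketch of how a producer can pick a 
factory from configuration and load the HDFS one by reflection, so that the 
calling module has no compile-time dependency on Hadoop. It is not the code in 
the PR: the class names are borrowed from above, but the `FSType` enum, the 
`getFile` signature, the `produce` method and the `example.hadoop.HDFSFactory` 
class name are all assumptions made for illustration.

```java
import java.io.File;

// Illustrative sketch only; signatures and the HDFS class name are assumed.
enum FSType { LOCAL, HDFS }

// Common interface that every file-system factory implements.
interface FSFactory {
    File getFile(String pathname);
}

// Local implementation: plain java.io.File, no extra dependencies.
class LocalFSFactory implements FSFactory {
    @Override
    public File getFile(String pathname) {
        return new File(pathname);
    }
}

class FSFactoryProducer {
    // Hypothetical fully qualified name of the factory shipped in the Hadoop jar.
    static final String HDFS_FACTORY_CLASS = "example.hadoop.HDFSFactory";

    // Returns the factory matching the configured storage file system.
    static FSFactory produce(FSType type) {
        if (type == FSType.LOCAL) {
            return new LocalFSFactory();
        }
        try {
            // Loaded by name at runtime, so this module compiles without Hadoop.
            Class<?> clazz = Class.forName(HDFS_FACTORY_CLASS);
            return (FSFactory) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "HDFS factory not found on the classpath: " + HDFS_FACTORY_CLASS, e);
        }
    }
}
```

In this sketch, asking for `FSType.HDFS` without the Hadoop jar on the 
classpath fails with an `IllegalStateException`, which is why the jar has to be 
copied into the server lib folder before HDFS storage is enabled.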

From now on, the TsFile module no longer depends on the Hadoop libs. If you want 
to store TsFile and related data files on HDFS, follow these steps:

1. Build the server and Hadoop modules with:
`mvn clean package -pl server,hadoop -am -Dmaven.test.skip=true`

2. Copy the target jar of the Hadoop module, 
`hadoop-tsfile-0.9.0-SNAPSHOT-jar-with-dependencies.jar`, into the server's 
target lib folder `.../server/target/iotdb-server-0.9.0-SNAPSHOT/lib`.

3. Edit the user config in `iotdb-engine.properties`, then start the server. 
TsFiles will be stored on HDFS.

If you'd like to switch the storage file system back to local, just set 
`tsfile_storage_fs` to `LOCAL`. In that case, if you already have data files on 
HDFS, you should either download them and move them into your configured data 
file folder (`../server/target/iotdb-server-0.9.0-SNAPSHOT/data/data` by 
default), or restart your process and import the data into IoTDB.

Once the documentation restructuring is finished and merged into master, I 
will add a detailed User Guide on the shared storage architecture to the 
documentation (drafts: [2] for English and [3] for Chinese).

If you have any suggestions or ideas, please let me know.


P.S. Merging PR [1] may cause conflicts in other recent PRs because of the 
refactoring. If you run into conflicts in your PR, I'll be very glad to help 
you resolve them. :)


[1] https://github.com/apache/incubator-iotdb/pull/417
[2] https://github.com/samperson1997/incubator-iotdb/blob/6a8b8188c29e158d35652ddf14fdb5267329606c/docs/Documentation/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md
[3] https://github.com/samperson1997/incubator-iotdb/blob/hdfs_doc/docs/Documentation-CHN/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md


Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> On Sep 22, 2019, at 18:23, Zesong Sun (Jira) <[email protected]> wrote:
> 
> Zesong Sun created IOTDB-234:
> --------------------------------
> 
>             Summary: Refactor TsFile storage on HDFS
>                 Key: IOTDB-234
>                 URL: https://issues.apache.org/jira/browse/IOTDB-234
>             Project: Apache IoTDB
>          Issue Type: Improvement
>            Reporter: Zesong Sun
> 
> 
> Refactor TsFile storage on HDFS codes:
> * Extract the FileSystem factories into Hadoop module from TSFile module, so 
> that TSFile module will not depend on Hadoop libs.
> * Use Java Reflection to get FileSystem factories in TSFile module.
> 
> 
> 
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
