Hi, I have finished working on refactoring TsFile storage on HDFS. Here is my PR [1].
TsFile storage on different file systems is implemented with the factory pattern. When IoTDB starts, `FSFactoryProducer` produces different factories according to the user configuration: `LocalFSFactory`, `LocalFSInputFactory` and `LocalFSOutputFactory` for the local file system, and `HDFSFactory`, `HDFSInputFactory` and `HDFSOutputFactory` for HDFS.

From now on, the TsFile module will not depend on the Hadoop libs. If you want to store TsFiles and related data files in HDFS, here are the steps:

1. Build the server and Hadoop modules with: `mvn clean package -pl server,hadoop -am -Dmaven.test.skip=true`
2. Copy the target jar of the Hadoop module, `hadoop-tsfile-0.9.0-SNAPSHOT-jar-with-dependencies.jar`, into the server target lib folder `.../server/target/iotdb-server-0.9.0-SNAPSHOT/lib`.
3. Edit the user configuration in `iotdb-engine.properties`.

Start the server, and TsFiles will be stored on HDFS.

If you'd like to switch the storage file system back to local, just set `tsfile_storage_fs` to `LOCAL`. In that case, if you already have data files on HDFS, you should either download them and move them to your configured data file folder (`../server/target/iotdb-server-0.9.0-SNAPSHOT/data/data` by default), or restart your process and import the data into IoTDB.

After the change to the document structure is finished and merged into master, I will add a detailed User Guide about the shared storage architecture to the documentation (here is the draft: [2] for English and [3] for Chinese). If you have any suggestions or ideas, please discuss them with me.

P.S. Merging PR [1] may cause conflicts in other recent PRs because of the refactoring. If you run into conflicts in your PR, I'll be very glad to help you resolve them.
:)

[1] https://github.com/apache/incubator-iotdb/pull/417
[2] https://github.com/samperson1997/incubator-iotdb/blob/6a8b8188c29e158d35652ddf14fdb5267329606c/docs/Documentation/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md
[3] https://github.com/samperson1997/incubator-iotdb/blob/hdfs_doc/docs/Documentation-CHN/UserGuide/10-Distributed%20Architecture/1-TsFile%20Storage.md

Best,
-----------------------------------
Zesong Sun
School of Software, Tsinghua University

孙泽嵩
清华大学 软件学院

> On Sep 22, 2019, at 18:23, Zesong Sun (Jira) <[email protected]> wrote:
>
> Zesong Sun created IOTDB-234:
> --------------------------------
>
>     Summary: Refactor TsFile storage on HDFS
>     Key: IOTDB-234
>     URL: https://issues.apache.org/jira/browse/IOTDB-234
>     Project: Apache IoTDB
>     Issue Type: Improvement
>     Reporter: Zesong Sun
>
>
> Refactor the code for TsFile storage on HDFS:
> * Extract the FileSystem factories into Hadoop module from TSFile module, so
>   that TSFile module will not depend on Hadoop libs.
> * Use Java Reflection to get FileSystem factories in TSFile module.
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
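
For illustration, here is a minimal Java sketch of how a factory can be chosen from configuration and loaded by reflection, so that the calling code has no compile-time dependency on the Hadoop-specific class. This is not the code in the PR: `FsFactorySketch`, `FileFactory`, `produce`, and the two nested factory classes are hypothetical stand-ins, and they sit in one file only so that the example runs on its own.

```java
import java.io.File;

public class FsFactorySketch {

    /** Hypothetical factory interface; the real IoTDB interfaces may differ. */
    public interface FileFactory {
        File getFile(String path);
        String scheme();
    }

    /** Stand-in for a local file system factory. */
    public static class LocalFileFactory implements FileFactory {
        @Override public File getFile(String path) { return new File(path); }
        @Override public String scheme() { return "file"; }
    }

    /** Stand-in for a factory that would be shipped in a separate Hadoop jar. */
    public static class HdfsFileFactory implements FileFactory {
        @Override public File getFile(String path) {
            // A real implementation would return an HDFS-backed file object.
            throw new UnsupportedOperationException("HDFS access is not part of this sketch");
        }
        @Override public String scheme() { return "hdfs"; }
    }

    /** Picks a factory class by name from the configured value and loads it reflectively. */
    public static FileFactory produce(String storageFs) {
        String nestedName;
        switch (storageFs) {
            case "LOCAL": nestedName = "LocalFileFactory"; break;
            case "HDFS":  nestedName = "HdfsFileFactory";  break;
            default: throw new IllegalArgumentException("Unknown storage file system: " + storageFs);
        }
        // Assumption: in a real setup this would be the fully qualified name of a
        // class that only exists on the classpath when the extra jar is in lib/.
        String className = FsFactorySketch.class.getName() + "$" + nestedName;
        try {
            return (FileFactory) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Cannot load " + className + "; is the jar on the classpath?", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(produce("LOCAL").scheme());
        System.out.println(produce("HDFS").scheme());
    }
}
```

Because the class is only referenced by name, a missing jar shows up at startup as a load failure rather than as a build-time dependency of the calling module.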
