[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377524587 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -254,64 +260,51 @@ public void endFile(Schema schema) throws IOException { // close file out.close(); canWrite = false; +timeseriesMetadataMap = new TreeMap<>(); logger.info("output stream is closed"); } /** - * 1. group chunkGroupMetaDataList to TsDeviceMetadata 2. flush TsDeviceMetadata 3. get - * TsDeviceMetadataIndex - * - * @param chunkGroupMetaDataList all chunk group metadata in memory - * @return TsDeviceMetadataIndex in TsFileMetaData + * @return tsOffsets in TsFileMetaData Review comment: Please add the java doc. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377526509 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -254,64 +260,51 @@ public void endFile(Schema schema) throws IOException { // close file out.close(); canWrite = false; +timeseriesMetadataMap = new TreeMap<>(); logger.info("output stream is closed"); } /** - * 1. group chunkGroupMetaDataList to TsDeviceMetadata 2. flush TsDeviceMetadata 3. get - * TsDeviceMetadataIndex - * - * @param chunkGroupMetaDataList all chunk group metadata in memory - * @return TsDeviceMetadataIndex in TsFileMetaData + * @return tsOffsets in TsFileMetaData */ - private Map flushTsDeviceMetaDataAndGetIndex( - List chunkGroupMetaDataList) throws IOException { - -Map tsDeviceMetadataIndexMap = new HashMap<>(); - -long offset; /* offset for the flushing TsDeviceMetadata */ - -TsDeviceMetadata currentTsDeviceMetadata; - -// flush TsDeviceMetadata by string order of deviceId -for (Map.Entry entry : getAllTsDeviceMetadata(chunkGroupMetaDataList) -.entrySet()) { - // update statistics in TsDeviceMetadata - currentTsDeviceMetadata = entry.getValue(); - - // flush tsChunkGroupBlockMetaData - offset = out.getPosition(); - int size = currentTsDeviceMetadata.serializeTo(out.wrapAsStream()); - - TsDeviceMetadataIndex tsDeviceMetadataIndex = new TsDeviceMetadataIndex(offset, size, - currentTsDeviceMetadata); - tsDeviceMetadataIndexMap.put(entry.getKey(), tsDeviceMetadataIndex); + private long[] flushAllChunkMetadataList() throws IOException { +if (timeseriesMetadataMap.size() == 0) { + return new long[0]; } - -return tsDeviceMetadataIndexMap; - } - - /** - * group all chunk group metadata by device. - * - * @param chunkGroupMetaDataList all chunk group metadata - * @return TsDeviceMetadata of all devices - */ - private TreeMap getAllTsDeviceMetadata( - List chunkGroupMetaDataList) { -String currentDevice; -TreeMap tsDeviceMetadataMap = new TreeMap<>(); - -for (ChunkGroupMetaData chunkGroupMetaData : chunkGroupMetaDataList) { - currentDevice = chunkGroupMetaData.getDeviceID(); - - if (!tsDeviceMetadataMap.containsKey(currentDevice)) { -TsDeviceMetadata tsDeviceMetadata = new TsDeviceMetadata(); -tsDeviceMetadataMap.put(currentDevice, tsDeviceMetadata); +// convert ChunkMetadataList to this field +long[] tsOffsets = new long[timeseriesMetadataMap.size() + 1]; +List tsMetadataList = new ArrayList<>(); +// flush timeseriesMetadataList one by one +int i = 0; +for (Map.Entry> entry : timeseriesMetadataMap.entrySet()) { + Path path = entry.getKey(); + String deviceId = path.getDevice(); + if (!deviceOffsetsMap.containsKey(deviceId)) { +deviceOffsetsMap.put(deviceId, new int[2]); +deviceOffsetsMap.get(deviceId)[0] = i; + } Review comment: Please use the following codes instead to reduce the operation and improve the efficiency. ``` deviceOffsetsMap.computeIfAbsent(deviceId, k -> new int[2])[0] = i; ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377527825 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -202,13 +204,19 @@ public void writeChunk(Chunk chunk, ChunkMetaData chunkMetadata) throws IOExcept * end chunk and write some log. */ public void endCurrentChunk() { -currentChunkGroupMetaData.addTimeSeriesChunkMetaData(currentChunkMetaData); +chunkMetaDataList.add(currentChunkMetaData); +Path path = new Path(deviceId, currentChunkMetaData.getMeasurementUid()); +List chunkMetaDataListOfOnePath = timeseriesMetadataMap.getOrDefault(path, +new ArrayList()); +chunkMetaDataListOfOnePath.add(currentChunkMetaData); +timeseriesMetadataMap.put(path, chunkMetaDataListOfOnePath); Review comment: Please use the following codes instead to reduce the operation and improve the efficiency. ``` timeseriesMetadataMap.computeIfAbsent(path, k -> new ArrayList<>()).add(currentChunkMetaData); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377533688 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/file/metadata/TimeseriesMetaData.java ## @@ -0,0 +1,117 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iotdb.tsfile.file.metadata; + +import java.io.IOException; +import java.io.OutputStream; +import java.nio.ByteBuffer; +import java.util.ArrayList; +import java.util.List; + +import org.apache.iotdb.tsfile.utils.ReadWriteIOUtils; + +public class TimeseriesMetaData { + + private long startOffsetOfChunkMetaDataList; + private int chunkMetaDataListDataSize; + private int numOfChunkMetaDatas; + + private String measurementId; + private List chunkMetaDataList = new ArrayList<>(); + + public TimeseriesMetaData() { + + } + public TimeseriesMetaData(String measurementId, List chunkMetaDataList) { +this.measurementId = measurementId; +this.chunkMetaDataList = chunkMetaDataList; +this.numOfChunkMetaDatas = chunkMetaDataList.size(); + } Review comment: If the constructor is useless, please remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377522265 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -69,14 +67,19 @@ } protected TsFileOutput out; - protected List chunkGroupMetaDataList = new ArrayList<>(); protected boolean canWrite = true; protected int totalChunkNum = 0; protected int invalidChunkNum; protected File file; - private ChunkGroupMetaData currentChunkGroupMetaData; + protected List> chunkGroupMetaDataList = new ArrayList<>(); + protected List deviceList = new ArrayList<>(); + protected List chunkMetaDataList = new ArrayList<>(); + private static Map> timeseriesMetadataMap = new TreeMap<>(); private ChunkMetaData currentChunkMetaData; private long markedPosition; + private static Map deviceOffsetsMap = new HashMap<>(); Review comment: Please remove `static`. The reason is the same as above. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377523969 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -254,64 +260,51 @@ public void endFile(Schema schema) throws IOException { // close file out.close(); canWrite = false; +timeseriesMetadataMap = new TreeMap<>(); logger.info("output stream is closed"); } /** - * 1. group chunkGroupMetaDataList to TsDeviceMetadata 2. flush TsDeviceMetadata 3. get - * TsDeviceMetadataIndex - * - * @param chunkGroupMetaDataList all chunk group metadata in memory - * @return TsDeviceMetadataIndex in TsFileMetaData + * @return tsOffsets in TsFileMetaData */ - private Map flushTsDeviceMetaDataAndGetIndex( - List chunkGroupMetaDataList) throws IOException { - -Map tsDeviceMetadataIndexMap = new HashMap<>(); - -long offset; /* offset for the flushing TsDeviceMetadata */ - -TsDeviceMetadata currentTsDeviceMetadata; - -// flush TsDeviceMetadata by string order of deviceId -for (Map.Entry entry : getAllTsDeviceMetadata(chunkGroupMetaDataList) -.entrySet()) { - // update statistics in TsDeviceMetadata - currentTsDeviceMetadata = entry.getValue(); - - // flush tsChunkGroupBlockMetaData - offset = out.getPosition(); - int size = currentTsDeviceMetadata.serializeTo(out.wrapAsStream()); - - TsDeviceMetadataIndex tsDeviceMetadataIndex = new TsDeviceMetadataIndex(offset, size, - currentTsDeviceMetadata); - tsDeviceMetadataIndexMap.put(entry.getKey(), tsDeviceMetadataIndex); + private long[] flushAllChunkMetadataList() throws IOException { +if (timeseriesMetadataMap.size() == 0) { + return new long[0]; } - -return tsDeviceMetadataIndexMap; - } - - /** - * group all chunk group metadata by device. - * - * @param chunkGroupMetaDataList all chunk group metadata - * @return TsDeviceMetadata of all devices - */ - private TreeMap getAllTsDeviceMetadata( - List chunkGroupMetaDataList) { -String currentDevice; -TreeMap tsDeviceMetadataMap = new TreeMap<>(); - -for (ChunkGroupMetaData chunkGroupMetaData : chunkGroupMetaDataList) { - currentDevice = chunkGroupMetaData.getDeviceID(); - - if (!tsDeviceMetadataMap.containsKey(currentDevice)) { -TsDeviceMetadata tsDeviceMetadata = new TsDeviceMetadata(); -tsDeviceMetadataMap.put(currentDevice, tsDeviceMetadata); +// convert ChunkMetadataList to this field +long[] tsOffsets = new long[timeseriesMetadataMap.size() + 1]; +List tsMetadataList = new ArrayList<>(); +// flush timeseriesMetadataList one by one +int i = 0; +for (Map.Entry> entry : timeseriesMetadataMap.entrySet()) { + Path path = entry.getKey(); + String deviceId = path.getDevice(); + if (!deviceOffsetsMap.containsKey(deviceId)) { +deviceOffsetsMap.put(deviceId, new int[2]); +deviceOffsetsMap.get(deviceId)[0] = i; + } + deviceOffsetsMap.get(deviceId)[1] = i + 1; + TimeseriesMetaData tsMetaData = new TimeseriesMetaData(); + + tsMetaData.setMeasurementId(path.getMeasurement()); + tsMetaData.setOffsetOfChunkMetaDataList(out.getPosition()); + int chunkMetadataSize = 0; + for (ChunkMetaData chunkMetadata : entry.getValue()) { +chunkMetadataSize += chunkMetadata.serializeTo(out.wrapAsStream()); } - tsDeviceMetadataMap.get(currentDevice).addChunkGroupMetaData(chunkGroupMetaData); + tsMetaData.setDataSizeOfChunkMetaDataList(chunkMetadataSize); + tsMetaData.setNumOfChunkMetaDatas(entry.getValue().size()); + tsMetadataList.add(tsMetaData); + i++; } -return tsDeviceMetadataMap; + +for (i = 0; i < tsMetadataList.size(); i++) { + tsOffsets[i] = out.getPosition(); + int size = tsMetadataList.get(i).serializeTo(out.wrapAsStream()); + tsOffsets[i+1] = tsOffsets[i] + size; +} Review comment: Please remove useless assignment and addition operations. ``` if(i > 0){ tsOffsets[0] = out.getPosition(); } for (i = 0; i < tsMetadataList.size(); i++) { int size = tsMetadataList.get(i).serializeTo(out.wrapAsStream()); tsOffsets[i+1] = out.getPosition(); } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377529080 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -129,54 +131,53 @@ protected void startFile() throws IOException { * @param deviceId device id */ public void startChunkGroup(String deviceId) throws IOException { +this.deviceId = deviceId; +currentChunkGroupStartOffset = out.getPosition(); logger.debug("start chunk group:{}, file position {}", deviceId, out.getPosition()); -currentChunkGroupMetaData = new ChunkGroupMetaData(deviceId, new ArrayList<>(), -out.getPosition()); +chunkMetaDataList = new ArrayList<>(); } /** * end chunk and write some log. * If there is no data in the chunk group, nothing will be flushed. */ public void endChunkGroup(long version) throws IOException { Review comment: If `version` is useless, please remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377531010 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/chunk/IChunkGroupWriter.java ## @@ -25,7 +25,7 @@ import org.apache.iotdb.tsfile.file.footer.ChunkGroupFooter; import org.apache.iotdb.tsfile.write.record.RowBatch; import org.apache.iotdb.tsfile.write.record.datapoint.DataPoint; -import org.apache.iotdb.tsfile.write.schema.MeasurementSchema; +import org.apache.iotdb.tsfile.write.schema.TimeseriesSchema; Review comment: Please remove useless import. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377521432 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/writer/TsFileIOWriter.java ## @@ -69,14 +67,19 @@ } protected TsFileOutput out; - protected List chunkGroupMetaDataList = new ArrayList<>(); protected boolean canWrite = true; protected int totalChunkNum = 0; protected int invalidChunkNum; protected File file; - private ChunkGroupMetaData currentChunkGroupMetaData; + protected List> chunkGroupMetaDataList = new ArrayList<>(); + protected List deviceList = new ArrayList<>(); + protected List chunkMetaDataList = new ArrayList<>(); + private static Map> timeseriesMetadataMap = new TreeMap<>(); Review comment: Why add `static`? I don't think all `TsFileIOWriter` instances should share this variable, which will lead to serious bugs. Please remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-iotdb] fanhualta commented on a change in pull request #736: Refactor TsFile
fanhualta commented on a change in pull request #736: Refactor TsFile URL: https://github.com/apache/incubator-iotdb/pull/736#discussion_r377530666 ## File path: tsfile/src/main/java/org/apache/iotdb/tsfile/write/chunk/ChunkGroupWriterImpl.java ## @@ -31,7 +31,7 @@ import org.apache.iotdb.tsfile.utils.Binary; import org.apache.iotdb.tsfile.write.record.RowBatch; import org.apache.iotdb.tsfile.write.record.datapoint.DataPoint; -import org.apache.iotdb.tsfile.write.schema.MeasurementSchema; +import org.apache.iotdb.tsfile.write.schema.TimeseriesSchema; Review comment: Please remove useless import. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services