jt2594838 commented on code in PR #13168:
URL: https://github.com/apache/iotdb/pull/13168#discussion_r1716461197
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tsfile/container/scan/TsFileInsertionScanDataContainer.java:
##########
@@ -332,35 +331,54 @@ private void moveToNextChunkReader() throws IOException,
IllegalStateException {
return;
case MetaMarker.VALUE_CHUNK_HEADER:
case MetaMarker.ONLY_ONE_PAGE_VALUE_CHUNK_HEADER:
- chunkHeader = tsFileSequenceReader.readChunkHeader(marker);
+ if (Objects.isNull(firstChunkHeader4NextSequentialValueChunks)) {
+ chunkHeader = tsFileSequenceReader.readChunkHeader(marker);
- if (Objects.isNull(currentDevice)
- || !pattern.matchesMeasurement(currentDevice,
chunkHeader.getMeasurementID())) {
- tsFileSequenceReader.position(
- tsFileSequenceReader.position() + chunkHeader.getDataSize());
- break;
- }
+ // Do not record empty chunks
+ if (chunkHeader.getDataSize() == 0) {
+ break;
+ }
Review Comment:
Actually, empty chunks SHOULD also be recorded. Consider the following case:
There are two measurements in an aligned series: s1 and s2, each of them has
two chunks.
They are grouped by 1 series 1 group, and then the result will be:
[TimeChunk1] [s1 Chunk1] [TimeChunk2] [s1 Chunk2] [s2 Chunk1] [s2 Chunk2].
Let us assume the [s2 Chunk1] to be empty. If you do not record it, the
associated time chunk of [s2 Chunk2] will be marked as [TimeChunk1], which is
incorrect.
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tsfile/container/scan/TsFileInsertionScanDataContainer.java:
##########
@@ -332,35 +331,54 @@ private void moveToNextChunkReader() throws IOException,
IllegalStateException {
return;
case MetaMarker.VALUE_CHUNK_HEADER:
case MetaMarker.ONLY_ONE_PAGE_VALUE_CHUNK_HEADER:
- chunkHeader = tsFileSequenceReader.readChunkHeader(marker);
+ if (Objects.isNull(firstChunkHeader4NextSequentialValueChunks)) {
+ chunkHeader = tsFileSequenceReader.readChunkHeader(marker);
- if (Objects.isNull(currentDevice)
- || !pattern.matchesMeasurement(currentDevice,
chunkHeader.getMeasurementID())) {
- tsFileSequenceReader.position(
- tsFileSequenceReader.position() + chunkHeader.getDataSize());
- break;
- }
+ // Do not record empty chunks
+ if (chunkHeader.getDataSize() == 0) {
+ break;
+ }
- // Do not record empty chunk
- if (chunkHeader.getDataSize() > 0) {
- valueChunkList.add(
- new Chunk(
- chunkHeader, tsFileSequenceReader.readChunk(-1,
chunkHeader.getDataSize())));
- currentMeasurements.add(
- new MeasurementSchema(chunkHeader.getMeasurementID(),
chunkHeader.getDataType()));
+ if (Objects.isNull(currentDevice)
+ || !pattern.matchesMeasurement(currentDevice,
chunkHeader.getMeasurementID())) {
+ tsFileSequenceReader.position(
+ tsFileSequenceReader.position() + chunkHeader.getDataSize());
+ break;
+ }
Review Comment:
Here too. Even if a chunk is filtered out, `measurementIndexMap` should also
be updated. Otherwise, you may link a value chunk to a wrong time chunk.
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tsfile/container/scan/TsFileInsertionScanDataContainer.java:
##########
@@ -373,14 +391,22 @@ private void moveToNextChunkReader() throws IOException,
IllegalStateException {
}
lastMarker = marker;
- if (Objects.nonNull(timeChunk) && !currentMeasurements.isEmpty()) {
+ if (!recordAlignedChunk(valueChunkList, marker)) {
+ chunkReader = null;
+ }
+ }
+
+ private boolean recordAlignedChunk(final List<Chunk> valueChunkList, final
byte marker)
+ throws IOException {
+ if (!valueChunkList.isEmpty()) {
chunkReader =
isMultiPage
- ? new AlignedChunkReader(timeChunk, valueChunkList, filter)
- : new AlignedSinglePageWholeChunkReader(timeChunk,
valueChunkList);
+ ? new AlignedChunkReader(timeChunkList.get(lastIndex),
valueChunkList, filter)
+ : new
AlignedSinglePageWholeChunkReader(timeChunkList.get(lastIndex), valueChunkList);
currentIsAligned = true;
- } else {
- chunkReader = null;
+ lastMarker = marker;
+ return true;
}
+ return false;
Review Comment:
The meaning of the return value is not clear enough. Some comments would
better be added to the return value of this method. And/Or:
```java
boolean shouldReturn;
// the previous block
return shouldReturn;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]