Re: [PR] Optimized the overall performance of IoTDB [iotdb]

via GitHub Thu, 21 May 2026 18:18:07 -0700


JackieTien97 commented on code in PR #17664:
URL: https://github.com/apache/iotdb/pull/17664#discussion_r3285239990



##########
iotdb-client/isession/src/main/java/org/apache/iotdb/isession/SessionDataSet.java:
##########
@@ -153,45 +153,42 @@ public boolean hasNext() throws 
StatementExecutionException, IoTDBConnectionExce
   }
 
   private RowRecord constructRowRecordFromValueArray() throws 
StatementExecutionException {
-    List<Field> outFields = new ArrayList<>();
-    for (int i = ioTDBRpcDataSet.getValueColumnStartIndex();
-        i < ioTDBRpcDataSet.getColumnSize();
-        i++) {
+    int valueColumnStartIndex = ioTDBRpcDataSet.getValueColumnStartIndex();
+    int columnSize = ioTDBRpcDataSet.getColumnSize();
+    List<Field> outFields = new ArrayList<>(columnSize - 
valueColumnStartIndex);
+    for (int columnIndex = valueColumnStartIndex + 1; columnIndex <= 
columnSize; columnIndex++) {

Review Comment:
   Same index-base change as the C++ client: the loop is now 
`valueColumnStartIndex + 1 .. columnSize` (inclusive) feeding 1-based 
`isNull(columnIndex)` / `getDataType(columnIndex)` / `getXxx(columnIndex)`. 
Please confirm these index-based getters are 1-based relative to the 
column-name list (i.e. `columnIndex == oldNameListIndex + 1`). An off-by-one is 
silent here — it reads an adjacent column rather than failing. A unit test 
exercising a multi-column row would lock this down.



##########
iotdb-client/client-cpp/src/main/SessionDataSet.cpp:
##########
@@ -246,38 +246,39 @@ const std::vector<std::string>& 
SessionDataSet::DataIterator::getColumnTypeList(
 
 shared_ptr<RowRecord> SessionDataSet::constructRowRecordFromValueArray() {
   std::vector<Field> outFields;
-  for (int i = iotdbRpcDataSet_->getValueColumnStartIndex(); i < 
iotdbRpcDataSet_->getColumnSize();
-       i++) {
+  const int32_t valueColumnStartIndex = 
iotdbRpcDataSet_->getValueColumnStartIndex();
+  const int32_t columnSize = iotdbRpcDataSet_->getColumnSize();
+  outFields.reserve(columnSize - valueColumnStartIndex);
+  for (int32_t columnIndex = valueColumnStartIndex + 1; columnIndex <= 
columnSize; ++columnIndex) {

Review Comment:
   The loop changed from 0-based name-list indexing (`[valueColumnStartIndex, 
columnSize)`) to 1-based index access (`[valueColumnStartIndex + 1, 
columnSize]`). This is only correct if `isNullByIndex` / `getXxxByIndex` are 
1-based such that `columnIndex == oldNameListIndex + 1`. Please double-check 
this against `IoTDBRpcDataSet` — an off-by-one here silently returns the wrong 
column's value with no error.
   
   Also note `getBooleanByIndex(...).value()` will throw `bad_optional_access` 
if the optional is empty. It is guarded by `isNullByIndex` here, so it should 
be fine, but the `isNull` guard and the optional must agree on null semantics.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/metadata/fetcher/cache/TableDeviceSchemaCache.java:
##########
@@ -482,7 +482,7 @@ void invalidateLastCache(final PartialPath devicePath, 
final String measurement)
           },
           cachedDeviceID -> {
             try {
-              return new PartialPath(cachedDeviceID).matchFullPath(devicePath);
+              return devicePath.matchFullPath(cachedDeviceID);

Review Comment:
   This swaps the `matchFullPath` direction: previously `new 
PartialPath(cachedDeviceID).matchFullPath(devicePath)` (cached entry as the 
pattern), now `devicePath.matchFullPath(cachedDeviceID)` (devicePath as the 
pattern). The new direction is correct and consistent with `invalidateCache` 
below.
   
   Note this also changes behavior when `devicePath` contains wildcards: the 
old form would have failed to match (a concrete path cannot match a wildcard 
path when used as the pattern), so entries that should have been invalidated 
were left stale. If that is an intended bug fix, great — worth calling out in 
the PR description and covering with a test.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/utils/ModificationUtils.java:
##########
@@ -82,7 +82,11 @@ public static void modifyChunkMetaData(
               if (range.contains(metaData.getStartTime(), 
metaData.getEndTime())) {
                 return true;
               } else {
-                if (range.overlaps(new TimeRange(metaData.getStartTime(), 
metaData.getEndTime()))) {
+                if (overlap(

Review Comment:
   Replacing `range.overlaps(new TimeRange(start, end))` with `overlap(start, 
end, range.getMin(), range.getMax())` avoids the allocation — good. Please make 
sure `overlap(...)` uses exactly the same boundary semantics as 
`TimeRange.overlaps` (inclusive endpoints). A subtle inclusive/exclusive 
mismatch here changes whether a deletion is considered to touch a chunk, which 
directly affects query correctness.



##########
iotdb-client/session/src/main/java/org/apache/iotdb/session/Session.java:
##########
@@ -1865,10 +1815,12 @@ private void 
filterNullValueAndMeasurementWithStringType(
    */
   private boolean filterNullValueAndMeasurementWithStringType(
       List<String> valuesList, String deviceId, List<String> measurementsList) 
{
-    Map<String, Object> nullMap = new HashMap<>();
+    Map<String, Object> nullMap = logger.isInfoEnabled() ? new HashMap<>() : 
null;

Review Comment:
   Same as the other `nullMap` change above — `nullMap` is `null` when info 
logging is disabled. Confirm the code after this loop guards every `nullMap` 
access (null check, or `logger.isInfoEnabled()` short-circuit before 
`nullMap.isEmpty()`).



##########
iotdb-client/session/src/main/java/org/apache/iotdb/session/Session.java:
##########
@@ -1817,10 +1765,12 @@ private boolean filterNullValueAndMeasurement(
       List<String> measurementsList,
       List<TSDataType> types,
       List<Object> valuesList) {
-    Map<String, Object> nullMap = new HashMap<>();
+    Map<String, Object> nullMap = logger.isInfoEnabled() ? new HashMap<>() : 
null;

Review Comment:
   `nullMap` is now `null` whenever info logging is disabled. The loop body is 
correctly guarded, but the code after this loop (not in the diff) presumably 
consumes `nullMap` for logging — e.g. `if (!nullMap.isEmpty()) 
logger.info(...)`. If that consumer dereferences `nullMap` without a null check 
(or without short-circuiting on `logger.isInfoEnabled()` first), this throws an 
NPE when info logging is off. Please verify every subsequent use of `nullMap` 
in this method is null-safe.



##########
iotdb-client/service-rpc/src/main/java/org/apache/iotdb/rpc/IoTDBJDBCDataSet.java:
##########
@@ -321,6 +311,31 @@ public IoTDBJDBCDataSet(
     this.emptyResultSet = (queryDataSet == null || 
!queryDataSet.time.hasRemaining());
   }
 
+  private static int getDeduplicatedColumnSize(Map<String, Integer> 
columnNameIndex) {

Review Comment:
   `getDeduplicatedColumnSize` now returns `max(index) + 1` instead of the 
previous `distinct().count()`. These are equal only when the dedup indices are 
contiguous from 0 — which the old `columnTypeDeduplicatedList.set(index, ...)` 
already required (a gap would have thrown `IndexOutOfBoundsException`), so in 
practice this is equivalent and more robust. Just confirm no downstream 
consumer of `columnTypeDeduplicatedList` relies on `size()` being exactly the 
distinct count, or assumes no `null` entries.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/schemaengine/SchemaEngine.java:
##########
@@ -371,11 +371,13 @@ public int getSchemaRegionNumber() {
 
   public Map<Integer, Long> countDeviceNumBySchemaRegion(final List<Integer> 
schemaIds) {
     final Map<Integer, Long> deviceNum = new HashMap<>();
+    final java.util.Collection<Integer> targetSchemaIds =

Review Comment:
   Use regular imports for `java.util.Collection` / `java.util.HashSet` rather 
than inline fully-qualified names — the rest of the file uses imports and this 
is inconsistent (and may trip checkstyle). The same pattern appears in 
`StorageEngine.getDiskSizeByDataRegion`. The `size() > 1` guard to avoid 
allocating a `Set` for the trivial case is a reasonable micro-optimization.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/utils/QueryDataSetUtils.java:
##########
@@ -572,40 +558,30 @@ private static void fillRemainingBitMap(
   }
 
   private static void fillTimeColumn(
-      int rowCount, ByteArrayOutputStream[] byteArrayOutputStreams, 
TSQueryDataSet tsQueryDataSet) {
-    // calculate the time buffer size
-    int timeOccupation = rowCount * 8;
-    ByteBuffer timeBuffer = ByteBuffer.allocate(timeOccupation);
-    timeBuffer.put(byteArrayOutputStreams[0].toByteArray());
-    timeBuffer.flip();
-    tsQueryDataSet.setTime(timeBuffer);
+      int rowCount, PublicBAOS[] byteArrayOutputStreams, TSQueryDataSet 
tsQueryDataSet) {

Review Comment:
   After the refactor `rowCount` is no longer used in `fillTimeColumn` (the old 
`rowCount * 8` allocation is gone). The same applies to `rowCount` in 
`fillValueColumnsAndBitMaps` (line 566), where `valueOccupation` is now only 
read for `.length`. Please drop the now-unused parameters — they will otherwise 
trip checkstyle/IDE warnings and are misleading to readers. `columnNum` can be 
derived as `byteArrayOutputStreams.length / 2` if you want to remove 
`valueOccupation` entirely.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/tsfile/TsFileManager.java:
##########
@@ -223,10 +216,18 @@ public void removeAll(List<TsFileResource> 
tsFileResourceList, boolean sequence)
     writeLock("removeAll");
     try {
       for (TsFileResource resource : tsFileResourceList) {
-        remove(resource, sequence);
+        removeFromPartitionFileList(resource, sequence);
       }
     } finally {
-      writeLock("removeAll");
+      writeUnlock();
+    }
+  }
+
+  private void removeFromPartitionFileList(TsFileResource tsFileResource, 
boolean sequence) {
+    Map<Long, TsFileResourceList> selectedMap = sequence ? sequenceFiles : 
unsequenceFiles;
+    TsFileResourceList tsFileResources = 
selectedMap.get(tsFileResource.getTimePartition());
+    if (tsFileResources != null && tsFileResources.remove(tsFileResource)) {

Review Comment:
   `removeFromPartitionFileList` now looks up only the partition 
`tsFileResource.getTimePartition()` instead of scanning all partitions — a good 
O(1) improvement. It assumes a resource is always stored in the list keyed by 
its own `getTimePartition()`. That should hold as an invariant; just confirm 
there is no path where a resource's time partition can change after insertion, 
otherwise `remove` would silently fail to remove it (the old full scan would 
still have found it).



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/read/filescan/impl/UnclosedFileScanHandleImpl.java:
##########
@@ -48,6 +51,8 @@ public class UnclosedFileScanHandleImpl implements 
IFileScanHandle {
   private final TsFileResource tsFileResource;
   private final Map<IDeviceID, Map<String, List<IChunkMetadata>>> 
deviceToChunkMetadataMap;
   private final Map<IDeviceID, Map<String, List<IChunkHandle>>> 
deviceToMemChunkHandleMap;
+  private final Map<IDeviceID, List<TimeRange>> deviceToDeletionRanges;
+  private final Map<IDeviceID, Map<String, List<TimeRange>>> 
deviceToTimeSeriesDeletionRanges;

Review Comment:
   These two new caches are plain `HashMap`s, mutated lazily by 
`isDeviceTimeDeleted` / `isTimeSeriesTimeDeleted` (`put` and 
`computeIfAbsent`). If a single `UnclosedFileScanHandleImpl` can be queried 
from multiple threads concurrently — common in the scan/query path — these 
unsynchronized mutations can corrupt the map or lose updates. The PR's 
"concurrent read" checkbox is unchecked. Please confirm single-threaded access, 
or switch to `ConcurrentHashMap`.
   
   Separately: `appendDeletionRanges` adds `TimeRange` references straight from 
`chunkMetadata.getDeleteIntervalList()` and then `sortAndMerge` runs over them. 
Please confirm `sortAndMerge` does not mutate the shared `TimeRange` instances 
in place, otherwise it could corrupt the chunk metadata's delete intervals.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/read/filescan/impl/ClosedFileScanHandleImpl.java:
##########
@@ -87,10 +87,18 @@ public boolean isDeviceTimeDeleted(IDeviceID deviceID, long 
timestamp)
         curFileModEntries != null
             ? curFileModEntries
             : queryContext.loadAllModificationsFromDisk(tsFileResource);
-    List<ModEntry> modifications = 
queryContext.getPathModifications(curFileModEntries, deviceID);
     List<TimeRange> timeRangeList =
-        
modifications.stream().map(ModEntry::getTimeRange).collect(Collectors.toList());
-    return ModificationUtils.isPointDeletedWithoutOrderedRange(timestamp, 
timeRangeList);
+        
getMergedTimeRanges(queryContext.getPathModifications(curFileModEntries, 
deviceID));

Review Comment:
   `isDeviceTimeDeleted` recomputes `getPathModifications` and now also runs 
`sortAndMerge` on every call, with no caching — whereas 
`isTimeSeriesTimeDeleted` just below caches via `deviceToModifications`, and 
`UnclosedFileScanHandleImpl` caches both device- and series-level ranges. If 
`isDeviceTimeDeleted` is called per-timestamp, this is a small regression 
versus the old `isPointDeletedWithoutOrderedRange` (an extra sort on each 
call). Consider caching the device-level merged ranges here too, for 
consistency.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/StorageEngine.java:
##########
@@ -1116,12 +1116,14 @@ private void 
stopTimedServiceAndThrow(ScheduledExecutorService pool, String pool
 
   public void getDiskSizeByDataRegion(
       Map<Integer, Long> dataRegionDisk, List<Integer> dataRegionIds) {
-    dataRegionMap.forEach(
-        (dataRegionId, dataRegion) -> {
-          if (dataRegionIds.contains(dataRegionId.getId())) {
-            dataRegionDisk.put(dataRegionId.getId(), 
dataRegion.countRegionDiskSize());
-          }
-        });
+    final java.util.Collection<Integer> targetDataRegionIds =

Review Comment:
   Same fully-qualified-name nit as `SchemaEngine` — please import `Collection` 
/ `HashSet`. Logic-wise the rewrite (iterate the requested ids and look each 
up) is fine and equivalent: duplicates in `dataRegionIds` are handled by the 
`Set` when `size > 1`, and `dataRegionDisk.put` is idempotent regardless. 
Switching from a full-map scan to direct `get` is a good improvement.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/utils/QueryDataSetUtils.java:
##########
@@ -572,40 +558,30 @@ private static void fillRemainingBitMap(
   }
 
   private static void fillTimeColumn(
-      int rowCount, ByteArrayOutputStream[] byteArrayOutputStreams, 
TSQueryDataSet tsQueryDataSet) {
-    // calculate the time buffer size
-    int timeOccupation = rowCount * 8;
-    ByteBuffer timeBuffer = ByteBuffer.allocate(timeOccupation);
-    timeBuffer.put(byteArrayOutputStreams[0].toByteArray());
-    timeBuffer.flip();
-    tsQueryDataSet.setTime(timeBuffer);
+      int rowCount, PublicBAOS[] byteArrayOutputStreams, TSQueryDataSet 
tsQueryDataSet) {
+    tsQueryDataSet.setTime(wrapBuffer(byteArrayOutputStreams[0]));
   }
 
   private static void fillValueColumnsAndBitMaps(
       int rowCount,
-      ByteArrayOutputStream[] byteArrayOutputStreams,
+      PublicBAOS[] byteArrayOutputStreams,
       int[] valueOccupation,
       TSQueryDataSet tsQueryDataSet) {
-    // calculate the bitmap buffer size
-    int bitmapOccupation = (rowCount + 7) / 8;
-
-    List<ByteBuffer> bitmapList = new LinkedList<>();
-    List<ByteBuffer> valueList = new LinkedList<>();
+    int columnNum = valueOccupation.length;
+    List<ByteBuffer> bitmapList = new ArrayList<>(columnNum);
+    List<ByteBuffer> valueList = new ArrayList<>(columnNum);
     for (int i = 1; i < byteArrayOutputStreams.length; i += 2) {
-      ByteBuffer valueBuffer = ByteBuffer.allocate(valueOccupation[(i - 1) / 
2]);
-      valueBuffer.put(byteArrayOutputStreams[i].toByteArray());
-      valueBuffer.flip();
-      valueList.add(valueBuffer);
-
-      ByteBuffer bitmapBuffer = ByteBuffer.allocate(bitmapOccupation);
-      bitmapBuffer.put(byteArrayOutputStreams[i + 1].toByteArray());
-      bitmapBuffer.flip();
-      bitmapList.add(bitmapBuffer);
+      valueList.add(wrapBuffer(byteArrayOutputStreams[i]));
+      bitmapList.add(wrapBuffer(byteArrayOutputStreams[i + 1]));
     }
     tsQueryDataSet.setBitmapList(bitmapList);
     tsQueryDataSet.setValueList(valueList);
   }
 
+  private static ByteBuffer wrapBuffer(PublicBAOS outputStream) {

Review Comment:
   Nice zero-copy win — wrapping the `PublicBAOS` internal buffer avoids the 
`toByteArray()` + `allocate` + `put` double copy. One invariant to keep: the 
returned `ByteBuffer` aliases the stream's internal array, so nothing must 
write to these `PublicBAOS` / `DataOutputStream` instances after `wrapBuffer` 
is called. That holds in the current flow (wrapping happens last, right before 
return), just keep it in mind if this code is extended later.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/tsfile/TsFileManager.java:
##########
@@ -223,10 +216,18 @@ public void removeAll(List<TsFileResource> 
tsFileResourceList, boolean sequence)
     writeLock("removeAll");
     try {
       for (TsFileResource resource : tsFileResourceList) {
-        remove(resource, sequence);
+        removeFromPartitionFileList(resource, sequence);
       }
     } finally {
-      writeLock("removeAll");
+      writeUnlock();
+    }

Review Comment:
   Good catch — the original `removeAll` `finally` block called 
`writeLock("removeAll")` instead of `writeUnlock()`, which acquired the write 
lock a second time and never released it (a write-lock count leak on every 
`removeAll`). This fix is correct and important; worth highlighting explicitly 
in the PR description since it is a real bug fix, not just an optimization.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Optimized the overall performance of IoTDB [iotdb]

Reply via email to