Re: [PR] Optimized the overall performance of IoTDB [iotdb]

via GitHub Sun, 24 May 2026 20:15:07 -0700


Caideyipi commented on code in PR #17664:
URL: https://github.com/apache/iotdb/pull/17664#discussion_r3295885803



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/read/filescan/impl/ClosedFileScanHandleImpl.java:
##########
@@ -87,10 +89,22 @@ public boolean isDeviceTimeDeleted(IDeviceID deviceID, long 
timestamp)
         curFileModEntries != null
             ? curFileModEntries
             : queryContext.loadAllModificationsFromDisk(tsFileResource);
-    List<ModEntry> modifications = 
queryContext.getPathModifications(curFileModEntries, deviceID);
-    List<TimeRange> timeRangeList =
-        
modifications.stream().map(ModEntry::getTimeRange).collect(Collectors.toList());
-    return ModificationUtils.isPointDeletedWithoutOrderedRange(timestamp, 
timeRangeList);
+    List<TimeRange> timeRangeList = deviceToDeletionRanges.get(deviceID);
+    if (timeRangeList == null) {
+      timeRangeList =
+          
getMergedTimeRanges(queryContext.getPathModifications(curFileModEntries, 
deviceID));
+      deviceToDeletionRanges.put(deviceID, timeRangeList);
+    }
+    return ModificationUtils.isPointDeleted(timestamp, timeRangeList);
+  }

Review Comment:
   Updated. I switched the caches to ConcurrentHashMap and use putIfAbsent 
here. I kept the merge outside computeIfAbsent so the relatively expensive mod 
collection and merge is not executed while holding a CHM bin lock; duplicate 
concurrent computation is harmless. The added memory is per scan handle and 
only for queried devices, storing merged ranges instead of raw mods, and is 
released with the handle.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/StorageEngine.java:
##########
@@ -1086,9 +1086,11 @@ private void 
stopTimedServiceAndThrow(ScheduledExecutorService pool, String pool
 
   public void getDiskSizeByDataRegion(
       Map<Integer, Long> dataRegionDisk, List<Integer> dataRegionIds) {
+    final java.util.Collection<Integer> targetDataRegionIds =
+        dataRegionIds.size() > 1 ? new java.util.HashSet<>(dataRegionIds) : 
dataRegionIds;
     dataRegionMap.forEach(
         (dataRegionId, dataRegion) -> {
-          if (dataRegionIds.contains(dataRegionId.getId())) {
+          if (targetDataRegionIds.contains(dataRegionId.getId())) {
             dataRegionDisk.put(dataRegionId.getId(), 
dataRegion.countRegionDiskSize());
           }

Review Comment:
   Done. getDiskSizeByDataRegion now iterates the requested ids and does a 
direct dataRegionMap.get(new DataRegionId(id)). I avoided computeIfPresent 
since this is read-only and should not mutate the map.



##########
iotdb-client/client-cpp/src/main/SessionDataSet.cpp:
##########
@@ -246,38 +246,39 @@ const std::vector<std::string>& 
SessionDataSet::DataIterator::getColumnTypeList(
 
 shared_ptr<RowRecord> SessionDataSet::constructRowRecordFromValueArray() {
   std::vector<Field> outFields;
-  for (int i = iotdbRpcDataSet_->getValueColumnStartIndex(); i < 
iotdbRpcDataSet_->getColumnSize();
-       i++) {
+  const int32_t valueColumnStartIndex = 
iotdbRpcDataSet_->getValueColumnStartIndex();
+  const int32_t columnSize = iotdbRpcDataSet_->getColumnSize();
+  outFields.reserve(columnSize - valueColumnStartIndex);
+  for (int32_t columnIndex = valueColumnStartIndex + 1; columnIndex <= 
columnSize; ++columnIndex) {

Review Comment:
   Confirmed against C++ IoTDBRpcDataSet: getTsBlockColumnIndexForColumnIndex 
subtracts 1, and findColumnNameByIndex rejects index <= 0, so these getters are 
1-based. The null guard uses the same column-index path as the typed getters.



##########
iotdb-client/isession/src/main/java/org/apache/iotdb/isession/SessionDataSet.java:
##########
@@ -153,45 +153,42 @@ public boolean hasNext() throws 
StatementExecutionException, IoTDBConnectionExce
   }
 
   private RowRecord constructRowRecordFromValueArray() throws 
StatementExecutionException {
-    List<Field> outFields = new ArrayList<>();
-    for (int i = ioTDBRpcDataSet.getValueColumnStartIndex();
-        i < ioTDBRpcDataSet.getColumnSize();
-        i++) {
+    int valueColumnStartIndex = ioTDBRpcDataSet.getValueColumnStartIndex();
+    int columnSize = ioTDBRpcDataSet.getColumnSize();
+    List<Field> outFields = new ArrayList<>(columnSize - 
valueColumnStartIndex);
+    for (int columnIndex = valueColumnStartIndex + 1; columnIndex <= 
columnSize; columnIndex++) {

Review Comment:
   Confirmed against Java IoTDBRpcDataSet: findColumnNameByIndex and 
getTsBlockColumnIndexForColumnIndex are 1-based and access 
columnNameList.get(columnIndex - 1). Existing multi-column client coverage also 
uses 1-based indexes.



##########
iotdb-client/service-rpc/src/main/java/org/apache/iotdb/rpc/IoTDBJDBCDataSet.java:
##########
@@ -321,6 +311,31 @@ public IoTDBJDBCDataSet(
     this.emptyResultSet = (queryDataSet == null || 
!queryDataSet.time.hasRemaining());
   }
 
+  private static int getDeduplicatedColumnSize(Map<String, Integer> 
columnNameIndex) {

Review Comment:
   Confirmed. Downstream uses columnTypeDeduplicatedList.size() to allocate 
arrays indexed by the deduplicated ordinal, so max(index) + 1 is the right 
size. Valid input has contiguous ordinals and no null entries; with a gap, the 
previous set(index, ...) path would already have failed.



##########
iotdb-client/session/src/main/java/org/apache/iotdb/session/Session.java:
##########
@@ -1817,10 +1765,12 @@ private boolean filterNullValueAndMeasurement(
       List<String> measurementsList,
       List<TSDataType> types,
       List<Object> valuesList) {
-    Map<String, Object> nullMap = new HashMap<>();
+    Map<String, Object> nullMap = logger.isInfoEnabled() ? new HashMap<>() : 
null;

Review Comment:
   Fixed in latest push. The post-loop logging in filterNullValueAndMeasurement 
now checks nullMap != null before logging, so the info-disabled path cannot 
dereference it.



##########
iotdb-client/session/src/main/java/org/apache/iotdb/session/Session.java:
##########
@@ -1865,10 +1815,12 @@ private void 
filterNullValueAndMeasurementWithStringType(
    */
   private boolean filterNullValueAndMeasurementWithStringType(
       List<String> valuesList, String deviceId, List<String> measurementsList) 
{
-    Map<String, Object> nullMap = new HashMap<>();
+    Map<String, Object> nullMap = logger.isInfoEnabled() ? new HashMap<>() : 
null;

Review Comment:
   Fixed in latest push. The string-type variant now has the same null-safe 
post-loop logging guard: nullMap != null.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Optimized the overall performance of IoTDB [iotdb]

Reply via email to