sohami commented on a change in pull request #1600: DRILL-6947: fix RuntimeFilter memory leak
URL: https://github.com/apache/drill/pull/1600#discussion_r250298481
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/filter/RuntimeFilterRecordBatch.java
 ##########
 @@ -224,21 +224,39 @@ private void applyRuntimeFilter() throws SchemaChangeException {
     setupHashHelper();
     //To make each independent bloom filter work together to construct a final filter result: BitSet.
     BitSet bitSet = new BitSet(originalRecordCount);
-    for (int i = 0; i < toFilterFields.size(); i++) {
-      BloomFilter bloomFilter = bloomFilters.get(i);
-      String fieldName = toFilterFields.get(i);
-      computeBitSet(field2id.get(fieldName), bloomFilter, bitSet);
-    }
+
+    int filterSize = toFilterFields.size();
     int svIndex = 0;
-    for (int i = 0; i < originalRecordCount; i++) {
-      boolean contain = bitSet.get(i);
-      if (contain) {
-        sv2.setIndex(svIndex, i);
-        svIndex++;
-      } else {
-        filteredRows++;
+    if (filterSize == 1) {
+      BloomFilter bloomFilter = bloomFilters.get(0);
+      String fieldName = toFilterFields.get(0);
+      int fieldId = field2id.get(fieldName);
+      for (int rowIndex = 0; rowIndex < originalRecordCount; rowIndex++) {
+        long hash = hash64.hash64Code(rowIndex, 0, fieldId);
+        boolean contain = bloomFilter.find(hash);
+        if (contain) {
+          sv2.setIndex(svIndex, rowIndex);
+          svIndex ++;
+        }
+      }
+    } else {
+      for (int i = 0; i < toFilterFields.size(); i++) {
+        BloomFilter bloomFilter = bloomFilters.get(i);
+        String fieldName = toFilterFields.get(i);
+        computeBitSet(field2id.get(fieldName), bloomFilter, bitSet);
+      }
+      for (int i = 0; i < originalRecordCount; i++) {
+        boolean contain = bitSet.get(i);
+        if (contain) {
+          sv2.setIndex(svIndex, i);
+          svIndex++;
+        } else {
+          filteredRows++;
+        }
 
 Review comment:
   You can rewrite this code as shown below, which has `O(originalRecordCount * filterSize)` complexity rather than `O(originalRecordCount * filterSize) + O(originalRecordCount)`:
   
   ```
   final int filterSize = toFilterFields.size();
   for (int i = 0; i < originalRecordCount; ++i) {
      boolean contain = true;
      // Probe each field's bloom filter; stop as soon as one filter rejects the row.
      for (int j = 0; j < filterSize && contain; ++j) {
         final BloomFilter fieldFilter = bloomFilters.get(j);
         final String fieldName = toFilterFields.get(j);
         final int fieldId = field2id.get(fieldName);
         final long hash = hash64.hash64Code(i, 0, fieldId);
         contain = fieldFilter.find(hash);
      }
      if (contain) {
         sv2.setIndex(svIndex++, i);
      } else {
         ++filteredRows;
      }
   }
   ```
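
   For context only, here is a minimal, self-contained sketch of the same fused single-pass idea. It uses toy stand-ins (a `ToyBloomFilter` interface and plain arrays in place of Drill's `BloomFilter`, `hash64`, and `sv2`; these names are illustrative, not Drill APIs). The point is that probing every filter while walking the rows once removes the separate `BitSet` pass, which is where the extra `O(originalRecordCount)` term comes from.

   ```
   import java.util.Arrays;
   import java.util.List;

   public class FusedFilterSketch {

     /** Toy stand-in for a bloom filter: only a membership probe is needed here. */
     interface ToyBloomFilter {
       boolean find(long hash);
     }

     /**
      * Selects the row indexes that pass every filter in a single pass over the
      * rows, mirroring the rewrite suggested above. rowHashes[i][j] stands in
      * for hash64.hash64Code(i, 0, fieldId) of row i against field j.
      */
     static int[] selectRows(long[][] rowHashes, List<ToyBloomFilter> filters) {
       int[] selected = new int[rowHashes.length];
       int svIndex = 0;
       for (int i = 0; i < rowHashes.length; ++i) {
         boolean contain = true;
         // Stop probing as soon as any filter rejects the row.
         for (int j = 0; j < filters.size() && contain; ++j) {
           contain = filters.get(j).find(rowHashes[i][j]);
         }
         if (contain) {
           selected[svIndex++] = i;   // stands in for sv2.setIndex(svIndex++, i)
         }
       }
       return Arrays.copyOf(selected, svIndex);
     }
   }
   ```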
