zhztheplayer commented on code in PR #6797:
URL: https://github.com/apache/incubator-gluten/pull/6797#discussion_r1716404930


##########
gluten-data/src/main/java/org/apache/gluten/columnarbatch/ColumnarBatches.java:
##########
@@ -186,23 +185,14 @@ private static ColumnarBatch load(BufferAllocator 
allocator, ColumnarBatch input
       ColumnarBatch output =
           ArrowAbiUtil.importToSparkColumnarBatch(allocator, arrowSchema, 
cArray);
 
-      // Follow gluten input's reference count. This might be optimized using
-      // automatic clean-up or once the extensibility of ColumnarBatch is 
enriched
+      // Loaded Arrow ColumnarBatch lifecycle is controlled by the caller. The 
GC can help clean it.
       IndicatorVector giv = (IndicatorVector) input.column(0);
-      ImplicitClass.ArrowColumnarBatchRetainer retainer =
-          new ImplicitClass.ArrowColumnarBatchRetainer(output);
-      for (long i = 0; i < (giv.refCnt() - 1); i++) {
-        retainer.retain();
-      }
-
       // close the input one
       for (long i = 0; i < giv.refCnt(); i++) {
         input.close();
       }
 
-      // populate new vectors to input
-      transferVectors(output, input);
-      return input;
+      return output;

Review Comment:
   Thank you for writing the test and I finally got what happened here.
   
   `transferVectors` was used for a reason. Once `ensureLoaded` is called, we 
should prevent caller code from using the input batch again after the method 
returns to do further operations (e.g., load it again, etc.) so the code 
behavior will be more predictable . So it's likely better not to remove it in 
the patch.
   
   The root cause of the issue is actually we didn't update the column vectors 
used by `ColumnarBatchRow` in `transferVectors`. A proper way is to update them 
as well as `ColumnarBatch#columns`.
   
   And I push a commit to the PR to untangle test code, please pull that one 
before next push. Thanks.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to