[GitHub] [arrow] zhztheplayer commented on a change in pull request #10883: ARROW-7272: [C++][Java] JNI bridge between RecordBatch and VectorSchemaRoot

GitBox Thu, 02 Dec 2021 22:58:50 -0800


zhztheplayer commented on a change in pull request #10883:
URL: https://github.com/apache/arrow/pull/10883#discussion_r761687978




##########
File path: java/dataset/src/main/java/org/apache/arrow/dataset/jni/NativeScanTask.java
##########
@@ -35,7 +36,7 @@ public NativeScanTask(NativeScanner scanner) {
   }
 
   @Override
-  public BatchIterator execute() {
+  public ArrowReader execute() {

Review comment:
       I think it's somewhat unfair to compare the performance of a 
`BatchIterator` with that of an `ArrowReader`, because `ArrowReader` loads the 
buffers into a schema-aware `VectorSchemaRoot`, while the old code left that 
burden to the caller. As a result, the new code should be slower than the old 
code (for example, in my local environment, 642ms vs 379ms on 350MB of Parquet 
data). But the overall impact should be small in a real project.
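
To illustrate the point, here is a minimal stand-alone sketch using only the JDK. It does not use the Arrow API: `FairScanComparison`, `makeRawBatches`, `load`, `scanOnly`, and `scanAndLoad` are all hypothetical names, and `load` is only a stand-in for the work of populating a schema-aware container. It shows why timing a scan that hands out raw buffers against a scan that also loads them measures two different amounts of work.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class FairScanComparison {

    // Hypothetical stand-in for unloaded batches: raw bytes with no schema applied.
    static List<ByteBuffer> makeRawBatches(int batches, int rowsPerBatch) {
        List<ByteBuffer> out = new ArrayList<>();
        for (int b = 0; b < batches; b++) {
            ByteBuffer buf = ByteBuffer.allocate(rowsPerBatch * Long.BYTES);
            for (int r = 0; r < rowsPerBatch; r++) {
                buf.putLong((long) b * rowsPerBatch + r);
            }
            buf.flip();
            out.add(buf);
        }
        return out;
    }

    // Hypothetical stand-in for loading buffers into a typed, schema-aware column.
    static long[] load(ByteBuffer raw) {
        ByteBuffer view = raw.duplicate(); // leave the caller's position untouched
        long[] column = new long[view.remaining() / Long.BYTES];
        for (int i = 0; i < column.length; i++) {
            column[i] = view.getLong();
        }
        return column;
    }

    // Old-style path: only hands out raw buffers; returns the number of bytes seen.
    static long scanOnly(List<ByteBuffer> raw) {
        long bytes = 0;
        for (ByteBuffer b : raw) {
            bytes += b.remaining();
        }
        return bytes;
    }

    // New-style path: scans and loads every batch; returns the sum of loaded values.
    static long scanAndLoad(List<ByteBuffer> raw) {
        long sum = 0;
        for (ByteBuffer b : raw) {
            for (long v : load(b)) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<ByteBuffer> raw = makeRawBatches(64, 4096);
        long t0 = System.nanoTime();
        long bytes = scanOnly(raw);
        long t1 = System.nanoTime();
        long sum = scanAndLoad(raw);
        long t2 = System.nanoTime();
        System.out.println("scan only:     " + (t1 - t0) / 1_000 + " us, bytes=" + bytes);
        System.out.println("scan and load: " + (t2 - t1) / 1_000 + " us, sum=" + sum);
    }
}
```

A like-for-like benchmark would either add the caller-side load step to the old path or exclude it from the new one. The single-shot timings printed here are only indicative, since there is no JIT warm-up.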




