parthchandra commented on code in PR #2163:
URL: https://github.com/apache/datafusion-comet/pull/2163#discussion_r2283491363
##########
docs/source/contributor-guide/plugin_overview.md:
##########
@@ -140,8 +140,31 @@ accessing Arrow data structures from multiple languages.
 [Arrow C Data Interface]: https://arrow.apache.org/docs/format/CDataInterface.html
-- `CometExecIterator` invokes native plans and uses Arrow FFI to read the output batches
-- Native `ScanExec` operators call `CometBatchIterator` via JNI to fetch input batches from the JVM
+### Array Ownership and Lifecycle
+
+#### Importing Batches from Native to JVM
+
+`CometExecIterator` invokes native plans by calling the JNI function `executePlan`. The ownership of the output
+batches, which are created in native code, is transferred to FFI ready to be consumed by Java once the `executePlan`
+function returns.
+
+Once the JVM code finishes processing the arrays, it will call the `release` callback, which invokes native code
+to release the memory that was allocated on the native side.
+
+#### Exporting Batches from JVM to Native
+
+The leaf nodes of native plans are often `ScanExec` operators, which call `CometBatchIterator` via JNI to fetch
+input batches from the JVM.
+
+Note that this approach does not follow best practice and does not benefit from zero-copy transfer.
+
+The incoming array data is owned by the JVM and can be freed or reused on the JVM side during the next call to

Review Comment:
Correcting myself. The JVM calls `org.apache.comet.parquet.Native.setPageV1` to pass the memory block to native.
To be clear though, this is not a problem. The memory block here is a Java byte array (or Java ByteBuffer) and the ownership can only belong to the JVM. Also, this buffer is quickly consumed on the native side.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org