comphead commented on code in PR #2169:
URL: https://github.com/apache/datafusion-comet/pull/2169#discussion_r2282630844

##########
spark/src/main/scala/org/apache/comet/CometExecIterator.scala:
##########
@@ -35,22 +35,66 @@ import org.apache.comet.Tracing.withTrace
 import org.apache.comet.vector.NativeUtil
 
 /**
- * An iterator class used to execute Comet native query. It takes an input iterator which comes
- * from Comet Scan and is expected to produce batches of Arrow Arrays. During consuming this
- * iterator, it will consume input iterator and pass Arrow Arrays to Comet native engine by
- * addresses. Even after the end of input iterator, this iterator still possibly continues
- * executing native query as there might be blocking operators such as Sort, Aggregate. The API
- * `hasNext` can be used to check if it is the end of this iterator (i.e. the native query is
- * done).
+ * Comet's primary execution iterator that bridges JVM (Spark) and native (Rust) execution
+ * environments. This iterator orchestrates native query execution on Arrow columnar batches while
+ * managing sophisticated memory ownership semantics across the JNI boundary.
  *
+ * '''Architecture Overview:'''
+ * - \1. Consumes input ColumnarBatch iterators from Spark operators
+ * - 2. Transfers Arrow array ownership to native DataFusion execution engine via JNI (* see
+ * note below)
+ * - 3. Executes queries natively using DataFusion's columnar processing
+ * - 4. Returns results as ColumnarBatch with ownership transferred back to JVM
+ *
+ * * This isn't quite true. Comet does not currently implement best practice when passing batches

Review Comment:
```suggestion
 * * This isn't quite true.
```

is it needed??

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org