ji chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/23613 )
Change subject: IMPALA-14092 Part2: Support querying of paimon data table via JNI
......................................................................

Patch Set 12:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/23613/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23613/11//COMMIT_MSG@23
PS11, Line 23: .
> nit: missing space after.
Done


http://gerrit.cloudera.org:8080/#/c/23613/11//COMMIT_MSG@35
PS11, Line 35: And PaimonJniScanner will pass the arrow offheap
: record batch memory pointer to the BE backend.
> Can you elaborate a bit more about the lifetime of this arrow recordbatch?
Done. The batch size can be adjusted; I will implement this in the next revision.


http://gerrit.cloudera.org:8080/#/c/23613/11/be/src/exec/paimon/paimon-jni-scan-node.h
File be/src/exec/paimon/paimon-jni-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/23613/11/be/src/exec/paimon/paimon-jni-scan-node.h@51
PS11, Line 51: /// 2. Backend: Creates an PaimonJniScanner object on the Java heap.
> Will there be 1 PaimonJniScanner per scan fragment instance?
Yes, there is one PaimonJniScanner per scan fragment instance.


http://gerrit.cloudera.org:8080/#/c/23613/11/common/thrift/Types.thrift
File common/thrift/Types.thrift:

http://gerrit.cloudera.org:8080/#/c/23613/11/common/thrift/Types.thrift@80
PS11, Line 80: Iceberg and Pa
> nit: Iceberg and Paimon
Done


http://gerrit.cloudera.org:8080/#/c/23613/11/fe/src/main/java/org/apache/impala/planner/PaimonScanNode.java
File fe/src/main/java/org/apache/impala/planner/PaimonScanNode.java:

http://gerrit.cloudera.org:8080/#/c/23613/11/fe/src/main/java/org/apache/impala/planner/PaimonScanNode.java@242
PS11, Line 242: numNodes_ = Math.max(totalNodes, 1);
: numInstances_ = Math.max(totalInstances, 1);
: }
:
: @Override
: public void computeNodeResourceProfile(TQueryOptions queryOptions) {
: // current batch size is from query options, so estimated bytes
:
> Can you explain how this is calculated? Is this follow some existing implem
1. The current calculation is similar to memoryEstimateForFetchingColumns in HbaseScanNode: it sums the bytes consumed by each used column to get the average row size. Since the batch size is 1024, the result is multiplied by 1024. There is no concrete formula for Arrow batches, so this formula is used initially.
2. avgRowSize_ is calculated by the function estimateAvgRowSize. Its minimum value is PAIMON_ROW_AVG_SIZE_OVERHEAD, so it is always positive.
3. Sure, I will implement this in the next revision.


http://gerrit.cloudera.org:8080/#/c/23613/11/fe/src/main/java/org/apache/impala/util/paimon/PaimonJniScanner.java
File fe/src/main/java/org/apache/impala/util/paimon/PaimonJniScanner.java:

http://gerrit.cloudera.org:8080/#/c/23613/11/fe/src/main/java/org/apache/impala/util/paimon/PaimonJniScanner.java@99
PS11, Line 99: lits_.add(SerializationUtils.deseria
> Create a constant for this values and put comment what the constant is abou
I will introduce a parameter for the upper limit of the allocator. The first argument will be removed in the next revision.

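For reference, the estimate described in the PaimonScanNode.java reply above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the patch: the class name, the method signatures, and the overhead value of 4 bytes are hypothetical, and the overhead is modeled here as an additive term (which makes it the minimum when no columns are used).

```java
import java.util.List;

public class PaimonMemEstimateSketch {
  // Hypothetical stand-in for PAIMON_ROW_AVG_SIZE_OVERHEAD; the real value
  // is not stated in this review thread.
  static final long ROW_AVG_SIZE_OVERHEAD = 4L;

  // Batch size mentioned in the reply above.
  static final int BATCH_SIZE = 1024;

  // Sums the estimated byte width of each used column and adds the per-row
  // overhead, so the result is never below ROW_AVG_SIZE_OVERHEAD.
  static long estimateAvgRowSize(List<Long> usedColumnByteSizes) {
    long sum = ROW_AVG_SIZE_OVERHEAD;
    for (long size : usedColumnByteSizes) {
      sum += size;
    }
    return sum;
  }

  // Estimated bytes for one batch: average row size times rows per batch.
  static long estimateBatchBytes(List<Long> usedColumnByteSizes, int batchSize) {
    return estimateAvgRowSize(usedColumnByteSizes) * batchSize;
  }

  public static void main(String[] args) {
    // Three used columns of 4, 8 and 16 bytes.
    System.out.println(estimateBatchBytes(List.of(4L, 8L, 16L), BATCH_SIZE));
  }
}
```

With no used columns the sketch still returns a positive estimate, which matches the point that the average row size is always positive.
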
http://gerrit.cloudera.org:8080/#/c/23613/11/fe/src/main/java/org/apache/impala/util/paimon/PaimonJniScanner.java@114
PS11, Line 114: // get mem limit
: allocator_mem_limit_ = paimonJniScanParam
> A method docs/comment is helpful here, because this is the main part of rea
Done

Below is a sample for memory limit checking:

[localhost:21050] default> select * from functional_parquet.paimon_partitioned;
Query: select * from functional_parquet.paimon_partitioned
Query submitted at: 2025-12-02 23:41:30 (Coordinator: http://lisa:25000)
Query state can be monitored at: http://lisa:25000/query_plan?query_id=97435735f16f20d4:383aa2a600000000
2025-12-02 23:41:30 [Exception] ERROR: Query 97435735f16f20d4:383aa2a600000000 failed: Memory limit exceeded


--
To view, visit http://gerrit.cloudera.org:8080/23613
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie679a89a8cc21d52b583422336b9f747bdf37384
Gerrit-Change-Number: 23613
Gerrit-PatchSet: 12
Gerrit-Owner: ji chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: ji chen <[email protected]>
Gerrit-Comment-Date: Tue, 02 Dec 2025 15:47:40 +0000
Gerrit-HasComments: Yes
