[
https://issues.apache.org/jira/browse/IMPALA-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18067685#comment-18067685
]
ASF subversion and git services commented on IMPALA-11402:
----------------------------------------------------------
Commit 20220fb9232b94d228383fe693a383d2c71a4733 in impala's branch
refs/heads/master from Mihaly Szjatinya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=20220fb92 ]
IMPALA-14583: Support partial RPC dispatch for Iceberg tables
This patch extends IMPALA-11402 to support partial RPC dispatch for
Iceberg tables in local catalog mode. IMPALA-11402 added support for
HDFS partitioned tables where catalogd can truncate the response of
getPartialCatalogObject at partition boundaries when the file count
exceeds catalog_partial_fetch_max_files.
For Iceberg tables, the file list is not organized by partition but is
stored as a flat list of data and delete files. This patch implements
offset-based pagination to allow catalogd to truncate the response at
any point in the file list, not just at partition boundaries.
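The offset/limit truncation described above can be sketched as follows. This is an illustrative sketch only: the class names (FlatFileStore, FilePage) and fields are hypothetical simplifications, not Impala's actual IcebergContentFileStore API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names are hypothetical simplifications of the idea,
// not Impala's actual classes.
class FilePage {
  final List<String> files;  // the slice of the flat file list in this response
  final int nextOffset;      // offset for the follow-up request, or -1 if done
  FilePage(List<String> files, int nextOffset) {
    this.files = files;
    this.nextOffset = nextOffset;
  }
}

class FlatFileStore {
  private final List<String> allFiles;  // data + delete files, one flat list
  FlatFileStore(List<String> allFiles) { this.allFiles = allFiles; }

  // Truncate the response at an arbitrary point in the flat list: return at
  // most `limit` files starting at `offset`, plus the offset to resume from.
  FilePage toThriftPartial(int offset, int limit) {
    int end = Math.min(offset + limit, allFiles.size());
    List<String> page = new ArrayList<>(allFiles.subList(offset, end));
    return new FilePage(page, end < allFiles.size() ? end : -1);
  }
}
```

Because the cut point is just an index into the flat list, the response can be split anywhere, unlike the partition-boundary truncation of IMPALA-11402.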
Implementation details:
- Added iceberg_file_offset field to TTableInfoSelector thrift struct
- IcebergContentFileStore.toThriftPartial() supports pagination with
offset and limit parameters
- IcebergContentFileStore uses a reverse lookup table
(icebergFileOffsetToContentFile_) for efficient offset-based access to
files
- IcebergTable.getPartialInfo() enforces the file limit configured by
catalog_partial_fetch_max_files (reusing the flag from IMPALA-11402)
- CatalogdMetaProvider.loadIcebergTableWithRetry() implements the retry
loop on the coordinator side, sending follow-up requests with
incremented offsets until all files are fetched
- Coordinator detects catalog version changes between requests and
throws InconsistentMetadataFetchException for query replanning
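The coordinator-side retry loop from the list above could look roughly like this. This is a hedged sketch with invented stand-in types; the real logic lives in CatalogdMetaProvider.loadIcebergTableWithRetry() and works over Thrift requests, not these hypothetical classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the coordinator-side retry loop. The Page and
// Catalog types below are invented stand-ins for the Thrift request/response.
class PartialFetchLoop {
  // One page of a getPartialCatalogObject-style response.
  static class Page {
    final List<String> files;
    final long catalogVersion;
    final boolean hasMore;
    Page(List<String> files, long catalogVersion, boolean hasMore) {
      this.files = files;
      this.catalogVersion = catalogVersion;
      this.hasMore = hasMore;
    }
  }

  // Stand-in for a follow-up RPC at a given file offset.
  interface Catalog {
    Page fetch(int offset);
  }

  static List<String> loadAllFiles(Catalog catalog) {
    List<String> all = new ArrayList<>();
    Page first = catalog.fetch(0);
    all.addAll(first.files);
    long version = first.catalogVersion;
    boolean more = first.hasMore;
    while (more) {
      Page next = catalog.fetch(all.size());  // follow-up with incremented offset
      if (next.catalogVersion != version) {
        // The table changed between requests; abandon the fetch so the query
        // can be replanned (Impala throws InconsistentMetadataFetchException).
        throw new IllegalStateException("inconsistent metadata fetch");
      }
      all.addAll(next.files);
      more = next.hasMore;
    }
    return all;
  }
}
```

The version check is what makes the multi-request fetch safe: a flat file list has no per-partition versioning to fall back on, so any catalog version change between pages invalidates the whole accumulated list.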
Key differences from IMPALA-11402:
- Offset-based pagination instead of partition-based (can split
anywhere)
- Single flat file list instead of per-partition file lists
- Works with both data files and delete files (Iceberg v2)
Tests:
- Added two custom-cluster tests in TestAllowIncompleteData:
* test_incomplete_iceberg_file_list: 150 data files with limit=100
* test_iceberg_with_delete_files: 60+ data+delete files with limit=50
- Both tests verify partial fetch across multiple requests and proper
log messages for truncation warnings and request counts
Change-Id: I7f2c058b7cc8efc15bac9fe0e91baadbb7b92cbb
Reviewed-on: http://gerrit.cloudera.org:8080/24041
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> getPartialCatalogObject fails with OOM with huge number of files
> ----------------------------------------------------------------
>
> Key: IMPALA-11402
> URL: https://issues.apache.org/jira/browse/IMPALA-11402
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 5.0.0
>
>
> The response size of getPartialCatalogObject depends on the number of
> partitions in the request. Even with the optimization of IMPALA-7501, the
> response size can still exceed the 2 GB byte-array limit when requesting all
> partitions of a huge table. E.g.
> {noformat}
> I0224 02:30:32.183627 28707 jni-util.cc:321] java.lang.OutOfMemoryError
>   at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:197)
>   at org.apache.thrift.protocol.TBinaryProtocol.writeBinary(TBinaryProtocol.java:236)
>   at org.apache.impala.thrift.THdfsFileDesc$THdfsFileDescStandardScheme.write(THdfsFileDesc.java:450)
>   at org.apache.impala.thrift.THdfsFileDesc$THdfsFileDescStandardScheme.write(THdfsFileDesc.java:405)
>   at org.apache.impala.thrift.THdfsFileDesc.write(THdfsFileDesc.java:346)
>   at org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:1647)
>   at org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:1433)
>   at org.apache.impala.thrift.TPartialPartitionInfo.write(TPartialPartitionInfo.java:1265)
>   at org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:1402)
>   at org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:1215)
>   at org.apache.impala.thrift.TPartialTableInfo.write(TPartialTableInfo.java:1061)
>   at org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:1157)
>   at org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:1010)
>   at org.apache.impala.thrift.TGetPartialCatalogObjectResponse.write(TGetPartialCatalogObjectResponse.java:876)
>   at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>   at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:91)
>   at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
>   at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
>   at org.apache.impala.service.JniCatalogOp.execAndSerializeSilentStartAndFinish(JniCatalogOp.java:109)
>   at org.apache.impala.service.JniCatalog.execAndSerializeSilentStartAndFinish(JniCatalog.java:259)
>   at org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:436)
> {noformat}
> We should add a flag to limit the number of partitions in a single
> getPartialCatalogObject request. When more partitions are required, fetch
> them in separate batches.