Thanks for reaching out.

Please file a ticket about the error, with stack traces and all the
necessary information (snippets from the catalogd logs), so we can debug
this further.

> I tried to apply a workaround by setting metadata_loader_parallelism=1
> via ALTER TABLE
We don't have such a table property. Was it suggested by Gemini?

> Question: Is there a way to globally set metadata_loader_parallelism=1
> for the entire Catalogd, or perhaps a flag to force Impala to ignore block
> location errors for object storage during the initial load?
I suspect the issue is something other than a concurrency bug, but FWIW
Catalogd has the backend flags "max_hdfs_partitions_parallel_load" and
"max_nonhdfs_partitions_parallel_load", which you can set:
https://github.com/apache/impala/blob/master/be/src/catalog/catalog.cc#L39-L45
These also affect file metadata loading of non-Iceberg tables, but they
can be useful for checking whether this really is a concurrency bug.
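
For illustration only, here is a sketch of what serializing the loads could
look like. It assumes catalogd reads its startup flags from a flagfile (how
flags are passed depends on your deployment), and the value 1 is just an
example for the experiment, not a recommended setting:

```
# Hypothetical catalogd flagfile fragment, for the experiment only
--max_hdfs_partitions_parallel_load=1
--max_nonhdfs_partitions_parallel_load=1
```

As with other startup flags, catalogd would need a restart to pick these up.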

We have improved Iceberg table loading since 4.5 and fixed a few issues,
so there's a chance that the issue you ran into is already fixed.

Would it be possible for you to try out the current Impala master?
I think it is time for us to do another upstream release, but it would be
nice if we can make sure your use case will work well with the newest
release.

Cheers,
   Zoltan


On Mon, Jan 5, 2026 at 1:35 PM 汲广熙 <[email protected]> wrote:

> Subject: Re: [Bug?] Impala 4.5 Iceberg table loading failure on
> Tencent Cloud COS (cosn://)
>
> Hi,
>
> Thank you for the suggestions. I have some critical updates regarding the
> TableLoadingException: failed to load 1 paths issue on Tencent Cloud
> COS (cosn://).
>
> I have already enabled Local Catalog
> Mode (--use_local_catalog=true and minimal topic mode), but
> the issue persists.
>
> Based on an analysis provided by Gemini (Google's AI model) regarding
> the stack trace, it was suggested that the error in
> ParallelFileMetadataLoader.loadInternal might be caused by concurrency
> issues when handling object storage paths. Following this logic, I tried to
> apply a workaround by setting metadata_loader_parallelism=1 via ALTER
> TABLE.
>
> However, the command failed with the same AnalysisException and
> TableLoadingException. It seems Impala falls into a deadlock state: I
> cannot modify the table properties to fix the loading logic because Impala
> fails to load the table metadata even during the ALTER TABLE analysis
> phase.
>
> Observations:
>
>  - The error originates from ParallelFileMetadataLoader.loadInternal.
>  - Even with Local Catalog enabled, the loader seems unable to handle the
>    missing block location info from the cosn:// driver.
>  - Standard Parquet tables work fine; this appears specific to the
>    Iceberg metadata loading path in Impala 4.5.
>
> Question: Is there a way to globally set
> metadata_loader_parallelism=1 for the entire Catalogd, or perhaps a
> flag to force Impala to ignore block location errors for object storage
> during the initial load?
>
> I am happy to provide a full stack trace or open a JIRA ticket, as this
> currently makes Iceberg on COS unusable in our environment.
>
> ============================================================================================================
> When I run a simple SELECT * or INSERT INTO query on this table, I get the
> following error:
> AnalysisException: Failed to load metadata for table:
> 'iceberg_cos_employee_test'
> ...
> IcebergTableLoadingException: Error loading metadata for Iceberg table
> cosn://cdnlogtest-1252412955/impala_test_db/iceberg_cos_employee_test
> ...
> Loading file and block metadata for 1 paths for table ...: failed to load
> 1 paths.
>
> ============================================================================================================
>
>
> Best regards,
>
> ---- Original Message ----
> From: Zoltán Borók-Nagy <[email protected]>
> Sent: December 23, 2025, 23:52
> To: dev <[email protected]>
> Subject: Re: Advice and Considerations for Building an Impala
> Compute-Storage Separated Architecture
>
>
>
> Hi,
>
> Just a few tips off the top of my head:
>
>  - use dedicated coordinators and executors, rule of thumb for
>    coordinator:executor ratio is 1:50. Though for HA you probably want >1
>    coordinators.
>  - use local catalog mode (aka On-demand metadata):
>    https://impala.apache.org/docs/build/html/topics/impala_metadata.html
>  - enabling remote data cache (with SSD disks) is essential in
>    compute-storage separated setup:
>    https://impala.apache.org/docs/build/html/topics/impala_data_cache.html
>
> What table format / file format are you planning to use?
> If table format is Iceberg, make sure you use the latest Impala as we
> continuously improve Impala's performance on Iceberg.
> File format: Impala most efficiently works on Parquet files.
>
> Avoid small file issues:
>  - choose proper partitioning for your data, i.e. avoid too coarse-grained
>    and too fine-grained partitioning. I.e. you probably want more than 200 MB
>    data per partition, but probably less than 20 GB.
>  - compact your tables regularly, for Iceberg tables Impala has the
>    OPTIMIZE statement:
>    https://impala.apache.org/docs/build/html/topics/impala_iceberg.html
>
> I hope others chime in as well.
>
> We would love to hear back about your experiences, and feel free to open
> tickets for Impala if you run into any issue:
> https://issues.apache.org/jira/projects/IMPALA/issues
>
> Cheers,
>     Zoltan
>
>
> On Tue, Dec 23, 2025 at 8:23 AM 汲广熙 <[email protected]> wrote:
>
> > Dear Impala Team,
> >
> > I hope this message finds you well.
> >
> > I am currently planning to build a compute-storage separated architecture
> > based on Apache Impala. In this setup:
> >
> >  - Compute layer: Apache Impala will be used for SQL query execution.
> >  - Storage layer: Tencent Cloud Object Storage (COS) will serve as the
> >    backend storage.
> >  - Data ingestion: Kafka will be used to stream data into the system.
> >  - Monitoring & visualization: Grafana will be used to display
> >    operational and performance metrics.
> >
> > Could you please provide some recommendations and key considerations for
> > such an architecture? Specifically, I would appreciate guidance on:
> >
> >  - Best practices for integrating Impala with cloud object storage (COS).
> >  - Performance tuning tips for Impala in a disaggregated environment.
> >  - Any known limitations or compatibility issues when using COS as storage.
> >  - Recommended configurations for Kafka-to-Impala data pipelines.
> >  - Monitoring strategies for tracking query performance and resource usage in
> >    Grafana.
> >
> > Thank you very much for your support and advice. I look forward to your
> > reply.
> >
> > Best regards
