Re: Reading Iceberg Tables in Batch mode performance

Xuyang Wed, 06 Mar 2024 17:58:48 -0800

Hi, can you provide more details about this Flink batch job? For instance, 
a flame graph would show which tasks the threads spend most of their time 
on.





--

    Best!
    Xuyang




At 2024-03-07 08:40:32, "Charles Tan" <ctangu...@gmail.com> wrote:

Hi all,


I have been looking into using Flink in batch mode to process Iceberg tables. 
I noticed that queries in Flink's batch mode are quite slow, especially 
compared to Spark. I'm wondering whether I'm missing any configuration that 
would give better performance when reading from Iceberg.


In the Flink SQL shell, I ran the following:
1. SET execution.runtime-mode = batch;
2. SELECT COUNT(*) FROM t;


The table t has about 2 MB of data, and this query took 24 seconds to run in 
Flink. Spark executed the same query in 2.4 seconds.
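
For anyone trying to reproduce this, the two steps above can be sketched as a 
single SQL client session. This is only a sketch: it assumes the Iceberg 
catalog that holds t is already registered and selected, it uses the quoted 
key/value form of SET shown in the Flink SQL client documentation, and the 
result-mode setting and the EXPLAIN statement are additions that were not part 
of the original run.

```sql
-- Step 1 from above: run the session in batch mode.
SET 'execution.runtime-mode' = 'batch';

-- Addition (not in the original run): print results inline in the client.
SET 'sql-client.execution.result-mode' = 'tableau';

-- Addition (not in the original run): show the plan Flink builds for the
-- query, which may help when sharing details about the job.
EXPLAIN SELECT COUNT(*) FROM t;

-- Step 2 from above: the query that was timed.
SELECT COUNT(*) FROM t;
```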


Flink version 1.17.1
Spark version 3.5.1


Any insights or suggestions would be appreciated.


Thanks,
Charles
