pengdou1990 created IMPALA-11473:
------------------------------------

             Summary: Data skew occurs when execute HDFS table insert operation
                 Key: IMPALA-11473
                 URL: https://issues.apache.org/jira/browse/IMPALA-11473
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend, Frontend
    Affects Versions: Impala 3.2.0
            Reporter: pengdou1990
            Assignee: pengdou1990


When a large amount of data is inserted into a non-partitioned HDFS table, if 
the fragment instances of SELECT STATEMENT differ greatly in output rows, some 
fragment instances may very slow, while others may much faster. This causes the 
entire insert operation to take a long time to run.

For example, when I execute the following Insert SQL in a 10 nodes IMPALA 3.2 
cluster, the duration was almost 67 min.
{code:java}
create table user_behavior_export_2022070111 stored as parquet as SELECT * from 
user_behavior_table where time between '2022-07-01 11:00:00.000' and 
'2022-07-01 11:59:59.999'; {code}
Tips: user_behavior_table is an day partitioned table with 1000 columns.
 
{code:java}
Operator          #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  
Est. Peak Mem  Detail
-----------------------------------------------------------------------------------------------------------------------------
F00:HDFS WRITER       10     21m50s       1h6m                       346.88 MB  
      1.00 GB
00:SCAN HDFS          10    3s401ms    3s217ms  137.49M      94.49M   12.61 GB  
      2.75 GB  dafault.user_behavior_table {code}
The fastest HdfsTableSink executed 2 minutes, while the slowest HdfsTableSink 
executed 66 minutes. Different HdfsTableSink execution times vary greatly due 
to the amount of data inserted. For Each HDFS scanner, the amount of ReadBytes 
and ReadRows are not significantly skewed, but the amount of data returned 
makes a performance difference of tens of times.
{code:java}
HdfsTableSink:(Total: 1h6m, non-child: 1h6m, % non-child: 100.00%)
HdfsTableSink:(Total: 34m44s, non-child: 34m44s, % non-child: 100.00%)
HdfsTableSink:(Total: 25m20s, non-child: 25m20s, % non-child: 100.00%)
HdfsTableSink:(Total: 20m36s, non-child: 20m36s, % non-child: 100.00%)
HdfsTableSink:(Total: 25m46s, non-child: 25m46s, % non-child: 100.00%)
HdfsTableSink:(Total: 12m56s, non-child: 12m56s, % non-child: 100.00%)
HdfsTableSink:(Total: 10m56s, non-child: 10m56s, % non-child: 100.00%)
HdfsTableSink:(Total: 10m43s, non-child: 10m43s, % non-child: 100.00%)
HdfsTableSink:(Total: 4m7s, non-child: 4m7s, % non-child: 100.00%)
HdfsTableSink:(Total: 6m19s, non-child: 6m19s, % non-child: 100.00%)

- RowsInserted: 44.49M (44487661)
- RowsInserted: 21.42M (21417405)
- RowsInserted: 18.34M (18340886)
- RowsInserted: 14.42M (14422287)
- RowsInserted: 14.13M (14128973)
- RowsInserted: 8.96M (8955664)
- RowsInserted: 7.08M (7084182)
- RowsInserted: 6.97M (6969609)
- RowsInserted: 2.45M (2448096)
- RowsInserted: 2.84M (2837459)

- BytesRead: 2.32 GB (2485981168)
- BytesRead: 2.22 GB (2385372978)
- BytesRead: 2.66 GB (2858818300)
- BytesRead: 2.34 GB (2509528520)
- BytesRead: 2.71 GB (2906316201)
- BytesRead: 2.65 GB (2844324158)
- BytesRead: 2.36 GB (2531865904)
- BytesRead: 2.24 GB (2402854679)
- BytesRead: 2.78 GB (2979667784)
- BytesRead: 2.66 GB (2856271181)

- RowsRead: 101.56M (101561956)
- RowsRead: 82.91M (82914833)
- RowsRead: 97.30M (97297545)
- RowsRead: 71.74M (71736869)
- RowsRead: 100.67M (100669817)
- RowsRead: 101.85M (101851844)
- RowsRead: 83.43M (83432722)
- RowsRead: 90.33M (90331924)
- RowsRead: 93.34M (93341164)
- RowsRead: 92.51M (92509166)

- RowsReturned: 13.75M (13748752)
- RowsReturned: 44.49M (44487661)
- RowsReturned: 21.42M (21417405)
- RowsReturned: 18.34M (18340886)
- RowsReturned: 14.42M (14422287)
- RowsReturned: 14.13M (14128973)
- RowsReturned: 8.96M (8955664)
- RowsReturned: 7.08M (7084182)
- RowsReturned: 6.97M (6969609)
- RowsReturned: 2.45M (2448096)
- RowsReturned: 2.84M (2837459) 

{code}
I've tried all insert hints on IMPALA 3.2 and IMPALA 4.0, none of them solve 
this kind of data skew problem.
 
It seems that we should perform some kind of shuffle to ensure that the amount 
of rows inserted on each nodes are not significantly skewed, for exampling 
random shuffle or round-robin shuffle.
 
Do you have any other solutions?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to