Yuming Wang commented on SPARK-23405:

I think this is data skew; you should broadcast the small table.
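
As a rough illustration of the suggestion above (plain Python, not Spark code, and not how Spark implements its joins): if almost every row of the big table shares one join key, hash-partitioning by that key sends nearly all rows to a single partition, while copying the small table to every partition lets each one be scanned where it already is. The function names and the toy data are hypothetical. Whether Spark's planner can actually pick a broadcast for a left semi join with the small table on the left depends on the version and is not confirmed in this thread.

```python
from collections import Counter


def shuffle_partition_sizes(keys, num_partitions):
    """Count the rows each partition receives when rows are hash-partitioned by join key."""
    sizes = Counter(hash(k) % num_partitions for k in keys)
    return [sizes.get(p, 0) for p in range(num_partitions)]


def broadcast_left_semi_join(small_rows, big_partitions):
    """Left semi join with the small side copied to every partition of the big side.

    Each big partition is scanned in place (no repartitioning by key) and reports
    which small-side keys it contains; the per-partition results are then unioned.
    """
    small_keys = set(small_rows)
    matched = set()
    for part in big_partitions:
        matched |= {k for k in part if k in small_keys}
    # Keep every small-side row that found at least one match, in original order.
    return [k for k in small_rows if k in matched]


# Toy data: key 42 dominates the big table (assumed skew, for illustration only).
big_keys = [42] * 1000 + list(range(8))
sizes = shuffle_partition_sizes(big_keys, 4)   # one partition holds almost everything

big_partitions = [[42] * 500 + [0, 1], [42] * 500 + [2, 3]]
result = broadcast_left_semi_join([42, 99], big_partitions)
```

In the toy data, one of the four shuffle partitions ends up with all 1000 copies of the hot key, which is the kind of imbalance that can make a single task appear to hang. The broadcast variant never groups rows by key, so the work follows the original partition sizes.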

> The task will hang up when a small table left semi join a big table
> -------------------------------------------------------------------
>                 Key: SPARK-23405
>                 URL: https://issues.apache.org/jira/browse/SPARK-23405
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Reporter: KaiXinXIaoLei
>            Priority: Major
>         Attachments: SQL.png, taskhang up.png
> I ran this SQL: `select ls.cs_order_number from ls left semi join catalog_sales 
> cs on ls.cs_order_number = cs.cs_order_number`. The `ls` table is a small 
> table with one row. The `catalog_sales` table is a big table with 
> 10 billion rows. The task hangs:
> !taskhang up.png!
> The SQL page shows:
> !SQL.png!

This message was sent by Atlassian JIRA
