[ 
https://issues.apache.org/jira/browse/IMPALA-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231143#comment-17231143
 ] 

Tim Armstrong commented on IMPALA-10317:
----------------------------------------

[~amansinha] we've generally preferred limits based on runtime information to 
avoid false positives/negatives. I think here a runtime limit is probably 
important to handle the case where a many:many exploding join is incorrectly 
detected to be a 1:many pk/fk join. We could also potentially add a planner 
check but 

I think it's a pretty complex discussion, and the current state of things in 
Impala is pretty heavily a product of my own biases. My understanding is that 
other systems have used estimates more heavily in resource limits, and I guess 
it works out for them.

> Add query option that limits join #rows at runtime
> --------------------------------------------------
>
>                 Key: IMPALA-10317
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10317
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Fucun Chu
>            Assignee: Fucun Chu
>            Priority: Major
>         Attachments: query82_summary.png
>
>
> Reject queries whose join operators produce too many rows during execution.
> This is a mechanism to protect the cluster from potentially harmful queries.
> When the tables are very large and the join conditions are poorly 
> selective, the join can produce a very large number of rows, sometimes tens 
> of billions, which degrades the cluster and other running queries.
> In our environment, we added the NUM_JOIN_ROWS_PRODUCED_LIMIT query option to 
> limit the number of rows produced by a single join operator.
> The implementation follows 
> [IMPALA-6034|https://issues.apache.org/jira/browse/IMPALA-6034] and uses the 
> summary (see the figure below) to check each join operator's #rows.
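
As a rough illustration of the runtime check described above, here is a minimal
Python sketch of a per-join row counter that fails the query once a configured
limit is exceeded. It is not Impala's implementation: the names
JoinRowCounter, JoinRowsLimitExceeded and run_join are hypothetical, and
treating a limit of 0 as "no limit" is an assumption made for this example
only.

```python
from dataclasses import dataclass


class JoinRowsLimitExceeded(RuntimeError):
    """Raised when a join operator produces more rows than the limit allows."""


@dataclass
class JoinRowCounter:
    """Tracks rows produced by one join operator (hypothetical helper)."""

    limit: int  # assumption for this sketch: 0 means "no limit"
    rows_produced: int = 0

    def add_batch(self, num_rows: int) -> None:
        """Record one output batch and fail once the limit is exceeded."""
        self.rows_produced += num_rows
        if self.limit > 0 and self.rows_produced > self.limit:
            raise JoinRowsLimitExceeded(
                f"join produced {self.rows_produced} rows, limit is {self.limit}"
            )


def run_join(batch_sizes, limit):
    """Simulate a join emitting batches; return total rows if within the limit."""
    counter = JoinRowCounter(limit=limit)
    for num_rows in batch_sizes:
        counter.add_batch(num_rows)
    return counter.rows_produced
```

Because the counter is checked as each batch is produced, the decision relies
on observed row counts rather than planner estimates, so a join that was
misclassified at planning time would still be stopped once it actually exceeds
the limit.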



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
