[ https://issues.apache.org/jira/browse/IMPALA-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balazs Jeszenszky updated IMPALA-7528:
--------------------------------------
    Description: 
The following:

{code:java}
| F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                     |
| Per-Host Resources: mem-estimate=33.94MB mem-reservation=1.94MB    |
| 02:HASH JOIN [INNER JOIN, BROADCAST]                               |
| |  hash predicates: b.code = a.code                                |
| |  fk/pk conjuncts: none                                           |
| |  runtime filters: RF000 <- a.code                                |
| |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB |
| |  tuple-ids=1,0 row-size=163B cardinality=9223372036854775807     |
| |                                                                  |
| |--03:EXCHANGE [BROADCAST]                                         |
| |  |  mem-estimate=0B mem-reservation=0B                           |
| |  |  tuple-ids=0 row-size=82B cardinality=823                     |
| |  |                                                               |
| |  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                  |
| |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B     |
| |  00:SCAN HDFS [default.sample_07 a, RANDOM]                      |
| |     partitions=1/1 files=1 size=44.98KB                          |
| |     stats-rows=823 extrapolated-rows=disabled                    |
| |     table stats: rows=823 size=44.98KB                           |
| |     column stats: all                                            |
| |     mem-estimate=32.00MB mem-reservation=0B                      |
| |     tuple-ids=0 row-size=82B cardinality=823                     |
| |                                                                  |
| 01:SCAN HDFS [default.sample_08 b, RANDOM]                         |
|    partitions=1/1 files=1 size=44.99KB                             |
|    runtime filters: RF000 -> b.code                                |
|    stats-rows=823 extrapolated-rows=disabled                       |
|    table stats: rows=823 size=44.99KB                              |
|    column stats: all                                               |
|    mem-estimate=32.00MB mem-reservation=0B                         |
|    tuple-ids=1 row-size=82B cardinality=823                        |
+--------------------------------------------------------------------+
{code}

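is the result of both join columns having an NDV (number of distinct values) of 0, which drives the estimated join cardinality to 9223372036854775807 (Long.MAX_VALUE).
The cardinality computation at https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/planner/JoinNode.java#L368
should handle this more gracefully.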

IMPALA-7310 makes it a bit more likely that someone will run into this. 
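
To illustrate how a cardinality of 9223372036854775807 can arise, here is a minimal sketch. It is not the actual JoinNode code: it assumes the generic join estimate is computed roughly as lhsCard * rhsCard / max(lhsNdv, rhsNdv) in floating point and then rounded to a long. The class name, method names, and the use of -1 as an "unknown cardinality" marker are all hypothetical.

```java
public class JoinCardinalitySketch {
    // Hypothetical stand-in for a generic join estimate: |L| * |R| / max(NDV).
    static long naiveJoinCardinality(long lhsCard, long rhsCard,
                                     double lhsNdv, double rhsNdv) {
        // If both NDVs are 0, the double division yields Infinity,
        // and Math.round(Infinity) saturates to Long.MAX_VALUE.
        return Math.round((double) lhsCard * rhsCard / Math.max(lhsNdv, rhsNdv));
    }

    // One possible guard (an assumption, not the project's fix):
    // report the estimate as unknown when no usable NDV is available.
    static long guardedJoinCardinality(long lhsCard, long rhsCard,
                                       double lhsNdv, double rhsNdv) {
        double maxNdv = Math.max(lhsNdv, rhsNdv);
        if (maxNdv <= 0) {
            return -1; // hypothetical marker for "unknown cardinality"
        }
        return Math.round((double) lhsCard * rhsCard / maxNdv);
    }

    public static void main(String[] args) {
        // Row counts taken from the plan above (823 rows on each side), NDV = 0.
        System.out.println(naiveJoinCardinality(823, 823, 0, 0));
        System.out.println(guardedJoinCardinality(823, 823, 0, 0));
    }
}
```

Under these assumptions, the naive version reproduces the Long.MAX_VALUE estimate seen in the plan, while the guarded version falls back to an explicit "unknown" value. Clamping the NDV to at least 1 would be another option; which behaviour is preferable depends on how the planner treats unknown cardinalities downstream.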

  was:
The following:

{code:java}
| F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                     |
| Per-Host Resources: mem-estimate=33.94MB mem-reservation=1.94MB    |
| 02:HASH JOIN [INNER JOIN, BROADCAST]                               |
| |  hash predicates: b.code = a.code                                |
| |  fk/pk conjuncts: none                                           |
| |  runtime filters: RF000 <- a.code                                |
| |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB |
| |  tuple-ids=1,0 row-size=163B cardinality=9223372036854775807     |
| |                                                                  |
| |--03:EXCHANGE [BROADCAST]                                         |
| |  |  mem-estimate=0B mem-reservation=0B                           |
| |  |  tuple-ids=0 row-size=82B cardinality=823                     |
| |  |                                                               |
| |  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                  |
| |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B     |
| |  00:SCAN HDFS [default.sample_07 a, RANDOM]                      |
| |     partitions=1/1 files=1 size=44.98KB                          |
| |     stats-rows=823 extrapolated-rows=disabled                    |
| |     table stats: rows=823 size=44.98KB                           |
| |     column stats: all                                            |
| |     mem-estimate=32.00MB mem-reservation=0B                      |
| |     tuple-ids=0 row-size=82B cardinality=823                     |
| |                                                                  |
| 01:SCAN HDFS [default.sample_08 b, RANDOM]                         |
|    partitions=1/1 files=1 size=44.99KB                             |
|    runtime filters: RF000 -> b.code                                |
|    stats-rows=823 extrapolated-rows=disabled                       |
|    table stats: rows=823 size=44.99KB                              |
|    column stats: all                                               |
|    mem-estimate=32.00MB mem-reservation=0B                         |
|    tuple-ids=1 row-size=82B cardinality=823                        |
+--------------------------------------------------------------------+
{code}

is the result of both join columns having 0 as NDV.
https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/planner/JoinNode.java#L368
should handle a potential division by zero.

IMPALA-7310 makes it a bit more likely that someone will run into this. 


> Division by zero when computing cardinalities of many to many joins on NULL 
> columns
> -----------------------------------------------------------------------------------
>
>                 Key: IMPALA-7528
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7528
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.12.0
>            Reporter: Balazs Jeszenszky
>            Priority: Major
>
> The following:
> {code:java}
> | F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                     |
> | Per-Host Resources: mem-estimate=33.94MB mem-reservation=1.94MB    |
> | 02:HASH JOIN [INNER JOIN, BROADCAST]                               |
> | |  hash predicates: b.code = a.code                                |
> | |  fk/pk conjuncts: none                                           |
> | |  runtime filters: RF000 <- a.code                                |
> | |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB |
> | |  tuple-ids=1,0 row-size=163B cardinality=9223372036854775807     |
> | |                                                                  |
> | |--03:EXCHANGE [BROADCAST]                                         |
> | |  |  mem-estimate=0B mem-reservation=0B                           |
> | |  |  tuple-ids=0 row-size=82B cardinality=823                     |
> | |  |                                                               |
> | |  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                  |
> | |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B     |
> | |  00:SCAN HDFS [default.sample_07 a, RANDOM]                      |
> | |     partitions=1/1 files=1 size=44.98KB                          |
> | |     stats-rows=823 extrapolated-rows=disabled                    |
> | |     table stats: rows=823 size=44.98KB                           |
> | |     column stats: all                                            |
> | |     mem-estimate=32.00MB mem-reservation=0B                      |
> | |     tuple-ids=0 row-size=82B cardinality=823                     |
> | |                                                                  |
> | 01:SCAN HDFS [default.sample_08 b, RANDOM]                         |
> |    partitions=1/1 files=1 size=44.99KB                             |
> |    runtime filters: RF000 -> b.code                                |
> |    stats-rows=823 extrapolated-rows=disabled                       |
> |    table stats: rows=823 size=44.99KB                              |
> |    column stats: all                                               |
> |    mem-estimate=32.00MB mem-reservation=0B                         |
> |    tuple-ids=1 row-size=82B cardinality=823                        |
> +--------------------------------------------------------------------+
> {code}
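> is the result of both join columns having an NDV (number of distinct values) of 0, which drives the estimated join cardinality to 9223372036854775807 (Long.MAX_VALUE).
> The cardinality computation at https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/planner/JoinNode.java#L368
> should handle this more gracefully.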
> IMPALA-7310 makes it a bit more likely that someone will run into this. 


