[jira] [Commented] (HIVE-16513) width_bucket issues

Carter Shanklin (JIRA) Sun, 23 Apr 2017 18:29:32 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980618#comment-15980618
 ]


Carter Shanklin commented on HIVE-16513:
----------------------------------------

Another update, the case expression thing turns out to be another divide by 
zero thing that got swallowed up.

{code}
h exception 
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error 
evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 
3)) END, 10)
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 
3)) END, 10)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2154)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 
10)
        at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442)
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
        ... 13 more
Caused by: java.lang.ArithmeticException: / by zero
        at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFWidthBucket.evaluate(GenericUDFWidthBucket.java:75)
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187)
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorHead._evaluate(ExprNodeEvaluatorHead.java:44)
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
        at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
        at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
        ... 18 more
{code}

> width_bucket issues
> -------------------
>
>                 Key: HIVE-16513
>                 URL: https://issues.apache.org/jira/browse/HIVE-16513
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Carter Shanklin
>
> width_bucket was recently added with HIVE-15982. This ticket notes a few 
> issues.
> Usability issue:
> Currently only accepts integral numeric types. Decimals, floats and doubles 
> are not supported.
> Runtime failures: This query will cause a runtime divide-by-zero in the 
> reduce stage.
> select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1;
> The divide-by-zero seems to trigger any time I use a group-by. Here's another 
> example (that actually requires the group-by):
> select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1;
> Advanced Usage Issues:
> Suppose you have a table e011_01 as follows:
> create table e011_01 (c1 integer, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> Compile-time problems:
> You cannot use simple case expressions, searched case expressions or grouping 
> sets. These queries fail:
> select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) 
> from e011_01;
> select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as 
> integer)) from e011_02 group by cube(c1, c2);
> I'll admit the grouping one is pretty contrived but the case ones seem 
> straightforward, valid, and it's strange that they don't work. Similar 
> queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe 
> [~ashutoshc] can lend some perspective on that?
> Interestingly, you can use window functions in width_bucket, example:
> select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01;
> works just fine. Hopefully we can get to a place where people implementing 
> functions like this don't need to think about value expression support but we 
> don't seem to be there yet.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16513) width_bucket issues

Reply via email to