[ 
https://issues.apache.org/jira/browse/HIVE-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-9755:
--------------------------------
    Attachment: HIVE-9755.patch

During the reduce phase, the merge() method of the ngram UDAF should be a no-op
when a mapper returns an empty set. A list containing one and only one item,
with the value zero, indicates that the iterate() method was never called in
that map task, so the attached patch returns from merge() without taking any
action.
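
The guard described above can be sketched roughly as follows. This is a
simplified, standalone illustration and not the contents of HIVE-9755.patch:
the class NGramMergeSketch, its Buffer type, and the assumption that the last
list element carries 'n' are all hypothetical stand-ins for the real
GenericUDAFnGramEvaluator state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the ngram UDAF merge step (not Hive code).
class NGramMergeSketch {

    // Stand-in aggregation buffer; n == 0 means "not initialized yet".
    static class Buffer {
        int n = 0;
        List<String> ngrams = new ArrayList<>();
    }

    // A partial holding exactly one "0" is treated as the marker that
    // iterate() never ran on that mapper (assumption based on the comment).
    static boolean isEmptyPartial(List<String> partial) {
        return partial == null
            || partial.isEmpty()
            || (partial.size() == 1 && "0".equals(partial.get(0)));
    }

    static void merge(Buffer agg, List<String> partial) {
        if (isEmptyPartial(partial)) {
            return; // no-op: nothing to merge from an empty mapper
        }
        // Assumed layout: n-grams first, then 'n' as the final element.
        int n = Integer.parseInt(partial.get(partial.size() - 1));
        if (agg.n > 0 && agg.n != n) {
            throw new IllegalStateException(
                "mismatch in value for 'n': found '" + n + "' and '" + agg.n + "'");
        }
        agg.n = n;
        agg.ngrams.addAll(partial.subList(0, partial.size() - 1));
    }

    public static void main(String[] args) {
        Buffer agg = new Buffer();
        merge(agg, Arrays.asList("a b c", "b c d", "3")); // mapper with matches
        merge(agg, Arrays.asList("0"));                    // mapper with no matches
        System.out.println(agg.n + " " + agg.ngrams);
    }
}
```

In this sketch, removing the isEmptyPartial() check makes the second merge()
call parse 'n' as 0 and raise the mismatch exception, which is the same kind of
failure reported in the issue below.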

> Hive built-in "ngram" UDAF fails when a mapper has no matches.
> --------------------------------------------------------------
>
>                 Key: HIVE-9755
>                 URL: https://issues.apache.org/jira/browse/HIVE-9755
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 0.14.0
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>            Priority: Critical
>         Attachments: HIVE-9755.patch
>
>
> hive> describe ngramtest;
> OK
> col1                  int                                         
> col3                  string                                      
> Time taken: 0.192 seconds, Fetched: 2 row(s)
> SELECT explode(ngrams(sentences(lower(t.col3)), 3, 10)) as x FROM (SELECT 
> col3  FROM ngramtest WHERE col1=0) t;
> The query returns the following error when any result is null:
> 2015-01-08 09:15:00,262 FATAL ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":["0","0","0","0"]},"alias":0}
> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:258)
> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: GenericUDAFnGramEvaluator: mismatch in value for 'n', which usually is caused by a non-constant expression. Found '0' and '1'.
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFnGrams$GenericUDAFnGramEvaluator.merge(GenericUDAFnGrams.java:242)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:142)
> at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:658)
> at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:911)
> at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:753)
> at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:819)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
