[jira] [Commented] (HIVE-4804) parallel order by fails for small datasets

Navis (JIRA) Sun, 07 Jul 2013 19:32:41 -0700

    [ https://issues.apache.org/jira/browse/HIVE-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701739#comment-13701739 ]


Navis commented on HIVE-4804:
-----------------------------

I used kv5.txt as the input data (it has 25 rows), and the default sampling rate 
was 0.1f. With 4 reducers, the sampler needs 3 unique keys, but with so few rows 
at that sampling rate, random sampling could not reliably meet this requirement. 

All I did in the latest patch was to separate the two conditions by setting the 
sampling rate: one that has enough samples (0.66) and one that does not (0.0000.1). 
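
To make the numbers above concrete, here is a rough Python sketch of how likely a sampler is to collect enough keys. It is not Hive's sampler: it assumes each row is picked independently with probability equal to the sampling rate and that all 25 keys are distinct (duplicate keys would make things worse). The function name prob_enough_samples is hypothetical.

```python
from math import comb


def prob_enough_samples(total_rows, sampling_rate, num_reducers):
    """Probability that independent per-row sampling yields at least
    (num_reducers - 1) samples, i.e. enough candidate split keys.

    Assumption (not taken from Hive): every row is sampled independently
    and every key is distinct, so the sample count is Binomial(n, p).
    """
    needed = num_reducers - 1
    # Probability of drawing fewer samples than needed.
    p_too_few = sum(
        comb(total_rows, k)
        * sampling_rate ** k
        * (1 - sampling_rate) ** (total_rows - k)
        for k in range(needed)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    for rate in (0.1, 0.66):
        p = prob_enough_samples(25, rate, 4)
        print(f"rate {rate}: P(at least 3 samples) = {p:.3f}")
```

Under these assumptions, the 0.1 rate succeeds a bit less than half the time on 25 rows, while 0.66 succeeds almost always, which matches the intent of the two test conditions.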

                
> parallel order by fails for small datasets
> ------------------------------------------
>
>                 Key: HIVE-4804
>                 URL: https://issues.apache.org/jira/browse/HIVE-4804
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4804.D11571.1.patch
>
>
> {noformat}
> java.lang.RuntimeException: Error in configuring object
>       at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>       at org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:481)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:416)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>       at org.apache.hadoop.mapred.Child.main(Child.java:260)
> Caused by: java.lang.reflect.InvocationTargetException
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:616)
>       at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>       ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
>       at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:91)
>       at org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
>       ... 15 more
> Caused by: java.io.IOException: Split points are out of order
>       at org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:78)
>       ... 16 more
> {noformat}
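
For readers unfamiliar with the innermost error in the trace, the following Python sketch paraphrases the kind of validation a total-order partitioner performs on its split points: there must be one fewer split point than reducers, and they must be strictly increasing. This is an illustration only, not the Hadoop source; validate_split_points and its arguments are hypothetical names.

```python
def validate_split_points(split_points, num_reducers):
    """Check split points the way a total-order partitioner might.

    Hypothetical paraphrase for illustration: N reducers need N - 1
    split points, and each one must be strictly greater than the last.
    """
    if len(split_points) != num_reducers - 1:
        raise ValueError("Wrong number of partitions in keyset")
    for previous, current in zip(split_points, split_points[1:]):
        # Equal neighbours are rejected too, so a sample with repeated
        # keys fails even though it is sorted.
        if previous >= current:
            raise ValueError("Split points are out of order")


if __name__ == "__main__":
    validate_split_points(["k10", "k20", "k30"], 4)  # passes silently
    try:
        validate_split_points(["k10", "k10", "k30"], 4)
    except ValueError as err:
        print(err)
```

With too few distinct sampled keys, the split points written for 4 reducers cannot be 3 strictly increasing values, which is consistent with the failure reported above.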

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
