[ 
https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801123#comment-13801123
 ] 

Hudson commented on HIVE-4957:
------------------------------

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #209 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/209/])
HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory 
(Shreepadma Venugopalan via Brock Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1534337)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
* /hive/trunk/ql/src/test/queries/clientnegative/compute_stats_long.q
* /hive/trunk/ql/src/test/results/clientnegative/compute_stats_long.q.out


> Restrict number of bit vectors, to prevent out of Java heap memory
> ------------------------------------------------------------------
>
>                 Key: HIVE-4957
>                 URL: https://issues.apache.org/jira/browse/HIVE-4957
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Brock Noland
>            Assignee: Shreepadma Venugopalan
>             Fix For: 0.13.0
>
>         Attachments: HIVE-4957.1.patch, HIVE-4957.2.patch
>
>
> Normally, increasing the number of bit vectors increases calculation accuracy. 
> For example,
> {noformat}
> select compute_stats(a, 40) from test_hive;
> {noformat}
> generally gives better accuracy than
> {noformat}
> select compute_stats(a, 16) from test_hive;
> {noformat}
> But a larger number of bit vectors also makes the query run slower. Once the 
> number of bit vectors is over 50, raising it no longer improves accuracy, but it 
> still increases memory usage, and it crashes Hive if the number is too large. 
> Hive currently does not prevent a user from passing a ridiculously large number 
> of bit vectors in a 'compute_stats' query.
> One example
> {noformat}
> select compute_stats(a, 999999999) from column_eight_types;
> {noformat}
> crashes Hive.
> {noformat}
> 2012-12-20 23:21:52,247 Stage-1 map = 0%,  reduce = 0%
> 2012-12-20 23:22:11,315 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 0.29 sec
> MapReduce Total cumulative CPU time: 290 msec
> Ended Job = job_1354923204155_0777 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://cs-10-20-81-171.cloud.cloudera.com:8088/proxy/application_1354923204155_0777/
> Examining task ID: task_1354923204155_0777_m_000000 (and more) from job job_1354923204155_0777
> Task with the most failures(4): 
> -----
> Task ID:
>   task_1354923204155_0777_m_000000
> URL:
>   http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1354923204155_0777&tipid=task_1354923204155_0777_m_000000
> -----
> Diagnostic Messages for this Task:
> Error: Java heap space
> {noformat}
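
For readers following the description above, here is a minimal sketch of the kind of guard the issue title asks for: reject an oversized bit-vector count before any memory is allocated. It is not taken from the committed patch. The class name, the method name, and the cap of 1024 are all assumptions made for illustration; the description only says that accuracy stops improving past roughly 50 bit vectors.

```java
// Hypothetical sketch: these names are illustrative and do not come from the Hive source tree.
final class BitVectorLimit {

    // Assumed upper bound, chosen for illustration only.
    static final int MAX_BIT_VECTORS = 1024;

    // Returns the requested count unchanged if it is within bounds,
    // otherwise fails fast before any bit vectors would be allocated.
    static int validateNumBitVectors(int requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException(
                "Number of bit vectors must be positive, but was " + requested);
        }
        if (requested > MAX_BIT_VECTORS) {
            throw new IllegalArgumentException(
                "Number of bit vectors must not exceed " + MAX_BIT_VECTORS
                    + ", but was " + requested);
        }
        return requested;
    }

    public static void main(String[] args) {
        // A reasonable value passes through unchanged.
        System.out.println(validateNumBitVectors(40));

        // The value from the crashing query is rejected up front.
        try {
            validateNumBitVectors(999999999);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check of this shape, a query such as compute_stats(a, 999999999) would presumably fail with a clear error message instead of exhausting the Java heap in the map task.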



--
This message was sent by Atlassian JIRA
(v6.1#6144)
