[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4957:
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.13.0
           Status: Resolved (was: Patch Available)

Thank you for the contribution Shreepadma! I have committed this to trunk!

Restrict number of bit vectors, to prevent out of Java heap memory
------------------------------------------------------------------
                Key: HIVE-4957
                URL: https://issues.apache.org/jira/browse/HIVE-4957
            Project: Hive
         Issue Type: Bug
   Affects Versions: 0.11.0
           Reporter: Brock Noland
           Assignee: Shreepadma Venugopalan
            Fix For: 0.13.0
        Attachments: HIVE-4957.1.patch, HIVE-4957.2.patch

Normally, increasing the number of bit vectors increases calculation accuracy. For example,
{noformat}
select compute_stats(a, 40) from test_hive;
{noformat}
generally gives better accuracy than
{noformat}
select compute_stats(a, 16) from test_hive;
{noformat}
But a larger number of bit vectors also makes the query run slower. Once the number of bit vectors exceeds 50, accuracy no longer improves, but memory usage still increases, and Hive crashes if the number is too large. Hive currently does not prevent users from passing a ridiculously large number of bit vectors in a 'compute_stats' query.
One example:
{noformat}
select compute_stats(a, 9) from column_eight_types;
{noformat}
crashes Hive.
{noformat}
2012-12-20 23:21:52,247 Stage-1 map = 0%, reduce = 0%
2012-12-20 23:22:11,315 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.29 sec
MapReduce Total cumulative CPU time: 290 msec
Ended Job = job_1354923204155_0777 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://cs-10-20-81-171.cloud.cloudera.com:8088/proxy/application_1354923204155_0777/
Examining task ID: task_1354923204155_0777_m_00 (and more) from job job_1354923204155_0777
Task with the most failures(4):
-
Task ID:
  task_1354923204155_0777_m_00
URL:
  http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1354923204155_0777tipid=task_1354923204155_0777_m_00
-
Diagnostic Messages for this Task:
Error: Java heap space
{noformat}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
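The restriction requested in this issue amounts to validating the second argument of compute_stats before any bit vectors are allocated. The following is only a minimal sketch of that idea, not the code from HIVE-4957.1.patch or HIVE-4957.2.patch: the class name BitVectorLimit, the method checkNumBitVectors, and the cap value of 1024 are all assumptions made for illustration.

```java
// Hypothetical sketch: names and the cap value are illustrative assumptions,
// not taken from the committed Hive patch.
final class BitVectorLimit {

    // Assumed upper bound. The issue notes that accuracy stops improving
    // beyond roughly 50 bit vectors, so any cap well above that is generous.
    static final int MAX_BIT_VECTORS = 1024;

    // Returns the requested count unchanged if it is within bounds;
    // otherwise fails fast instead of letting the task exhaust the heap.
    static int checkNumBitVectors(int requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException(
                "Number of bit vectors must be positive, got " + requested);
        }
        if (requested > MAX_BIT_VECTORS) {
            throw new IllegalArgumentException(
                "Number of bit vectors " + requested
                + " exceeds the maximum allowed value " + MAX_BIT_VECTORS);
        }
        return requested;
    }

    public static void main(String[] args) {
        // A typical value from the issue description passes through.
        System.out.println(checkNumBitVectors(40));

        // An oversized request is rejected before any memory is allocated.
        try {
            checkNumBitVectors(1_000_000_000);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the value at argument-validation time turns the failure into an immediate, descriptive query error rather than a map task that retries four times and dies with "Java heap space".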
[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-4957:
-----------------------------------------
    Attachment: HIVE-4957.2.patch
[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-4957:
---------------------------------
    Status: Open (was: Patch Available)

Comments on reviewboard. Thanks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-4957:
-----------------------------------------
    Attachment: HIVE-4957.1.patch
[jira] [Updated] (HIVE-4957) Restrict number of bit vectors, to prevent out of Java heap memory
[ https://issues.apache.org/jira/browse/HIVE-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-4957:
-----------------------------------------
    Status: Patch Available (was: In Progress)