[jira] [Commented] (HIVE-15565) LLAP: GroupByOperator flushes hash table too frequently
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831308#comment-15831308 ]

Hive QA commented on HIVE-15565:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848220/HIVE-15565.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10963 tests executed

*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[cluster_tasklog_retrieval] (batchId=87)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[minimr_broken_pipe] (batchId=87)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3060/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3060/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3060/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848220 - PreCommit-HIVE-Build

> LLAP: GroupByOperator flushes hash table too frequently
> ---
>
>                 Key: HIVE-15565
>                 URL: https://issues.apache.org/jira/browse/HIVE-15565
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15565.1.patch, HIVE-15565.2.patch
>
>
> {{GroupByOperator::isTez}} would be true in LLAP mode. Current memory computations can go wrong with {{isTez}} checks in {{GroupByOperator}}. For e.g, in a LLAP instance with Xmx128G and 12 executors, it would start flushing hash table for every record once it reaches around 42GB (hive.tez.container.size=7100, hive.map.aggr.hash.percentmemory=0.5).
> {noformat}
> 2017-01-08T23:40:21,339 INFO [TezTaskRunner (1480722417364_1922_7_03_04_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Table flushed: new size = 0
> 2017-01-08T23:40:21,339 INFO [TezTaskRunner (1480722417364_1922_7_03_12_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Table flushed: new size = 0
> 2017-01-08T23:40:21,339 INFO [TezTaskRunner (1480722417364_1922_7_03_04_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Tbl flush: #hash table = 1
> 2017-01-08T23:40:21,339 INFO [TezTaskRunner (1480722417364_1922_7_03_12_1)] org.apache.hadoop.hive.ql.exec.GroupByOperator: Hash Tbl flush: #hash table = 1
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829282#comment-15829282 ]

Prasanth Jayachandran commented on HIVE-15565:
--

This delays the memory check, hoping memory will be freed up in the meantime. Freeing is not guaranteed, though, and may not happen at all because of the on-heap metadata cache and allocations by other executors. LGTM, +1. Pending tests.
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829231#comment-15829231 ]

Rajesh Balamohan commented on HIVE-15565:
-

Reverted the patch. Will post a separate patch for checking "numEntriesHashTable==0" for LLAP.
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15829228#comment-15829228 ]

Rajesh Balamohan commented on HIVE-15565:
-

Had an offline discussion with [~prasanth_j] on this. We do not need to flush the hash table when {{numEntriesHashTable == 0}} in LLAP. We can revert the existing patch and check for this condition in LLAP, which would reduce the number of flushes by a large margin.
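The fix direction described in this comment can be sketched as a small standalone check; the class, method, and parameter names below are illustrative only, not the actual Hive code:

```java
// Hedged sketch of the discussed fix: in LLAP, skip the flush entirely when
// the in-memory aggregation hash table is already empty. Illustrative names.
public class EmptyTableFlushSketch {
    static boolean needsFlush(int numEntriesHashTable, boolean isLlap,
                              boolean memoryPressure) {
        if (isLlap && numEntriesHashTable == 0) {
            // Nothing to flush; avoids the per-record flush storm seen in the logs.
            return false;
        }
        return memoryPressure;
    }

    public static void main(String[] args) {
        System.out.println(needsFlush(0, true, true));  // false: table is empty
        System.out.println(needsFlush(5, true, true));  // true: entries + pressure
    }
}
```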
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828993#comment-15828993 ]

Siddharth Seth commented on HIVE-15565:
---

After the patch, we won't flush in LLAP GroupBy, correct? used/numExecutors will always be < maxMemory? Before the patch it was random, depending on the other operators running in the system. https://issues.apache.org/jira/browse/HIVE-15508 is pretty important - tracking memory per operator should help with this.
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824756#comment-15824756 ]

Rajesh Balamohan commented on HIVE-15565:
-

As per the old implementation, maxMemory in {{GroupByOperator}} was 5,312,784,896 bytes. After running a few queries (e.g. q22, q18, q70, q67), heap usage increases to ~64GB. Now when q22 is rerun, {{GroupByOperator::shouldBeFlushed}} would always return true for every record due to the following calculation:

{noformat}
usedMemory = isLlap ? usedMemory / numExecutors : usedMemory;
rate = (float) usedMemory / (float) maxMemory;
// 64GB / 12 executors = 5.33GB used vs maxMemory of 5,312,784,896 bytes,
// so rate exceeds the memoryThreshold of 0.9
if (rate > memoryThreshold) {
  return true;
}
{noformat}

Even though the q22 dataset is very small, it would end up flushing for every row. The patch fixes the isTez/isLLAP assumption, but we still need HIVE-15508 for exact memory tracking.
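The arithmetic in the comment above can be reproduced with a small standalone sketch, using the numbers quoted in the comment (64GB JVM-wide heap usage, 12 executors, maxMemory of 5,312,784,896 bytes, threshold 0.9). Names are illustrative, not the actual Hive code:

```java
// Illustrative sketch of the problematic rate check: usedMemory is JVM-wide
// heap usage, yet even after dividing by the executor count it is compared
// against a per-container maxMemory, so the check trips on every record.
public class FlushCheckSketch {
    static boolean shouldBeFlushed(long usedMemory, long maxMemory,
                                   boolean isLlap, int numExecutors,
                                   float memoryThreshold) {
        long perExecutorUsed = isLlap ? usedMemory / numExecutors : usedMemory;
        float rate = (float) perExecutorUsed / (float) maxMemory;
        return rate > memoryThreshold;
    }

    public static void main(String[] args) {
        long used = 64L * 1024 * 1024 * 1024; // ~64GB heap usage
        long max = 5_312_784_896L;            // maxMemory from the comment
        // 64GB / 12 = ~5.33GB, rate ~1.08 > 0.9, so every record triggers a flush
        System.out.println(shouldBeFlushed(used, max, true, 12, 0.9f)); // true
    }
}
```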
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824655#comment-15824655 ]

Sergey Shelukhin commented on HIVE-15565:
-

Looking at it again (and at [~prasanth_j]'s comment, which I cannot quite understand)... what is the task-configured memory that we are no longer using? Is it set based on the Tez executor size (or equivalent) in MapRecordProcessor? It's set from the Tez context. Also, why is this causing single-record flushes if we base the check on task memory?
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824647#comment-15824647 ]

Gunther Hagleitner commented on HIVE-15565:
---

I don't understand this - isn't the commit doing the exact opposite of what [~prasanth_j] says? This flushes when the JVM memory limit is hit vs. the max allocated memory per executor. This would have a lot of bad side effects from multiple queries/fragments stepping on one another...
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813221#comment-15813221 ]

Sergey Shelukhin commented on HIVE-15565:
-

+1
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15811206#comment-15811206 ]

Hive QA commented on HIVE-15565:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12846265/HIVE-15565.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10903 tests executed

*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=233)
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=122)
	[groupby3_map.q,union26.q,mapreduce1.q,mapjoin_addjar.q,bucket_map_join_spark1.q,udf_example_add.q,multi_insert_with_join.q,sample7.q,auto_join_nulls.q,ppd_outer_join4.q,load_dyn_part8.q,alter_merge_orc.q,sample6.q,bucket_map_join_1.q,auto_sortmerge_join_9.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[case_sensitivity] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_testxpath] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_coalesce] (batchId=75)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=146)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2839/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2839/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2839/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12846265 - PreCommit-HIVE-Build
[ https://issues.apache.org/jira/browse/HIVE-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15811045#comment-15811045 ]

Prasanth Jayachandran commented on HIVE-15565:
--

For LLAP, the GBY hashtable memory should be shared by all executors. The problem is with tracking memory usage per thread; there is already an open jira for this: HIVE-15508. Ideally, we want to track memory usage per executor and flush based on the ratio of usage to the max memory available per executor.
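The per-executor budgeting suggested above can be sketched as follows; this is an extrapolation of the comment, not the actual Hive code, and the names and the even-split policy are assumptions:

```java
// Illustrative sketch: give each executor an equal share of the daemon heap
// and apply the aggregation memory percentage to that share, instead of
// comparing JVM-wide usage against a per-container limit.
public class PerExecutorBudgetSketch {
    static long perExecutorHashMemory(long daemonXmx, int numExecutors,
                                      double hashPercentMemory) {
        long perExecutorHeap = daemonXmx / numExecutors; // even split (assumed)
        return (long) (perExecutorHeap * hashPercentMemory);
    }

    public static void main(String[] args) {
        // Xmx128G, 12 executors, hive.map.aggr.hash.percentmemory=0.5
        long budget = perExecutorHashMemory(137438953472L, 12, 0.5);
        System.out.println(budget); // ~5.7GB per executor
    }
}
```

With the issue's numbers (Xmx128G, 12 executors, percentmemory 0.5) this yields roughly 5.7GB per executor, rather than the ~42GB JVM-wide trigger described in the issue.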