[jira] [Updated] (HIVE-21369) LLAP: Logging is expensive in encoded reader path
[ https://issues.apache.org/jira/browse/HIVE-21369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-21369: --- Attachment: HIVE-21369.patch.2 > LLAP: Logging is expensive in encoded reader path > - > > Key: HIVE-21369 > URL: https://issues.apache.org/jira/browse/HIVE-21369 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0, 3.2.0 >Reporter: Prasanth Jayachandran >Assignee: Nita Dembla >Priority: Major > Attachments: HIVE-21369.patch, HIVE-21369.patch.2 > > > There should be no INFO logging in EncodedReaderImpl. Stringifying of disk > ranges is expensive in core read path. > {code:java} > 2019-03-01T17:55:56.322852142Z 2019-03-01T17:55:56,306 INFO > [IO-Elevator-Thread-3 > (hive_20190301175546_a279f33c-4f2b-4cd5-8695-57bc8b042a61)] > encoded.EncodedReaderImpl: Disk ranges after cache (found everything true; > file [-3693547618692831801, 1551190876000, 1047660824], base offset > 792920167): [{start: 887940 end: 1003508 cache buffer: 0x5165f83d(1)}, > {start: 1003508 end: 1119078 cache buffer: 0xb63cac3(1)}, {start: 1119078 > end: 1234745 cache buffer: 0x41a724fa(1)}, {start: 1234745 end: 1350261 cache > buffer: 0x2f71bc38(1)}, {start: 1350261 end: 1465752 cache buffer: > 0x2c38e1bb(1)}, {start: 1465752 end: 1581231 cache buffer: 0x5827982(1)}, > {start: 1581231 end: 1696885 cache buffer: 0x75a6773c(1)}, {start: 1696885 > end: 1812492 cache buffer: 0x2ed060f9(1)},{start: 1812492 end: 1928086 cache > buffer: 0x20b2c8aa(1)}, {start: 1928086 end: 2043588 cache buffer: > 0x6559aacb(1)}, {start: 2043588 end: 2159089 cache buffer: 0x569c85e1(1)}, > {start: 2159089 end: 2274725 cache buffer: 0x25a88dd0(1)}, {start: 2274725 > end: 2390228 cache buffer: 0x738b7e87(1)}, {start: 2390228 end: 2505715 cache > buffer: 0x26edafa0(1)}, {start: 2505715 end: 2621322 cache buffer: > 0x69db7752(1)}, {start: 2621322 end: 2736844 cache b{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21369) LLAP: Logging is expensive in encoded reader path
[ https://issues.apache.org/jira/browse/HIVE-21369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-21369: --- Attachment: HIVE-21369.patch Status: Patch Available (was: Open) > LLAP: Logging is expensive in encoded reader path > - > > Key: HIVE-21369 > URL: https://issues.apache.org/jira/browse/HIVE-21369 > Project: Hive > Issue Type: Bug > Components: Logging >Affects Versions: 4.0.0, 3.2.0 >Reporter: Prasanth Jayachandran >Assignee: Nita Dembla >Priority: Major > Attachments: HIVE-21369.patch > > > There should be no INFO logging in EncodedReaderImpl. Stringifying of disk > ranges is expensive in core read path. > {code:java} > 2019-03-01T17:55:56.322852142Z 2019-03-01T17:55:56,306 INFO > [IO-Elevator-Thread-3 > (hive_20190301175546_a279f33c-4f2b-4cd5-8695-57bc8b042a61)] > encoded.EncodedReaderImpl: Disk ranges after cache (found everything true; > file [-3693547618692831801, 1551190876000, 1047660824], base offset > 792920167): [{start: 887940 end: 1003508 cache buffer: 0x5165f83d(1)}, > {start: 1003508 end: 1119078 cache buffer: 0xb63cac3(1)}, {start: 1119078 > end: 1234745 cache buffer: 0x41a724fa(1)}, {start: 1234745 end: 1350261 cache > buffer: 0x2f71bc38(1)}, {start: 1350261 end: 1465752 cache buffer: > 0x2c38e1bb(1)}, {start: 1465752 end: 1581231 cache buffer: 0x5827982(1)}, > {start: 1581231 end: 1696885 cache buffer: 0x75a6773c(1)}, {start: 1696885 > end: 1812492 cache buffer: 0x2ed060f9(1)},{start: 1812492 end: 1928086 cache > buffer: 0x20b2c8aa(1)}, {start: 1928086 end: 2043588 cache buffer: > 0x6559aacb(1)}, {start: 2043588 end: 2159089 cache buffer: 0x569c85e1(1)}, > {start: 2159089 end: 2274725 cache buffer: 0x25a88dd0(1)}, {start: 2274725 > end: 2390228 cache buffer: 0x738b7e87(1)}, {start: 2390228 end: 2505715 cache > buffer: 0x26edafa0(1)}, {start: 2505715 end: 2621322 cache buffer: > 0x69db7752(1)}, {start: 2621322 end: 2736844 cache b{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
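The cost described in this issue comes from building the log message, not from writing it: a concatenated argument is stringified before the logger checks its level. The sketch below illustrates that effect with the JDK's own `java.util.logging` rather than Hive's logging setup. `Range`, `LazyRangeLogging` and the logger name are hypothetical stand-ins, and this is not the content of HIVE-21369.patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyRangeLogging {
    // Counts toString() calls so the cost of stringifying is observable.
    static final AtomicInteger TO_STRING_CALLS = new AtomicInteger();

    // Hypothetical stand-in for a cached disk range (not Hive's own class).
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            TO_STRING_CALLS.incrementAndGet();
            return "{start: " + start + " end: " + end + "}";
        }
    }

    static final Logger LOG = Logger.getLogger("encoded.reader.sketch");

    // Eager: the message is concatenated before the level check happens.
    static void logEager(List<Range> ranges) {
        LOG.fine("Disk ranges after cache: " + ranges);
    }

    // Lazy: the Supplier is only invoked if FINE is enabled for this logger.
    static void logLazy(List<Range> ranges) {
        LOG.fine(() -> "Disk ranges after cache: " + ranges);
    }

    // Returns how many Range.toString() calls one log statement triggers
    // while FINE is disabled.
    static int stringifyCount(boolean lazy) {
        LOG.setLevel(Level.INFO);
        List<Range> ranges = Arrays.asList(
            new Range(887940L, 1003508L),
            new Range(1003508L, 1119078L),
            new Range(1119078L, 1234745L));
        TO_STRING_CALLS.set(0);
        if (lazy) {
            logLazy(ranges);
        } else {
            logEager(ranges);
        }
        return TO_STRING_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println("eager stringifications: " + stringifyCount(false));
        System.out.println("lazy stringifications: " + stringifyCount(true));
    }
}
```

With a facade such as SLF4J, the comparable options would be an explicit level guard or parameterized messages. Lowering the level of the statement, as the issue title suggests, avoids the work entirely at the default log level.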
[jira] [Updated] (HIVE-21005) LLAP: Reading more stripes per-split leaks ZlibCodecs
[ https://issues.apache.org/jira/browse/HIVE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-21005: --- Attachment: HIVE-21005.patch > LLAP: Reading more stripes per-split leaks ZlibCodecs > - > > Key: HIVE-21005 > URL: https://issues.apache.org/jira/browse/HIVE-21005 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Gopal V >Assignee: Nita Dembla >Priority: Major > Attachments: HIVE-21005.patch > > > OrcEncodedDataReader - calls ensureDataReader in a loop, overwriting itself > {code} > for (int stripeIxMod = 0; stripeIxMod < stripeRgs.length; ++stripeIxMod) { > > // 6.2. Ensure we have stripe metadata. We might have read it before > for RG filtering. > if (stripeMetadatas != null) { > stripeMetadata = stripeMetadatas.get(stripeIxMod); > } else { > ... > ensureDataReader(); > ... > } > {code} > {code} > private void ensureDataReader() throws IOException { > ... > stripeReader = orcReader.encodedReader( > fileKey, dw, dw, useObjectPools ? POOL_FACTORY : null, trace, > useCodecPool, cacheTag); > {code} > creates new encodedReader without closing previous stripe's encoded reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-21005) LLAP: Reading more stripes per-split leaks ZlibCodecs
[ https://issues.apache.org/jira/browse/HIVE-21005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla reassigned HIVE-21005: -- Assignee: Nita Dembla > LLAP: Reading more stripes per-split leaks ZlibCodecs > - > > Key: HIVE-21005 > URL: https://issues.apache.org/jira/browse/HIVE-21005 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Gopal V >Assignee: Nita Dembla >Priority: Major > > OrcEncodedDataReader - calls ensureDataReader in a loop, overwriting itself > {code} > for (int stripeIxMod = 0; stripeIxMod < stripeRgs.length; ++stripeIxMod) { > > // 6.2. Ensure we have stripe metadata. We might have read it before > for RG filtering. > if (stripeMetadatas != null) { > stripeMetadata = stripeMetadatas.get(stripeIxMod); > } else { > ... > ensureDataReader(); > ... > } > {code} > {code} > private void ensureDataReader() throws IOException { > ... > stripeReader = orcReader.encodedReader( > fileKey, dw, dw, useObjectPools ? POOL_FACTORY : null, trace, > useCodecPool, cacheTag); > {code} > creates new encodedReader without closing previous stripe's encoded reader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
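The leak pattern reported here, a field that is overwritten inside a loop without closing the old value, can be shown in isolation. The following is a minimal sketch under stated assumptions: `FakeReader` is a hypothetical stand-in for an encoded reader that holds a pooled codec, and closing the previous reader before creating the next one is only one possible remedy, not necessarily what HIVE-21005.patch does.

```java
final class StripeReaderSwap {
    // Hypothetical stand-in for a reader that holds a pooled resource.
    static final class FakeReader implements AutoCloseable {
        static int open = 0;      // number of readers currently not closed
        private boolean closed;

        FakeReader() {
            open++;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                open--;
            }
        }
    }

    private FakeReader stripeReader;

    // Leaky variant: the previous reader is dropped without being closed.
    void ensureReaderLeaky() {
        stripeReader = new FakeReader();
    }

    // Closes the previous stripe's reader before replacing it.
    void ensureReaderClosingPrevious() {
        if (stripeReader != null) {
            stripeReader.close();
        }
        stripeReader = new FakeReader();
    }

    // Final cleanup only reaches the reader that is still referenced.
    void close() {
        if (stripeReader != null) {
            stripeReader.close();
            stripeReader = null;
        }
    }

    // Simulates a split with several stripes and reports readers left open.
    static int openAfter(int stripes, boolean closePrevious) {
        FakeReader.open = 0;
        StripeReaderSwap split = new StripeReaderSwap();
        for (int i = 0; i < stripes; i++) {
            if (closePrevious) {
                split.ensureReaderClosingPrevious();
            } else {
                split.ensureReaderLeaky();
            }
        }
        split.close();
        return FakeReader.open;
    }

    public static void main(String[] args) {
        System.out.println("left open (leaky): " + openAfter(4, false));
        System.out.println("left open (closing previous): " + openAfter(4, true));
    }
}
```

In the leaky variant every reader except the last stays open, which matches the observation that the leak grows with the number of stripes read per split.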
[jira] [Assigned] (HIVE-19912) Schema evolution checks prints a log line in INFO mode for each vectorized rowbatch, impacts performance
[ https://issues.apache.org/jira/browse/HIVE-19912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla reassigned HIVE-19912: -- Assignee: Nita Dembla (was: Prasanth Jayachandran) > Schema evolution checks prints a log line in INFO mode for each vectorized > rowbatch, impacts performance > > > Key: HIVE-19912 > URL: https://issues.apache.org/jira/browse/HIVE-19912 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Nita Dembla >Assignee: Nita Dembla >Priority: Major > Fix For: 3.0.1 > > Attachments: HIVE-19912.1.patch > > > While benchmarking query96, noticed 17K log lines printed for each vector > rowbatch > > In file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java > > {code:java} > @@ -2554,8 +2554,8 @@ public static TypeDescription > getDesiredRowTypeDescr(Configuration conf, > } > if (haveSchemaEvolutionProperties) { > - if (LOG.isInfoEnabled()) { > - LOG.info("Using schema evolution configuration variables > schema.evolution.columns " + > + if (LOG.isDebugEnabled()) { > + LOG.debug("Using schema evolution configuration variables > schema.evolution.columns " + > schemaEvolutionColumnNames.toString() + > " / schema.evolution.columns.types " + > schemaEvolutionTypeDescrs.toString() +{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19912) Schema evolution checks prints a log line in INFO mode for each vectorized rowbatch, impacts performance
[ https://issues.apache.org/jira/browse/HIVE-19912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-19912: --- Assignee: Prasanth Jayachandran Attachment: HIVE-19912.1.patch Status: Patch Available (was: Open) > Schema evolution checks prints a log line in INFO mode for each vectorized > rowbatch, impacts performance > > > Key: HIVE-19912 > URL: https://issues.apache.org/jira/browse/HIVE-19912 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Nita Dembla >Assignee: Prasanth Jayachandran >Priority: Major > Fix For: 3.0.1 > > Attachments: HIVE-19912.1.patch > > > While benchmarking query96, noticed 17K log lines printed for each vector > rowbatch > > In file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java > > {code:java} > @@ -2554,8 +2554,8 @@ public static TypeDescription > getDesiredRowTypeDescr(Configuration conf, > } > if (haveSchemaEvolutionProperties) { > - if (LOG.isInfoEnabled()) { > - LOG.info("Using schema evolution configuration variables > schema.evolution.columns " + > + if (LOG.isDebugEnabled()) { > + LOG.debug("Using schema evolution configuration variables > schema.evolution.columns " + > schemaEvolutionColumnNames.toString() + > " / schema.evolution.columns.types " + > schemaEvolutionTypeDescrs.toString() +{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19500) Prevent multiple selectivity estimations for the same variable in conjuctions
[ https://issues.apache.org/jira/browse/HIVE-19500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478001#comment-16478001 ] Nita Dembla commented on HIVE-19500: I didn't get a good pass rate on TPCDS with the 2nd patch here. Although the change fixes query74, it causes OOMs in other queries. > Prevent multiple selectivity estimations for the same variable in conjuctions > - > > Key: HIVE-19500 > URL: https://issues.apache.org/jira/browse/HIVE-19500 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0, 3.1.0 >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-19500.01.patch, HIVE-19500.02.patch > > > see HIVE-19097 for problem description > for filters like: {{(d_year in (2001,2002) and d_year = 2001)}} the current > estimation is around {{(1/NDV)**2}} (iff column stats are available) > actually the source of the problem was a small typo in HIVE-17465 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
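To make the size of the {{(1/NDV)**2}} underestimate concrete, here is a small arithmetic sketch. It assumes a uniform value distribution, so a single equality predicate has selectivity 1/NDV, and it uses a hypothetical NDV of 200 and a hypothetical table of 1,000,000 rows. The class and method names are illustrative only and are not Hive's statistics code.

```java
final class ConjunctionSelectivity {
    // Selectivity of "col = constant" under a uniform-distribution assumption.
    static double equalitySelectivity(long ndv) {
        return 1.0 / ndv;
    }

    // What results when two predicates on the SAME column are treated as
    // independent and their selectivities are multiplied: (1/NDV)^2.
    static double multipliedSelectivity(long ndv) {
        double s = equalitySelectivity(ndv);
        return s * s;
    }

    // Row-count estimate, never below one row.
    static long estimatedRows(long tableRows, double selectivity) {
        return Math.max(1L, Math.round(tableRows * selectivity));
    }

    public static void main(String[] args) {
        long ndv = 200L;            // hypothetical number of distinct d_year values
        long rows = 1_000_000L;     // hypothetical table size

        // (d_year in (2001,2002) and d_year = 2001) is logically d_year = 2001.
        System.out.println("single predicate rows: "
            + estimatedRows(rows, equalitySelectivity(ndv)));
        System.out.println("multiplied estimate rows: "
            + estimatedRows(rows, multipliedSelectivity(ndv)));
    }
}
```

Under these assumptions the multiplied estimate is smaller than the single-predicate estimate by a factor of NDV, which is the kind of gap that can mislead join ordering and memory sizing.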
[jira] [Commented] (HIVE-18121) TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing tables partition info
[ https://issues.apache.org/jira/browse/HIVE-18121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261275#comment-16261275 ] Nita Dembla commented on HIVE-18121: [~vihangk1] Version - Hive 3.0. I don't have the fix for HIVE-17961 in my build. > TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing > tables partition info > --- > > Key: HIVE-18121 > URL: https://issues.apache.org/jira/browse/HIVE-18121 > Project: Hive > Issue Type: Bug > Components: File Formats, llap >Affects Versions: 3.0.0 >Reporter: Nita Dembla > Labels: parquet > > Testing TPCDS 1TB with LLAP Parquet cache. Ran into the following exception > {code} > 2017-11-21T00:53:33,605 ERROR [HiveServer2-Background-Pool: Thread-330] > ql.Driver: FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, > vertexId=vertex_1509557139747_0295_48_02, diagnostics=[Task failed, > taskId=task_1509557139747_0295_48_02_000105, diagnostics=[TaskAttempt 0 > failed, info=[Error: Error while running task ( failure ) : > attempt_1509557139747_0295_48_02_000105_0:java.lang.RuntimeException: > java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.io.IOException: > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:145) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662) > at > org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:525) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:171) > at > 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:253) > ... 15 more > Caused by: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) > at >
[jira] [Updated] (HIVE-18121) TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing tables partition info
[ https://issues.apache.org/jira/browse/HIVE-18121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-18121: --- Affects Version/s: 3.0.0 > TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing > tables partition info > --- > > Key: HIVE-18121 > URL: https://issues.apache.org/jira/browse/HIVE-18121 > Project: Hive > Issue Type: Bug > Components: File Formats, llap >Affects Versions: 3.0.0 >Reporter: Nita Dembla > Labels: parquet > > Testing TPCDS 1TB with LLAP Parquet cache. Ran into the following exception > {code} > 2017-11-21T00:53:33,605 ERROR [HiveServer2-Background-Pool: Thread-330] > ql.Driver: FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, > vertexId=vertex_1509557139747_0295_48_02, diagnostics=[Task failed, > taskId=task_1509557139747_0295_48_02_000105, diagnostics=[TaskAttempt 0 > failed, info=[Error: Error while running task ( failure ) : > attempt_1509557139747_0295_48_02_000105_0:java.lang.RuntimeException: > java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.io.IOException: > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:145) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662) > at > org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:525) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:171) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:253) > ... 
15 more > Caused by: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:418) > at >
[jira] [Commented] (HIVE-18121) TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing tables partition info
[ https://issues.apache.org/jira/browse/HIVE-18121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261184#comment-16261184 ] Nita Dembla commented on HIVE-18121: Other TPCDS queries having the same issue - 42, 52, 55, 68, 73 and 17. > TPCDS query 19 runs throws NPE in VectorizedParquetRecordReader initializing > tables partition info > --- > > Key: HIVE-18121 > URL: https://issues.apache.org/jira/browse/HIVE-18121 > Project: Hive > Issue Type: Bug > Components: File Formats, llap >Reporter: Nita Dembla > Labels: parquet > > Testing TPCDS 1TB with LLAP Parquet cache. Ran into the following exception > {code} > 2017-11-21T00:53:33,605 ERROR [HiveServer2-Background-Pool: Thread-330] > ql.Driver: FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, > vertexId=vertex_1509557139747_0295_48_02, diagnostics=[Task failed, > taskId=task_1509557139747_0295_48_02_000105, diagnostics=[TaskAttempt 0 > failed, info=[Error: Error while running task ( failure ) : > attempt_1509557139747_0295_48_02_000105_0:java.lang.RuntimeException: > java.lang.RuntimeException: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.io.IOException: > java.lang.RuntimeException: java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:145) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662) > at > org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:525) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:171) > at > 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:253) > ... 15 more > Caused by: java.io.IOException: java.lang.RuntimeException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:418) > at >
[jira] [Commented] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column
[ https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242589#comment-16242589 ] Nita Dembla commented on HIVE-18001: Similar problem happens with Foreign Key on partition key column. alter table catalog_sales add constraint cs_d2 foreign key (cs_sold_date_sk) references date_dim (d_date_sk) disable novalidate rely; > InvalidObjectException while creating Primary Key constraint on partition key > column > > > Key: HIVE-18001 > URL: https://issues.apache.org/jira/browse/HIVE-18001 > Project: Hive > Issue Type: Bug >Reporter: Nita Dembla >Assignee: Jesus Camacho Rodriguez > > {code} > hive> show create table inventory; > OK > CREATE TABLE `inventory`( > `inv_item_sk` bigint, > `inv_warehouse_sk` bigint, > `inv_quantity_on_hand` int) > PARTITIONED BY ( > `inv_date_sk` bigint) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' > LOCATION > > 'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1508284425') > Time taken: 0.25 seconds, Fetched: 16 row(s) > hive> alter table inventory add constraint pk_in primary key (inv_date_sk, > inv_item_sk, inv_warehouse_sk) disable novalidate rely; > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. 
InvalidObjectException(message:Parent > column not found: inv_date_sk) > {code} > Exception from the log > {code} > 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] > exec.DDLTask: Failed > org.apache.hadoop.hive.ql.metadata.HiveException: > InvalidObjectException(message:Parent column not found: inv_date_sk) > at > org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) > ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) > ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) > ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) > 
~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) > ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) > ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_112] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_112] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_112] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] > at org.apache.hadoop.util.RunJar.run(RunJar.java:233) > ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?] > at org.apache.hadoop.util.RunJar.main(RunJar.java:148) > ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?] > Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: > Parent column not found: inv_date_sk > at >
[jira] [Assigned] (HIVE-18001) InvalidObjectException while creating Primary Key constraint on partition key column
[ https://issues.apache.org/jira/browse/HIVE-18001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla reassigned HIVE-18001:
--
    Assignee: Jesus Camacho Rodriguez

> InvalidObjectException while creating Primary Key constraint on partition key column
>
>
>             Key: HIVE-18001
>             URL: https://issues.apache.org/jira/browse/HIVE-18001
>         Project: Hive
>      Issue Type: Bug
>        Reporter: Nita Dembla
>        Assignee: Jesus Camacho Rodriguez
>
> {code}
> hive> show create table inventory;
> OK
> CREATE TABLE `inventory`(
>   `inv_item_sk` bigint,
>   `inv_warehouse_sk` bigint,
>   `inv_quantity_on_hand` int)
> PARTITIONED BY (
>   `inv_date_sk` bigint)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://ctr-e134-1499953498516-233086-01-02.hwx.site:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1000.db/inventory'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1508284425')
> Time taken: 0.25 seconds, Fetched: 16 row(s)
> hive> alter table inventory add constraint pk_in primary key (inv_date_sk, inv_item_sk, inv_warehouse_sk) disable novalidate rely;
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException(message:Parent column not found: inv_date_sk)
> {code}
> Exception from the log
> {code}
> 2017-11-07T18:17:50,516 ERROR [d4ed6f97-20ea-4bc8-a046-b0646f483a20 main] exec.DDLTask: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: InvalidObjectException(message:Parent column not found: inv_date_sk)
>     at org.apache.hadoop.hive.ql.metadata.Hive.addPrimaryKey(Hive.java:4668) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.DDLTask.addConstraints(DDLTask.java:4356) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:413) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) ~[hive-cli-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:233) ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:148) ~[hadoop-common-2.7.3.2.6.2.0-205.jar:?]
> Caused by: org.apache.hadoop.hive.metastore.api.InvalidObjectException: Parent column not found: inv_date_sk
>     at org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4190) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.metastore.ObjectStore.addPrimaryKeys(ObjectStore.java:4163)
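The report above comes down to a primary key that includes the partition column. A reduced reproduction might look like the sketch below. The table name `pk_repro`, its columns, and the constraint name are hypothetical and not taken from the issue, and whether the statement fails will depend on the Hive build in use.

{code:sql}
-- Hypothetical minimal table: one regular column, one partition column.
CREATE TABLE pk_repro (item_sk bigint)
PARTITIONED BY (date_sk bigint)
STORED AS ORC;

-- Same shape as the failing statement in the report: the key lists the
-- partition column (date_sk) next to a regular column.
ALTER TABLE pk_repro ADD CONSTRAINT pk_repro_pk
  PRIMARY KEY (date_sk, item_sk) DISABLE NOVALIDATE RELY;
{code}

If the second statement fails with "Parent column not found: date_sk", the problem is reproduced without the TPC-DS schema.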
[jira] [Updated] (HIVE-17968) TPCDS query70 generates NPE in vectorization, works fine with vectorization disabled.
[ https://issues.apache.org/jira/browse/HIVE-17968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla updated HIVE-17968:
---
    Description:
Running into the following NPE while running query 70

{noformat}
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1509557139747_0023_2_07, diagnostics=[Task failed, taskId=task_1509557139747_0023_2_07_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:271)
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$8.writeValue(VectorExpressionWriterFactory.java:1061)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:1005)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:1082)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$800(VectorGroupByOperator.java:67)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.close(VectorGroupByOperator.java:856)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1123)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383)
    ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
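The issue summary says the query works with vectorization disabled. A session-level sketch of that workaround follows. It assumes the standard Hive properties `hive.vectorized.execution.enabled` and `hive.vectorized.execution.reduce.enabled`, neither of which is named in the report. Trying the reduce-side switch first is only a guess, based on the failure being in `VectorGroupByOperator` on a reducer.

{code:sql}
-- Assumption: disabling reduce-side vectorization alone may be enough,
-- since the NPE is raised while closing a vectorized reducer operator.
SET hive.vectorized.execution.reduce.enabled=false;

-- Broader fallback matching the summary: disable vectorization entirely
-- for this session, then rerun query 70.
SET hive.vectorized.execution.enabled=false;
{code}

Both settings apply to the current session only, so other queries keep the vectorized path.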
[jira] [Updated] (HIVE-17968) TPCDS query70 generates NPE in vectorization, works fine with vectorization disabled.
[ https://issues.apache.org/jira/browse/HIVE-17968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla updated HIVE-17968:
---
    Description:
Running into the following NPE while running query 70

{quote}
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1509557139747_0023_2_07, diagnostics=[Task failed, taskId=task_1509557139747_0023_2_07_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:271)
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$8.writeValue(VectorExpressionWriterFactory.java:1061)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:1005)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:1082)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$800(VectorGroupByOperator.java:67)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.close(VectorGroupByOperator.java:856)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1123)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383)
    ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
[jira] [Updated] (HIVE-17968) TPCDS query70 generates NPE in vectorization, works fine with vectorization disabled.
[ https://issues.apache.org/jira/browse/HIVE-17968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla updated HIVE-17968:
---
    Description:
Running into the following NPE while running query 70

{{
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1509557139747_0023_2_07, diagnostics=[Task failed, taskId=task_1509557139747_0023_2_07_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:271)
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$8.writeValue(VectorExpressionWriterFactory.java:1061)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:1005)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:1082)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$800(VectorGroupByOperator.java:67)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.close(VectorGroupByOperator.java:856)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1123)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383)
    ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at
[jira] [Updated] (HIVE-17968) TPCDS query70 generates NPE in vectorization, works fine with vectorization disabled.
[ https://issues.apache.org/jira/browse/HIVE-17968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla updated HIVE-17968:
---
    Description:
Running into the following NPE while running query 70

{{monospaced}}
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1509557139747_0023_2_07, diagnostics=[Task failed, taskId=task_1509557139747_0023_2_07_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:271)
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$8.writeValue(VectorExpressionWriterFactory.java:1061)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:1005)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:1082)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$800(VectorGroupByOperator.java:67)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.close(VectorGroupByOperator.java:856)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1123)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383)
    ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
[jira] [Updated] (HIVE-17968) TPCDS query70 generates NPE in vectorization, works fine with vectorization disabled.
[ https://issues.apache.org/jira/browse/HIVE-17968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nita Dembla updated HIVE-17968:
---
    Description:
Running into the following NPE while running query 70

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1509557139747_0023_2_07, diagnostics=[Task failed, taskId=task_1509557139747_0023_2_07_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:407)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:271)
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$8.writeValue(VectorExpressionWriterFactory.java:1061)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:1005)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.writeSingleRow(VectorGroupByOperator.java:1082)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.access$800(VectorGroupByOperator.java:67)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeReduceMergePartial.close(VectorGroupByOperator.java:856)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:1123)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:709)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:383)
    ... 16 more
], TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : attempt_1509557139747_0023_2_07_00_1:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
    at
[jira] (HIVE-15723) Hive should report a warning about missing table/column statistics to user.
[ https://issues.apache.org/jira/browse/HIVE-15723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847570#comment-15847570 ]

Nita Dembla commented on HIVE-15723:

hive.cbo.show.warnings should be set to 'true' by default, since the user may not really know if he is missing statistics and may not even look for this setting.
I couldn't reopen the bug.

> Hive should report a warning about missing table/column statistics to user.
> ---
>
>             Key: HIVE-15723
>             URL: https://issues.apache.org/jira/browse/HIVE-15723
>         Project: Hive
>      Issue Type: Bug
>      Components: Query Processor
>        Reporter: Remus Rusanu
>        Assignee: Remus Rusanu
>        Priority: Minor
>         Fix For: 2.2.0
>
>     Attachments: HIVE-15723.01.patch, HIVE-15723.02.patch, HIVE-15723.03.patch, HIVE-15723.04.patch
>
>
> Many Hive performance issues are due to missing statistics. Either all statistics, or just the table or column statistics, are missing. Potentially a new partition has been added and the customer forgot to gather stats for that partition.
> A simple warning about a table or column missing statistics can be very helpful and makes Hive more user friendly. Hive already has this information; it's a matter of printing it out.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
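The setting named in the comment can be turned on per session, and the missing statistics it warns about can be gathered with `ANALYZE TABLE`. The sketch below assumes a hypothetical partitioned table `sales` with partition column `sale_date`. Only the property name `hive.cbo.show.warnings` comes from the comment.

{code:sql}
-- Opt in to the warning for this session (the comment argues this should
-- be the default).
SET hive.cbo.show.warnings=true;

-- Gather basic table/partition statistics for every partition of the
-- hypothetical table.
ANALYZE TABLE sales PARTITION (sale_date) COMPUTE STATISTICS;

-- Gather column statistics, which the optimizer uses for estimates.
ANALYZE TABLE sales PARTITION (sale_date) COMPUTE STATISTICS FOR COLUMNS;
{code}

Re-running `ANALYZE TABLE` after new partitions are added covers the case described in the issue, where stats were never gathered for a newly added partition.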
[jira] [Updated] (HIVE-14773) NPE aggregating column statistics for date column in partitioned table
[ https://issues.apache.org/jira/browse/HIVE-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-14773: --- Description: Hive runs into a NPE when the query has a filter on a date column and the partitioned column eg: select count(*) from date_dim where d_date > date "1900-01-02" and d_date_sk= 2416945; Here d_date_sk is a partition column and d_date is of type date. 2016-09-16T08:27:06,510 DEBUG [90d4780f-77e4-4704-9907-4860ce11a206 main] metastore.AggregateStatsCache: No aggregate stats cached for database:default, table:date_dim, column:d_date 2016-09-16T08:27:06,512 DEBUG [90d4780f-77e4-4704-9907-4860ce11a206 main] metastore.MetaStoreDirectSql: Direct SQL query in 1.302231ms + 0.00653ms, the query is [select "COLUMN_NAME", "COLUMN_TYPE", min("LONG_LOW_VALUE"), max("LONG_HIGH_VALUE"), min("DOUBLE_LOW_VALUE"), max("DOUBLE_HIGH_VALUE"), min(cast("BIG_DECIMAL_LOW_VALUE" as decimal)), max(cast("BIG_DECIMAL_HIGH_VALUE" as decimal)), sum("NUM_NULLS"), max("NUM_DISTINCTS"), max("AVG_COL_LEN"), max("MAX_COL_LEN"), sum("NUM_TRUES"), sum("NUM_FALSES"), avg(("LONG_HIGH_VALUE"-"LONG_LOW_VALUE")/cast("NUM_DISTINCTS" as decimal)),avg(("DOUBLE_HIGH_VALUE"-"DOUBLE_LOW_VALUE")/"NUM_DISTINCTS"),avg((cast("BIG_DECIMAL_HIGH_VALUE" as decimal)-cast("BIG_DECIMAL_LOW_VALUE" as decimal))/"NUM_DISTINCTS"),sum("NUM_DISTINCTS") from "PART_COL_STATS" where "DB_NAME" = ? and "TABLE_NAME" = ? and "COLUMN_NAME" in (?) and "PARTITION_NAME" in (?) 
group by "COLUMN_NAME", "COLUMN_TYPE"] 2016-09-16T08:27:06,526 INFO [90d4780f-77e4-4704-9907-4860ce11a206 main] metastore.MetaStoreDirectSql: useDensityFunctionForNDVEstimation = false partsFound = 1 ColumnStatisticsObj = [ColumnStatisticsObj(colName:d_date, colType:date, statsData:)] 2016-09-16T08:27:06,526 DEBUG [90d4780f-77e4-4704-9907-4860ce11a206 main] metastore.ObjectStore: Commit transaction: count = 0, isactive true at: org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.commit(ObjectStore.java:2827) 2016-09-16T08:27:06,531 DEBUG [90d4780f-77e4-4704-9907-4860ce11a206 main] metastore.ObjectStore: null retrieved using SQL in 43.425925ms 2016-09-16T08:27:06,545 ERROR [90d4780f-77e4-4704-9907-4860ce11a206 main] ql.Driver: FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.metastore.api.ColumnStatisticsData.getFieldDesc(ColumnStatisticsData.java:451) at org.apache.hadoop.hive.metastore.api.ColumnStatisticsData.getDateStats(ColumnStatisticsData.java:574) at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:759) at org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:806) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:304) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:152) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:140) at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143) at 
org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122) at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78) at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:260) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:129) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10928) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:255) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:251) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:467) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:342) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1235) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1355) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1143) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) at
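One plausible reading of the trace above: the log shows `statsData:` with no member set, and the NPE is raised in `getFieldDesc` while `getDateStats` is being called. The sketch below reproduces that failure shape in isolation. `StatsUnionSketch`, its `Field` enum and `dateStatsOrNull` are hypothetical stand-ins written for illustration, not Hive's `ColumnStatisticsData`, and the guard shown is an assumption rather than the fix in the attached patch.

```java
// Hypothetical stand-in for a union-style stats holder; NOT Hive's real class.
final class StatsUnionSketch {
    enum Field { LONG_STATS, DATE_STATS }

    private Field setField;   // null means no member of the union is set
    private Object value;

    void setDateStats(Object v) {
        setField = Field.DATE_STATS;
        value = v;
    }

    boolean isSetDateStats() {
        return setField == Field.DATE_STATS;
    }

    // A switch on a null enum reference throws NullPointerException.
    static String fieldName(Field f) {
        switch (f) {
            case LONG_STATS: return "longStats";
            case DATE_STATS: return "dateStats";
            default: throw new IllegalArgumentException("Unknown field " + f);
        }
    }

    // Getter that tries to build an error message naming the currently set
    // field; with nothing set, fieldName(null) throws before the message exists.
    Object getDateStats() {
        if (setField == Field.DATE_STATS) {
            return value;
        }
        throw new RuntimeException(
            "Cannot get 'dateStats': union is set to " + fieldName(setField));
    }

    // Defensive caller (assumed guard): check which member is set first.
    static Object dateStatsOrNull(StatsUnionSketch u) {
        return (u != null && u.isSetDateStats()) ? u.getDateStats() : null;
    }

    public static void main(String[] args) {
        StatsUnionSketch empty = new StatsUnionSketch();
        try {
            empty.getDateStats();
        } catch (NullPointerException e) {
            System.out.println("unguarded getter threw NullPointerException");
        }
        System.out.println("guarded result: " + dateStatsOrNull(empty));
    }
}
```

With a guard of this kind, a caller can treat an unset union as "no statistics available" instead of failing query compilation.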
[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions
[ https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288149#comment-15288149 ] Nita Dembla commented on HIVE-12643: I've tested a slightly modified version of the patch. The original changes to the following files were rejected - ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java - ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java and the following file needed modifications - ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java > For self describing InputFormat don't replicate schema information in > partitions > > > Key: HIVE-12643 > URL: https://issues.apache.org/jira/browse/HIVE-12643 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, > HIVE-12643.3.patch, HIVE-12643.3.patch, HIVE-12643.patch > > > Since self describing Input Formats don't use individual partition schemas > for schema resolution, there is no need to send that info to tasks. > Doing this should cut down plan size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13320) Apply HIVE-11544 to explicit conversions as well as implicit ones
[ https://issues.apache.org/jira/browse/HIVE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-13320: --- Attachment: HIVE-13320.2.patch > Apply HIVE-11544 to explicit conversions as well as implicit ones > - > > Key: HIVE-13320 > URL: https://issues.apache.org/jira/browse/HIVE-13320 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Nita Dembla > Attachments: HIVE-13320.1.patch, HIVE-13320.2.patch, > HIVE-13320.2.patch > > > Parsing 1 million blank values through cast(x as int) is 3x slower than > parsing a valid single digit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13320) Apply HIVE-11544 to explicit conversions as well as implicit ones
[ https://issues.apache.org/jira/browse/HIVE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-13320: --- Attachment: HIVE-13320.2.patch > Apply HIVE-11544 to explicit conversions as well as implicit ones > - > > Key: HIVE-13320 > URL: https://issues.apache.org/jira/browse/HIVE-13320 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Nita Dembla > Attachments: HIVE-13320.1.patch, HIVE-13320.2.patch > > > Parsing 1 million blank values through cast(x as int) is 3x slower than > parsing a valid single digit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13320) Apply HIVE-11544 to explicit conversions as well as implicit ones
[ https://issues.apache.org/jira/browse/HIVE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-13320: --- Status: Patch Available (was: Open) > Apply HIVE-11544 to explicit conversions as well as implicit ones > - > > Key: HIVE-13320 > URL: https://issues.apache.org/jira/browse/HIVE-13320 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0, 1.2.1, 1.3.0, 2.1.0 >Reporter: Gopal V >Assignee: Nita Dembla > Attachments: HIVE-13320.1.patch, HIVE-13320.2.patch > > > Parsing 1 million blank values through cast(x as int) is 3x slower than > parsing a valid single digit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13320) Apply HIVE-11544 to explicit conversions as well as implicit ones
[ https://issues.apache.org/jira/browse/HIVE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-13320: --- Status: Open (was: Patch Available) > Apply HIVE-11544 to explicit conversions as well as implicit ones > - > > Key: HIVE-13320 > URL: https://issues.apache.org/jira/browse/HIVE-13320 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.0.0, 1.2.1, 1.3.0, 2.1.0 >Reporter: Gopal V >Assignee: Nita Dembla > Attachments: HIVE-13320.1.patch > > > Parsing 1 million blank values through cast(x as int) is 3x slower than > parsing a valid single digit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
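The HIVE-13320 description does not say why blank input is slower. One common cause, assumed here, is that an exception-driven parser pays for constructing a NumberFormatException on every blank value. The sketch below contrasts that with a cheap pre-check. `BlankCastSketch` and its methods are hypothetical names for illustration, the trimming and sign handling are assumptions, and none of this is Hive's actual cast code or the attached patch.

```java
// Illustrative only: hypothetical helper, not Hive's cast(x as int) code.
final class BlankCastSketch {

    // Exception-driven parse: every blank value costs a NumberFormatException.
    static Integer parseWithException(String s) {
        try {
            return Integer.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Cheap structural check: optional sign followed by at least one digit.
    static boolean looksLikeInt(String s) {
        int n = s.length();
        if (n == 0) {
            return false;
        }
        int i = (s.charAt(0) == '-' || s.charAt(0) == '+') ? 1 : 0;
        if (i == n) {
            return false;
        }
        for (; i < n; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Pre-checked parse: blank and non-numeric input returns null without
    // raising an exception; only out-of-range digits still hit the catch.
    static Integer parseWithPrecheck(String s) {
        String t = s.trim();
        if (!looksLikeInt(t)) {
            return null;
        }
        try {
            return Integer.valueOf(t);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"", "   ", "7", "-12", "abc"};
        for (String in : inputs) {
            System.out.println("[" + in + "] -> " + parseWithPrecheck(in));
        }
    }
}
```

Both methods return the same results for the inputs shown; the difference is only whether the blank case goes through exception handling.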