[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
     [ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-14290:
---------------------------------
    Attachment: HIVE-14290.1.patch

Re-uploading patch to trigger Hive QA.

> Refactor HIVE-14054 to use Collections#newSetFromMap
> ----------------------------------------------------
>
>                 Key: HIVE-14290
>                 URL: https://issues.apache.org/jira/browse/HIVE-14290
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 2.1.0
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Trivial
>         Attachments: HIVE-14290.1.patch, HIVE-14290.1.patch,
> HIVE-14290.1.patch
>
> There is a minor refactor that can be made to HiveMetaStoreChecker so that it
> cleanly creates and uses a set that is backed by a Map implementation. In
> this case, the underlying Map implementation is ConcurrentHashMap. This
> refactor will help prevent issues such as the one reported in HIVE-14054.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
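The refactor described above can be sketched as follows. This is an illustrative standalone snippet, not the actual HiveMetaStoreChecker code; the set name and element type are hypothetical:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentSetExample {
    public static void main(String[] args) {
        // Collections.newSetFromMap yields a Set view with the concurrency
        // properties of the backing map; here, a thread-safe set backed by
        // a ConcurrentHashMap, as the issue describes.
        Set<String> partitions =
                Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
        partitions.add("ds=2016-07-01");
        partitions.add("ds=2016-07-01"); // duplicate add is a no-op
        System.out.println(partitions.size()); // 1
    }
}
```

The advantage over hand-rolling a map-backed set is that the returned view already implements the full Set contract (equals, hashCode, iteration) on top of the map's keys.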
[jira] [Commented] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
    [ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387122#comment-15387122 ]

Peter Slawski commented on HIVE-14290:
--------------------------------------

Thank you [~prasanth_j] for the review. It looks like an unrelated error caused the build to fail. I have attached the same patch again to this JIRA to hopefully trigger the QA build.
{code}
Could not transfer artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to datanucleus
{code}
[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
     [ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-14290:
---------------------------------
    Attachment: HIVE-14290.1.patch
[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
     [ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-14290:
---------------------------------
    Target Version/s: 2.2.0
       Fix Version/s:     (was: 2.2.0)
[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
     [ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-14290:
---------------------------------
    Fix Version/s: 2.2.0
           Status: Patch Available  (was: Open)

I've attached a patch that makes this minor refactor.
[jira] [Updated] (HIVE-13699) Make JavaDataModel#get thread safe for parallel compilation
     [ https://issues.apache.org/jira/browse/HIVE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-13699:
---------------------------------
    Attachment: HIVE-13699.2.patch

Attached an updated patch with a fix to use the SLF4J logger.

> Make JavaDataModel#get thread safe for parallel compilation
> -----------------------------------------------------------
>
>                 Key: HIVE-13699
>                 URL: https://issues.apache.org/jira/browse/HIVE-13699
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, storage-api
>    Affects Versions: 2.0.0
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Minor
>         Attachments: HIVE-13699.1.patch, HIVE-13699.2.patch
>
> The class JavaDataModel has a static method, #get, that is not thread safe.
> This may be an issue when parallel query compilation is enabled, because two
> threads may attempt to call JavaDataModel#get at the same time.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13699) Make JavaDataModel#get thread safe for parallel compilation
    [ https://issues.apache.org/jira/browse/HIVE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15289449#comment-15289449 ]

Peter Slawski commented on HIVE-13699:
--------------------------------------

Yeah, I should be using SLF4J. I will correct that and post an updated patch. This is a preemptive fix, found by doing static analysis on the code path for Driver#compile.
[jira] [Commented] (HIVE-13699) Make JavaDataModel#get thread safe for parallel compilation
    [ https://issues.apache.org/jira/browse/HIVE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278857#comment-15278857 ]

Peter Slawski commented on HIVE-13699:
--------------------------------------

The tests that failed look to be unrelated. All of them had also failed in the Jenkins build before this one; see [http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/205/console]. To double-check, I reran the failed unit tests to verify this patch did not introduce failures; some of them now pass on freshly pulled master with my patch applied. Note that the change itself is isolated to the implementation of JavaDataModel#get, making it thread safe; it should not change behavior, as the added unit tests show.
[jira] [Updated] (HIVE-13699) Make JavaDataModel#get thread safe for parallel compilation
     [ https://issues.apache.org/jira/browse/HIVE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-13699:
---------------------------------
    Status: Patch Available  (was: Open)

I have attached a patch which fixes this by initializing the system's JavaDataModel via the lazy holder pattern to ensure thread safety.
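The lazy holder pattern mentioned above can be sketched as follows. This is a minimal illustration, not the actual JavaDataModel code; the class name, the `Model` enum, and the detection logic are hypothetical stand-ins:

```java
public final class JavaDataModelHolderExample {
    // Hypothetical stand-in for the models JavaDataModel distinguishes.
    enum Model { JAVA32, JAVA64 }

    // The nested holder class is not initialized until getModel() first
    // touches it; the JVM guarantees class initialization happens exactly
    // once and is thread safe, so no explicit locking is needed.
    private static class LazyHolder {
        static final Model INSTANCE = detect();
    }

    private static Model detect() {
        // Hypothetical detection: read the JVM's pointer width property.
        String arch = System.getProperty("sun.arch.data.model", "64");
        return "32".equals(arch) ? Model.JAVA32 : Model.JAVA64;
    }

    public static Model getModel() {
        return LazyHolder.INSTANCE;
    }
}
```

Compared with a synchronized getter, the holder pattern pays no locking cost after initialization, which matters on a hot path like query compilation.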
[jira] [Updated] (HIVE-13699) Make JavaDataModel#get thread safe for parallel compilation
     [ https://issues.apache.org/jira/browse/HIVE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-13699:
---------------------------------
    Attachment: HIVE-13699.1.patch
[jira] [Commented] (HIVE-13512) Make initializing dag ids in TezWork thread safe for parallel compilation
    [ https://issues.apache.org/jira/browse/HIVE-13512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264476#comment-15264476 ]

Peter Slawski commented on HIVE-13512:
--------------------------------------

[~gopalv], could you please confirm my statement above regarding the test failures? I would like to know the next steps I need to take to get this patch in. Thank you!

> Make initializing dag ids in TezWork thread safe for parallel compilation
> -------------------------------------------------------------------------
>
>                 Key: HIVE-13512
>                 URL: https://issues.apache.org/jira/browse/HIVE-13512
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Query Planning
>    Affects Versions: 2.0.0
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Minor
>         Attachments: HIVE-13512.1.patch, HIVE-13512.1.patch
>
> When parallel query compilation is enabled, it is possible for concurrently
> running threads to create TezWork objects that have the same dag id. This is
> because the counter used to obtain the next dag id is not thread safe. The
> counter should be an AtomicInteger rather than an int.
> {code:java}
> private static int counter;
> ...
> public TezWork(String queryId, Configuration conf) {
>   this.dagId = queryId + ":" + (++counter);
>   ...
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
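The fix the issue proposes can be sketched as follows. This is an illustrative snippet under the stated assumption (replace the static `int` with an `AtomicInteger`), not the actual TezWork class; the class name and constructor shape here are simplified:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TezWorkCounterSketch {
    // ++counter on a plain static int is a read-modify-write that two
    // compiling threads can interleave, producing duplicate dag ids.
    // incrementAndGet() performs the same bump atomically.
    private static final AtomicInteger counter = new AtomicInteger();

    private final String dagId;

    public TezWorkCounterSketch(String queryId) {
        this.dagId = queryId + ":" + counter.incrementAndGet();
    }

    public String getDagId() {
        return dagId;
    }
}
```

With this change, any two objects constructed concurrently are guaranteed distinct counter values, so dag ids stay unique per queryId.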
[jira] [Commented] (HIVE-13512) Make initializing dag ids in TezWork thread safe for parallel compilation
    [ https://issues.apache.org/jira/browse/HIVE-13512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259608#comment-15259608 ]

Peter Slawski commented on HIVE-13512:
--------------------------------------

The test failures appear not to be related to this patch. Between the two test runs, the patch has not changed. The common failure between the two runs is TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3; however, TestMiniSparkOnYarnCliDriver should not be creating TezWork objects, which are the only thing this patch touches.
[jira] [Updated] (HIVE-13512) Make initializing dag ids in TezWork thread safe for parallel compilation
     [ https://issues.apache.org/jira/browse/HIVE-13512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-13512:
---------------------------------
    Attachment: HIVE-13512.1.patch

Attaching the same patch to trigger tests. The logs for the previous test run were lost because the Jenkins server was down.
[jira] [Updated] (HIVE-13512) Make initializing dag ids in TezWork thread safe for parallel compilation
     [ https://issues.apache.org/jira/browse/HIVE-13512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-13512:
---------------------------------
    Status: Patch Available  (was: Open)

I have attached a patch which replaces the int counter with an AtomicInteger.
[jira] [Commented] (HIVE-9147) Add unit test for HIVE-7323
    [ https://issues.apache.org/jira/browse/HIVE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117741#comment-15117741 ]

Peter Slawski commented on HIVE-9147:
-------------------------------------

Thank you [~ashutoshc] for reviewing this change. The test failures are not relevant, as this change only adds a new unit test. Are there any further actions for me to take? This issue has been open for a while, and I am hoping it can be resolved.

> Add unit test for HIVE-7323
> ---------------------------
>
>                 Key: HIVE-9147
>                 URL: https://issues.apache.org/jira/browse/HIVE-9147
>             Project: Hive
>          Issue Type: Test
>          Components: Statistics
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Minor
>         Attachments: HIVE-9147.1.patch, HIVE-9147.2.patch
>
> This unit test verifies that DateStatisticImpl doesn't store mutable objects
> from callers for minimum and maximum values. This ensures callers cannot
> modify the internal minimum and maximum values outside of DateStatisticImpl.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
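The defensive-copy property the issue's unit test checks can be sketched as follows. `DateStatSketch` is a hypothetical stand-in for DateStatisticImpl, not the real class; it shows the behavior being verified, namely that mutating the caller's Date after the fact does not corrupt the stored minimum:

```java
import java.util.Date;

public class DateStatSketch {
    private Date minimum;

    public void updateMinimum(Date d) {
        // Copy the mutable Date instead of retaining the caller's reference.
        this.minimum = new Date(d.getTime());
    }

    public Date getMinimum() {
        // Return a copy as well, so callers cannot mutate internal state.
        return new Date(minimum.getTime());
    }

    public static void main(String[] args) {
        DateStatSketch stats = new DateStatSketch();
        Date d = new Date(1000L);
        stats.updateMinimum(d);
        d.setTime(9999L); // caller mutates its own object
        // The stored minimum is unaffected by the caller's mutation.
        System.out.println(stats.getMinimum().getTime()); // 1000
    }
}
```

A test in this style fails against an implementation that stores the reference directly, which is exactly the regression HIVE-7323 guarded against.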
[jira] [Updated] (HIVE-9147) Add unit test for HIVE-7323
     [ https://issues.apache.org/jira/browse/HIVE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-9147:
--------------------------------
    Attachment: HIVE-9147.2.patch

I rebased this patch on top of the latest master branch. Please see the attachments.
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
    [ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533450#comment-14533450 ]

Peter Slawski commented on HIVE-10538:
--------------------------------------

[~prasanth_j], thanks for resolving that last failed test and getting this fix in. I was initially using Oracle's JDK 1.7.0_60, but didn't have luck with 1.7.0_45 either. I also tried the OpenJDK version I already had installed, 1.7.0_79.

> Fix NPE in FileSinkOperator from hashcode mismatch
> --------------------------------------------------
>
>                 Key: HIVE-10538
>                 URL: https://issues.apache.org/jira/browse/HIVE-10538
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.0.0, 1.2.0
>            Reporter: Peter Slawski
>            Assignee: Peter Slawski
>            Priority: Critical
>             Fix For: 1.2.0, 1.3.0
>         Attachments: HIVE-10538.1.patch, HIVE-10538.1.patch,
> HIVE-10538.1.patch, HIVE-10538.2.patch, HIVE-10538.3.patch
>
> A NullPointerException occurs in FileSinkOperator when using bucketed tables
> and distribute by with multiFileSpray enabled. The following query snippet
> reproduces this issue:
> {code}
> set hive.enforce.bucketing = true;
> set hive.exec.reducers.max = 20;
>
> create table bucket_a(key int, value_a string) clustered by (key) into 256 buckets;
> create table bucket_b(key int, value_b string) clustered by (key) into 256 buckets;
> create table bucket_ab(key int, value_a string, value_b string) clustered by (key) into 256 buckets;
>
> -- Insert data into bucket_a and bucket_b
>
> insert overwrite table bucket_ab
> select a.key, a.value_a, b.value_b
> from bucket_a a join bucket_b b on (a.key = b.key)
> distribute by key;
> {code}
> The following stack trace is logged:
> {code}
> 2015-04-29 12:54:12,841 FATAL [pool-110-thread-1]: ExecReducer (ExecReducer.java:reduce(255)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{},value:{_col0:113,_col1:val_113}}
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
> 	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.findWriterOffset(FileSinkOperator.java:819)
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:747)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> 	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> 	... 8 more
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
     [ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Slawski updated HIVE-10538:
---------------------------------
    Attachment: HIVE-10538.2.patch

I've attached the second revision of the patch, which updates the failed Spark qtests.
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
    [ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529707#comment-14529707 ]

Peter Slawski commented on HIVE-10538:
--------------------------------------

Great, I've been working on just that. I'll be able to post an updated patch tomorrow.
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
    [ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529673#comment-14529673 ]

Peter Slawski commented on HIVE-10538:
--------------------------------------

The Spark driver failures are caused by this change. This would be expected if a row's hashcode affected its ordering in Spark. This patch makes it so that the HiveKey hashcode emitted by ReduceSinkOperator is no longer always multiplied by 31 (as explained previously). Also, for at least those failed qtests, the expected row ordering/output differs across MapRed, Tez, and Spark, so the execution engine affects ordering.

From [spark/groupby_complex_types_multi_single_reducer.q.out#L221|https://github.com/apache/hive/blob/master/ql/src/test/results/clientpositive/spark/groupby_complex_types_multi_single_reducer.q.out#L221]:
{code}
POSTHOOK: query: SELECT DEST2.* FROM DEST2
POSTHOOK: type: QUERY
POSTHOOK: Input: default@dest2
#### A masked pattern was here ####
{120:val_120}	2
{129:val_129}	2
{160:val_160}	1
{26:val_26}	2
{27:val_27}	1
{288:val_288}	2
{298:val_298}	3
{30:val_30}	1
{311:val_311}	3
{74:val_74}	1
{code}

From [groupby_complex_types_multi_single_reducer.q.out#L240|https://github.com/apache/hive/blob/master/ql/src/test/results/clientpositive/groupby_complex_types_multi_single_reducer.q.out#L240]:
{code}
POSTHOOK: query: SELECT DEST2.* FROM DEST2
POSTHOOK: type: QUERY
POSTHOOK: Input: default@dest2
#### A masked pattern was here ####
{0:val_0}	3
{10:val_10}	1
{100:val_100}	2
{103:val_103}	2
{104:val_104}	2
{105:val_105}	1
{11:val_11}	1
{111:val_111}	1
{113:val_113}	2
{114:val_114}	1
{code}
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
    [ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529134#comment-14529134 ]

Peter Slawski commented on HIVE-10538:
--------------------------------------

Will do.
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
[ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14523765#comment-14523765 ] Peter Slawski commented on HIVE-10538: -- Yes. Here is an explanation of how this transient is used. The transient is used to compute the row's hash in [ReduceSinkOperator.java#L368|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L368]:
{code}
hashCode = computeHashCode(row, bucketNumber);
{code}
If the given bucket number is valid (which it always is, since the transient is initialized to a valid number), the computed hashcode is always multiplied by 31, see [ReduceSinkOperator.java#L488|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L488]:
{code}
...
private int computeHashCode(Object row, int buckNum) throws HiveException {
  ...
  } else {
    for (int i = 0; i < partitionEval.length; i++) {
      Object o = partitionEval[i].evaluate(row);
      keyHashCode = keyHashCode * 31 + ObjectInspectorUtils.hashCode(o, partitionObjectInspectors[i]);
    }
  }
  int hashCode = buckNum < 0 ? keyHashCode : keyHashCode * 31 + buckNum;
  ...
  return hashCode;
}
{code}
FileSinkOperator recomputes the hashcode in findWriterOffset(), but does not apply the final keyHashCode * 31 + buckNum step. This causes a different bucket number to be computed than expected, and bucketMap only contains mappings for the bucket numbers that the current reducer is expected to receive. From [FileSinkOperator.java#L811|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L811]:
{code}
private int findWriterOffset(Object row) throws HiveException {
  ...
  for (int i = 0; i < partitionEval.length; i++) {
    Object o = partitionEval[i].evaluate(row);
    keyHashCode = keyHashCode * 31 + ObjectInspectorUtils.hashCode(o, partitionObjectInspectors[i]);
  }
  key.setHashCode(keyHashCode);
  int bucketNum = prtner.getBucket(key, null, totalFiles);
  return bucketMap.get(bucketNum);
}
{code}
The transient was introduced in [HIVE-8151], which refactored the bucket number from a local variable into a transient field. Initially, the local variable was initialized to -1; the refactor changed the code to use the transient field instead.
Fix NPE in FileSinkOperator from hashcode mismatch -- Key: HIVE-10538 URL: https://issues.apache.org/jira/browse/HIVE-10538 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0, 1.2.0 Reporter: Peter Slawski Fix For: 1.3.0 Attachments: HIVE-10538.1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
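The hashcode mismatch described in the comment above can be sketched in isolation. This is a simplified illustration, not the actual Hive code: the two method names merely mirror the two code paths, and the modulo partitioner plus the sample constants are assumptions for the sake of the demo.

```java
public class HashMismatchSketch {
    // Mirrors ReduceSinkOperator#computeHashCode: a valid (non-negative)
    // bucket number is folded in as keyHashCode * 31 + buckNum.
    static int reduceSinkHash(int keyHashCode, int buckNum) {
        return buckNum < 0 ? keyHashCode : keyHashCode * 31 + buckNum;
    }

    // Mirrors the pre-fix FileSinkOperator#findWriterOffset recomputation,
    // which stopped at the raw key hash and skipped the final fold-in step.
    static int fileSinkHash(int keyHashCode) {
        return keyHashCode;
    }

    public static void main(String[] args) {
        int keyHashCode = 113; // hash over a row's partition columns (sample value)
        int totalFiles = 20;   // sample writer count

        int sent = reduceSinkHash(keyHashCode, 5);  // 113 * 31 + 5 = 3508
        int recomputed = fileSinkHash(keyHashCode); // 113

        // The two hashcodes disagree, so a modulo-style partitioner picks
        // different bucket numbers, and the recomputed bucket may be one
        // that bucketMap has no entry for.
        System.out.println(sent % totalFiles);       // 8
        System.out.println(recomputed % totalFiles); // 13
    }
}
```

With a valid bucket number always present, every row's hash leaving ReduceSinkOperator includes the extra fold-in, so the pre-fix recomputation in FileSinkOperator could never reproduce it.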
[jira] [Updated] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
[ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Slawski updated HIVE-10538: - Attachment: HIVE-10538.1.patch
Fix NPE in FileSinkOperator from hashcode mismatch -- Key: HIVE-10538 URL: https://issues.apache.org/jira/browse/HIVE-10538 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0, 1.2.0 Reporter: Peter Slawski Attachments: HIVE-10538.1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10538) Fix NPE in FileSinkOperator from hashcode mismatch
[ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14520122#comment-14520122 ] Peter Slawski commented on HIVE-10538: -- Currently working on testing the fix for this issue.
Fix NPE in FileSinkOperator from hashcode mismatch -- Key: HIVE-10538 URL: https://issues.apache.org/jira/browse/HIVE-10538 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0, 1.2.0 Reporter: Peter Slawski
-- This message was sent by Atlassian JIRA (v6.3.4#6332)