[jira] [Commented] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226291#comment-16226291 ] Hive QA commented on HIVE-17926:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894871/HIVE-17926.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11356 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=155)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7565/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7565/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7565/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12894871 - PreCommit-HIVE-Build

> Support triggers for non-pool sessions
> --------------------------------------
>
> Key: HIVE-17926
> URL: https://issues.apache.org/jira/browse/HIVE-17926
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 3.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, HIVE-17926.2.patch
>
> The current trigger implementation works only with Tez session pools. When session pools are not in use, a new session is created for every query, and trigger validation does not happen. It would be good to support this one-off session case as well.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17766) Support non-equi LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226240#comment-16226240 ] Hive QA commented on HIVE-17766:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894844/HIVE-17766.05.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11344 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query83] (batchId=245)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7564/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7564/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7564/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12894844 - PreCommit-HIVE-Build

> Support non-equi LEFT SEMI JOIN
> -------------------------------
>
> Key: HIVE-17766
> URL: https://issues.apache.org/jira/browse/HIVE-17766
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-17766.01.patch, HIVE-17766.02.patch, HIVE-17766.03.patch, HIVE-17766.04.patch, HIVE-17766.05.patch, HIVE-17766.patch
>
> Currently we get an error like:
> {noformat}
> Non equality condition not supported in Semi-Join
> {noformat}
> This support is required to generate a better plan for EXISTS/IN correlated subqueries, since such queries are transformed into a LEFT SEMI JOIN.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load
[ https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226214#comment-16226214 ] Daniel Dai commented on HIVE-17595:

A couple of comments:
1. Can you note what problem you saw when the database last.repl.id was updated earlier? Did some bootstrap tasks get skipped?
2. On the name EfficientDAGTraversal: do we have a regular DAGTraversal? If not, plain DAGTraversal is probably better. DependencyCollectionFunction might be better named AddDependencyToLeaves.
3. What about createEndReplLogTask? Shall we run it after all tasks as well?

> Correct DAG for updating the last.repl.id for a database during bootstrap load
> ------------------------------------------------------------------------------
>
> Key: HIVE-17595
> URL: https://issues.apache.org/jira/browse/HIVE-17595
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Fix For: 3.0.0
> Attachments: HIVE-17595.0.patch, HIVE-17595.1.patch, HIVE-17595.2.patch
>
> We update last.repl.id as a database property. This is done after all the bootstrap tasks that load the relevant data have finished, and it should be the last task to run. However, we are currently not setting up the DAG correctly for this task: it is being added as the root task, whereas it should be the last task in the DAG. This becomes more important after the inclusion of HIVE-17426, since that leads to parallel execution, and an incorrect DAG will lead to incorrect results/state of the system.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
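The fix under discussion hangs the last.repl.id update task beneath every leaf of the bootstrap DAG instead of making it a root. A minimal, Hive-independent sketch of that traversal (the Task class below is an illustrative stand-in, not Hive's actual Task API):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AddDependencyToLeaves {
    // Hypothetical stand-in for Hive's task type.
    static class Task {
        final String name;
        final List<Task> children = new ArrayList<>();
        Task(String name) { this.name = name; }
        void addDependentTask(Task t) { children.add(t); }
    }

    /**
     * Attach {@code last} as a child of every leaf reachable from the roots,
     * so it runs only after all bootstrap tasks have completed.
     */
    static void addToLeaves(List<Task> roots, Task last) {
        Deque<Task> stack = new ArrayDeque<>(roots);
        Set<Task> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            Task t = stack.pop();
            if (!seen.add(t)) {
                continue; // already visited via another parent
            }
            if (t.children.isEmpty()) {
                t.addDependentTask(last); // leaf: final task depends on it
            } else {
                stack.addAll(t.children);
            }
        }
    }
}
```

With this shape the final task is never a root, so parallel execution (post-HIVE-17426) cannot run it before the load tasks it summarizes.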
[jira] [Commented] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226185#comment-16226185 ] Hive QA commented on HIVE-17841:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894847/HIVE-17841.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 11346 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedFiles (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers1 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerTotalTasks (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7563/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7563/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7563/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12894847 - PreCommit-HIVE-Build

> implement applying the resource plan
> ------------------------------------
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, HIVE-17841.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226095#comment-16226095 ] Hive QA commented on HIVE-17458:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894858/HIVE-17458.12.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 11324 tests executed

*Failed tests:*
{noformat}
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=173) [infer_bucket_sort_reducers_power_two.q,list_bucket_dml_10.q,orc_merge9.q,leftsemijoin_mr.q,bucket6.q,bucketmapjoin7.q,uber_reduce.q,empty_dir_in_table.q,index_bitmap_auto.q,vector_outer_join2.q,spark_explain_groupbyshuffle.q,spark_dynamic_partition_pruning.q,spark_combine_equivalent_work.q,orc_merge1.q,spark_use_op_stats.q,orc_merge_diff_fs.q,quotedid_smb.q,truncate_column_buckets.q,spark_vectorized_dynamic_partition_pruning.q,orc_merge3.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization_original] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=78)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[bucket5] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge10] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge1] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[reduce_deduplicate] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[tez_union_dynamic_partition_2] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket4] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucket5] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[disable_merge_for_bucketing] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge4] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge5] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge6] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[reduce_deduplicate] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[acid_vectorization_original_tez] (batchId=102)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=102)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_dyn_part] (batchId=89)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[infer_bucket_sort_map_operators] (batchId=89)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.metastore.security.TestHadoopAuthBridge23.testSaslWithHiveMetaStore (batchId=236)
org.apache.hadoop.hive.ql.io.orc.TestVectorizedOrcAcidRowBatchReader.testVectorizedOrcAcidRowBatchReader (batchId=266)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7562/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7562/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7562/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12894858 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
>
[jira] [Updated] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
[ https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-17936:

Attachment: HIVE-17936.2.patch

Applied the review comments.

> Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
> --------------------------------------------------------------------------------
>
> Key: HIVE-17936
> URL: https://issues.apache.org/jira/browse/HIVE-17936
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-17936.1.patch, HIVE-17936.2.patch
>
> In the markSemiJoinForDPP method (HIVE-17399), the nDV comparison should not allow equality: when the values are the same on both sides, the branch is still marked as good even though it should not be.
> Also add a configurable factor to judge how useful the branch is when the nDVs on the smaller side are only slightly less than those on the TS side.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
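The check being tightened can be sketched as follows. The method and parameter names are illustrative, not Hive's actual ones; the point is that the semijoin side's distinct-value count must be strictly below the scan side's, scaled by a configurable factor:

```java
public class SemiJoinBenefitCheck {
    /**
     * Decide whether a semijoin reduction branch is worth keeping.
     * Strict inequality matters: when the nDVs are equal on both sides,
     * the bloom/min-max filter cannot prune anything, so the branch
     * should be rejected rather than marked as good.
     *
     * @param ndvSmallSide distinct values on the semijoin (builder) side
     * @param ndvTsSide    distinct values on the table-scan (probe) side
     * @param factor       configurable threshold in (0, 1]; e.g. 0.5 keeps
     *                     the branch only if the small side has fewer than
     *                     half the distinct values of the scan side
     */
    static boolean isBeneficial(long ndvSmallSide, long ndvTsSide, double factor) {
        return ndvSmallSide < ndvTsSide * factor;
    }
}
```

With factor = 1.0 this reduces to the strict comparison the description asks for; a smaller factor rejects branches whose nDVs are only slightly below the TS side.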
[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths
[ https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226071#comment-16226071 ] Vihang Karajgaonkar commented on HIVE-17696:

Thanks [~Ferd]! Can you also please merge this patch to branch-2?

> Vectorized reader does not seem to be pushing down projection columns in certain code paths
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Fix For: 3.0.0
> Attachments: HIVE-17696.2.patch, HIVE-17696.patch
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}:
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for (int i = 0; i < columnNamesList.size(); i++) {
>     indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, columnNamesList, indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, columnNamesList, columnTypesList);
> }
>
> indexColumnsWanted = ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
>   requestedSchema = DataWritableReadSupport.getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
>
> this.reader = new ParquetFileReader(
>     configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} method.
> The else condition passes in fileSchema instead of using tableSchema as we do in the {{DataWritableReadSupport.init()}} method. Does this cause projection columns to be missed when we read Parquet files?
> We should probably just reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} method here.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
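The index-based projection being questioned can be illustrated in isolation. This is a plain-Java analogy, not Hive's actual getSchemaByIndex implementation: the requested schema is built by picking columns out of a base schema by position, which is why it matters whether tableSchema or fileSchema serves as that base when their column order or content differs.

```java
import java.util.ArrayList;
import java.util.List;

public class ProjectionByIndex {
    /**
     * Select the requested columns from a base schema by position,
     * analogous to projecting a Parquet MessageType by column index.
     * Projecting the same indexes against two different base schemas
     * can yield two different requested schemas.
     */
    static List<String> schemaByIndex(List<String> baseSchema, List<Integer> wanted) {
        List<String> requested = new ArrayList<>();
        for (int i : wanted) {
            requested.add(baseSchema.get(i)); // position-based, not name-based
        }
        return requested;
    }
}
```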
[jira] [Updated] (HIVE-17945) Support column projection for index access when using Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-17945:

Issue Type: Sub-task (was: Bug)
Parent: HIVE-14826

> Support column projection for index access when using Parquet Vectorization
> ---------------------------------------------------------------------------
>
> Key: HIVE-17945
> URL: https://issues.apache.org/jira/browse/HIVE-17945
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-17945.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths
[ https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226067#comment-16226067 ] Ferdinand Xu commented on HIVE-17696:

Thanks [~vihangk1] for pointing this out. I filed HIVE-17945 to address it.

> Vectorized reader does not seem to be pushing down projection columns in certain code paths
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Fix For: 3.0.0
> Attachments: HIVE-17696.2.patch, HIVE-17696.patch
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}:
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for (int i = 0; i < columnNamesList.size(); i++) {
>     indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, columnNamesList, indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, columnNamesList, columnTypesList);
> }
>
> indexColumnsWanted = ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
>   requestedSchema = DataWritableReadSupport.getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
>
> this.reader = new ParquetFileReader(
>     configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} method.
> The else condition passes in fileSchema instead of using tableSchema as we do in the {{DataWritableReadSupport.init()}} method. Does this cause projection columns to be missed when we read Parquet files?
> We should probably just reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} method here.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17945) Support column projection for index access when using Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-17945:

Status: Patch Available (was: Open)

> Support column projection for index access when using Parquet Vectorization
> ---------------------------------------------------------------------------
>
> Key: HIVE-17945
> URL: https://issues.apache.org/jira/browse/HIVE-17945
> Project: Hive
> Issue Type: Bug
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-17945.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17945) Support column projection for index access when using Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-17945:

Attachment: HIVE-17945.patch

> Support column projection for index access when using Parquet Vectorization
> ---------------------------------------------------------------------------
>
> Key: HIVE-17945
> URL: https://issues.apache.org/jira/browse/HIVE-17945
> Project: Hive
> Issue Type: Bug
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Attachments: HIVE-17945.patch

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Assigned] (HIVE-17945) Support column projection for index access when using Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-17945:

> Support column projection for index access when using Parquet Vectorization
> ---------------------------------------------------------------------------
>
> Key: HIVE-17945
> URL: https://issues.apache.org/jira/browse/HIVE-17945
> Project: Hive
> Issue Type: Bug
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-17853:

Target Version/s: 3.0.0, 2.4.0, 2.2.1
Status: Patch Available (was: Open)

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.2.patch, HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
> The {{RetryingMetaStoreClient}} is used to reconnect to the Hive metastore after a client timeout, transparently to the user.
> In the case of user impersonation (e.g. the Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}} to run a workflow), we find that on timeout the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations are attempted as the login user ({{oozie}}) rather than the effective user ({{mithun}}).
> We should have a fix for this shortly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
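The failure mode described here is generic to lazily-reconnecting proxies: if the reconnect path resolves the user once at connect time instead of capturing the caller's doAs() context at call time, the effective user silently reverts to the login user after a timeout. A minimal, Hive-independent sketch of the desired behavior follows; the ThreadLocal stands in for Hadoop's UserGroupInformation, and none of these names are Hive's actual classes:

```java
import java.util.function.Supplier;

public class RetryingClient {
    // Stand-in for UserGroupInformation: the identity a connection acts as.
    static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "oozie"); // login user

    static String connectionUser;

    static void reconnect() {
        // The buggy variant would resolve the login user here unconditionally.
        // The fix is to capture whatever effective (doAs) user is active at
        // the time of the call that triggered the reconnect.
        connectionUser = CURRENT_USER.get();
    }

    /** Analogue of UGI.doAs(): run an action as the given effective user. */
    static <T> T doAs(String user, Supplier<T> action) {
        String prev = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.get();
        } finally {
            CURRENT_USER.set(prev);
        }
    }

    /** Simulate a metastore call that hits a timeout and reconnects. */
    static String invokeWithRetry() {
        reconnect();           // transparent reconnect after timeout
        return connectionUser; // user the metastore call would run as
    }
}
```

Calling `doAs("mithun", RetryingClient::invokeWithRetry)` keeps the reconnected connection acting as the effective user, while a call outside any doAs() context falls back to the login user.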
[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-17853:

Attachment: HIVE-17853.01.patch

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.2.patch, HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
> The {{RetryingMetaStoreClient}} is used to reconnect to the Hive metastore after a client timeout, transparently to the user.
> In the case of user impersonation (e.g. the Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}} to run a workflow), we find that on timeout the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations are attempted as the login user ({{oozie}}) rather than the effective user ({{mithun}}).
> We should have a fix for this shortly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-17853:

Attachment: HIVE-17853.01-branch-2.patch

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.2.patch, HIVE-17853.01-branch-2.patch
>
> The {{RetryingMetaStoreClient}} is used to reconnect to the Hive metastore after a client timeout, transparently to the user.
> In the case of user impersonation (e.g. the Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}} to run a workflow), we find that on timeout the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations are attempted as the login user ({{oozie}}) rather than the effective user ({{mithun}}).
> We should have a fix for this shortly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-17853:

Attachment: HIVE-17853.01-branch-2.2.patch

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.2.patch, HIVE-17853.01-branch-2.patch
>
> The {{RetryingMetaStoreClient}} is used to reconnect to the Hive metastore after a client timeout, transparently to the user.
> In the case of user impersonation (e.g. the Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}} to run a workflow), we find that on timeout the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations are attempted as the login user ({{oozie}}) rather than the effective user ({{mithun}}).
> We should have a fix for this shortly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params
[ https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226042#comment-16226042 ] Akira Ajisaka commented on HIVE-8937:

Thanks! Updated.

> fix description of hive.security.authorization.sqlstd.confwhitelist.* params
> ----------------------------------------------------------------------------
>
> Key: HIVE-8937
> URL: https://issues.apache.org/jira/browse/HIVE-8937
> Project: Hive
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 0.14.0
> Reporter: Thejas M Nair
> Assignee: Akira Ajisaka
> Labels: TODOC14, TODOC3.0
> Fix For: 3.0.0
> Attachments: HIVE-8937.001.patch, HIVE-8937.002.patch
>
> The hive.security.authorization.sqlstd.confwhitelist.* param description in HiveConf is incorrect. The expected value is a single regex, not comma-separated regexes.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15552:

Attachment: HIVE-15552.patch

> unable to coalesce DATE and TIMESTAMP types
> -------------------------------------------
>
> Key: HIVE-15552
> URL: https://issues.apache.org/jira/browse/HIVE-15552
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: N Campbell
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Labels: timestamp
> Attachments: HIVE-15552.patch
>
> The COALESCE expression does not accept mixed DATE and TIMESTAMP types:
> select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from certtext.tdt
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Argument type mismatch 'cdt': The expressions after COALESCE should all have the same type: "date" is expected but "timestamp" is found
> SQLState: 42000
> ErrorCode: 4

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-15552:

Status: Patch Available (was: In Progress)

> unable to coalesce DATE and TIMESTAMP types
> -------------------------------------------
>
> Key: HIVE-15552
> URL: https://issues.apache.org/jira/browse/HIVE-15552
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: N Campbell
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Labels: timestamp
>
> The COALESCE expression does not accept mixed DATE and TIMESTAMP types:
> select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from certtext.tdt
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Argument type mismatch 'cdt': The expressions after COALESCE should all have the same type: "date" is expected but "timestamp" is found
> SQLState: 42000
> ErrorCode: 4

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Work started] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types
[ https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-15552 started by Jesus Camacho Rodriguez. -- > unable to coalesce DATE and TIMESTAMP types > --- > > Key: HIVE-15552 > URL: https://issues.apache.org/jira/browse/HIVE-15552 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: N Campbell >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Labels: timestamp > > COALESCE expression does not expect DATE and TIMESTAMP types > select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from > certtext.tdt > Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 > Argument type mismatch 'cdt': The expressions after COALESCE should all have > the same type: "date" is expected but "timestamp" is found > SQLState: 42000 > ErrorCode: 4 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-15402) LAG's PRECEDING does not work.
[ https://issues.apache.org/jira/browse/HIVE-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi resolved HIVE-15402. -- Resolution: Won't Fix > LAG's PRECEDING does not work. > -- > > Key: HIVE-15402 > URL: https://issues.apache.org/jira/browse/HIVE-15402 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Ryu Kobayashi > > The syntax in the following manual does not work: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-LAGspecifyingalagof3rowsanddefaultvalueof0 > {code} > SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY C ROWS 3 PRECEDING) > FROM T; > {code} > {code} > FAILED: SemanticException Failed to breakup Windowing invocations into > Groups. At least 1 group must only depend on input columns. Also check for > circular dependencies. > Underlying error: Expecting left window frame boundary for function > LAG((tok_table_or_col a), 3, 0) Window > Spec=[PartitioningSpec=[partitionColumns=[(tok_table_or_col > b)]orderColumns=[(tok_table_or_col c) ASC NULLS_FIRST]]window(start=range(3 > PRECEDING), end=currentRow)] as LAG_window_0 to be unbounded. Found : 3 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
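The error message says Hive expects the left window frame boundary for LAG to be unbounded, so the explicit `ROWS 3 PRECEDING` in the wiki example conflicts with the function's own lag offset. A sketch of that validation rule — simplified stand-in names, not Hive's actual windowing classes:

```java
// Sketch of the frame check behind the error above: LAG/LEAD already
// carry their own offset, so an explicit bounded left frame boundary
// (e.g. ROWS 3 PRECEDING) is rejected. Here null models "unbounded".
public class WindowFrameCheck {
    public static boolean isValidFrame(String function, Integer precedingRows) {
        boolean isLagLead = function.equals("LAG") || function.equals("LEAD");
        if (isLagLead) {
            return precedingRows == null; // frame start must be unbounded
        }
        return true; // aggregates like SUM may use a bounded frame
    }

    public static void main(String[] args) {
        System.out.println(isValidFrame("LAG", 3));    // false: the failing query
        System.out.println(isValidFrame("LAG", null)); // true: no explicit bound
    }
}
```

The "Won't Fix" resolution suggests the documented example, not the engine, was wrong: dropping the `ROWS 3 PRECEDING` clause and relying on LAG's own offset argument works.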
[jira] [Commented] (HIVE-15402) LAG's PRECEDING does not work.
[ https://issues.apache.org/jira/browse/HIVE-15402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226000#comment-16226000 ] Ryu Kobayashi commented on HIVE-15402: -- (y) > LAG's PRECEDING does not work. > -- > > Key: HIVE-15402 > URL: https://issues.apache.org/jira/browse/HIVE-15402 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Ryu Kobayashi > > The syntax in the following manual does not work: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-LAGspecifyingalagof3rowsanddefaultvalueof0 > {code} > SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY C ROWS 3 PRECEDING) > FROM T; > {code} > {code} > FAILED: SemanticException Failed to breakup Windowing invocations into > Groups. At least 1 group must only depend on input columns. Also check for > circular dependencies. > Underlying error: Expecting left window frame boundary for function > LAG((tok_table_or_col a), 3, 0) Window > Spec=[PartitioningSpec=[partitionColumns=[(tok_table_or_col > b)]orderColumns=[(tok_table_or_col c) ASC NULLS_FIRST]]window(start=range(3 > PRECEDING), end=currentRow)] as LAG_window_0 to be unbounded. Found : 3 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17926: - Attachment: HIVE-17926.2.patch > Support triggers for non-pool sessions > -- > > Key: HIVE-17926 > URL: https://issues.apache.org/jira/browse/HIVE-17926 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, > HIVE-17926.2.patch > > > Current trigger implementation works only with tez session pools. In case > when tez sessions pools are not used, a new session gets created for every > query in which case trigger validation does not happen. It will be good to > support such one-off session case as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17926: - Attachment: (was: HIVE-17926.2.patch) > Support triggers for non-pool sessions > -- > > Key: HIVE-17926 > URL: https://issues.apache.org/jira/browse/HIVE-17926 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, > HIVE-17926.2.patch > > > Current trigger implementation works only with tez session pools. In case > when tez sessions pools are not used, a new session gets created for every > query in which case trigger validation does not happen. It will be good to > support such one-off session case as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17926: - Attachment: HIVE-17926.2.patch > Support triggers for non-pool sessions > -- > > Key: HIVE-17926 > URL: https://issues.apache.org/jira/browse/HIVE-17926 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, > HIVE-17926.2.patch > > > Current trigger implementation works only with tez session pools. In case > when tez sessions pools are not used, a new session gets created for every > query in which case trigger validation does not happen. It will be good to > support such one-off session case as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17926: - Attachment: (was: HIVE-17926.2.patch) > Support triggers for non-pool sessions > -- > > Key: HIVE-17926 > URL: https://issues.apache.org/jira/browse/HIVE-17926 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, > HIVE-17926.2.patch > > > Current trigger implementation works only with tez session pools. In case > when tez sessions pools are not used, a new session gets created for every > query in which case trigger validation does not happen. It will be good to > support such one-off session case as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files
[ https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225970#comment-16225970 ] Jason Dere commented on HIVE-17939: --- +1 pending tests > Bucket map join not being selected when bucketed tables is missing bucket > files > --- > > Key: HIVE-17939 > URL: https://issues.apache.org/jira/browse/HIVE-17939 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-17939.1.patch > > > Looks like the following logic kicks in during > OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents > the table from being considered a proper bucketed table: > // The number of files for the table should be same as number of > // buckets. > if (fileNames.size() != 0 && fileNames.size() != numBuckets) { > return false; > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
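The check quoted above rejects any table whose file count differs from its declared bucket count, so a bucketed table with empty (and therefore missing) bucket files silently loses bucket map join. A sketch of the strict check alongside a relaxed variant in the direction the patch presumably takes — this is an illustration, not the actual patch:

```java
import java.util.Arrays;
import java.util.List;

public class BucketCheckSketch {
    // Original, strict check: the file count must equal the bucket count
    // (or the table must be empty).
    public static boolean strictCheck(List<String> fileNames, int numBuckets) {
        return fileNames.size() == 0 || fileNames.size() == numBuckets;
    }

    // Relaxed check: tolerate fewer files than buckets, since an empty
    // bucket may have no file on disk, but never allow more files.
    public static boolean relaxedCheck(List<String> fileNames, int numBuckets) {
        return fileNames.size() <= numBuckets;
    }

    public static void main(String[] args) {
        List<String> oneFile = Arrays.asList("000000_0");
        System.out.println(strictCheck(oneFile, 2));  // false: join disabled
        System.out.println(relaxedCheck(oneFile, 2)); // true: join allowed
    }
}
```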
[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions
[ https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17926: - Attachment: HIVE-17926.2.patch Differentiation between pool and non-pool sessions is not required, since both code paths register/unregister the session during open/close, respectively. The updated patch is simpler. > Support triggers for non-pool sessions > -- > > Key: HIVE-17926 > URL: https://issues.apache.org/jira/browse/HIVE-17926 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, > HIVE-17926.2.patch > > > Current trigger implementation works only with tez session pools. In case > when tez sessions pools are not used, a new session gets created for every > query in which case trigger validation does not happen. It will be good to > support such one-off session case as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225956#comment-16225956 ] Sergey Shelukhin commented on HIVE-17902: - See the parent JIRA for the description of workload management. We don't have a design doc fully fleshed out yet but we will post one shortly. > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, > HIVE-17902.02.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225951#comment-16225951 ] Alexander Kolbasov commented on HIVE-17902: --- This JIRA has a cryptic one-line description and 152K worth of patches. Is there some design doc elsewhere or some other more detailed description? > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, > HIVE-17902.02.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17916) remove ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED
[ https://issues.apache.org/jira/browse/HIVE-17916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225923#comment-16225923 ] Sergey Shelukhin commented on HIVE-17916: - I've already done this somewhere. Rather it's set on by default now > remove ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED > - > > Key: HIVE-17916 > URL: https://issues.apache.org/jira/browse/HIVE-17916 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Teddy Choi > > follow up from HIVE-12631. Filing so it doesn't get lost. > There is this code in UpdateDeleteSemanticAnalyzer > {noformat} > // TODO: remove when this is enabled everywhere > HiveConf.setBoolVar(conf, > ConfVars.HIVE_VECTORIZATION_ROW_IDENTIFIER_ENABLED, true); > {noformat} > The 1st update/delete statement on a session will enable this and it will be > enabled for all future queries which makes this flag useless/misleading. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
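The description explains why the flag is misleading: the semantic analyzer mutates the session-scoped conf, so the first UPDATE/DELETE flips the flag for every later query in that session, overriding whatever the user set. A small sketch of that sticky-flag behavior — the conf key string and method names here are illustrative, not Hive's actual ones:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the sticky session flag described above. Names are
// illustrative stand-ins, not Hive's actual conf API.
public class StickyFlagSketch {
    static final String FLAG = "hive.vectorized.row.identifier.enabled";

    // Mirrors the quoted TODO code in UpdateDeleteSemanticAnalyzer:
    // unconditionally force the flag on in the shared session conf.
    public static void analyzeUpdate(Map<String, String> sessionConf) {
        sessionConf.put(FLAG, "true");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FLAG, "false");  // user explicitly disabled the flag
        analyzeUpdate(conf);      // first UPDATE/DELETE of the session
        System.out.println(conf.get(FLAG)); // "true" for all later queries
    }
}
```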
[jira] [Updated] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files
[ https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-17939: -- Attachment: HIVE-17939.1.patch [~jdere] [~gopalv] can you please review? > Bucket map join not being selected when bucketed tables is missing bucket > files > --- > > Key: HIVE-17939 > URL: https://issues.apache.org/jira/browse/HIVE-17939 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-17939.1.patch > > > Looks like the following logic kicks in during > OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents > the table from being considered a proper bucketed table: > // The number of files for the table should be same as number of > // buckets. > if (fileNames.size() != 0 && fileNames.size() != numBuckets) { > return false; > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files
[ https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-17939: -- Status: Patch Available (was: In Progress) > Bucket map join not being selected when bucketed tables is missing bucket > files > --- > > Key: HIVE-17939 > URL: https://issues.apache.org/jira/browse/HIVE-17939 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > Looks like the following logic kicks in during > OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents > the table from being considered a proper bucketed table: > // The number of files for the table should be same as number of > // buckets. > if (fileNames.size() != 0 && fileNames.size() != numBuckets) { > return false; > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.12.patch > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, > HIVE-17458.11.patch, HIVE-17458.12.patch, HIVE-17458.12.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
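The description proposes that the reader compute a split's row offset by looking at the other 'original' files in the same logical tranche/bucket, as OrcRawRecordMerger does. A sketch of that offset computation with simplified stand-in types — not Hive's actual classes:

```java
import java.util.List;

// Sketch of the offset computation described above: the starting row
// offset for a split over an 'original' file is the sum of the row
// counts of the files that precede it in the same logical bucket.
public class OriginalFileOffsetSketch {
    public static long rowOffset(List<Long> rowCountsInBucket, int fileIndex) {
        long offset = 0;
        for (int i = 0; i < fileIndex; i++) {
            offset += rowCountsInBucket.get(i);
        }
        return offset;
    }

    public static void main(String[] args) {
        // Three original files in one bucket with 100, 250, and 40 rows:
        List<Long> counts = List.of(100L, 250L, 40L);
        System.out.println(rowOffset(counts, 2)); // 350: rows before file 2
    }
}
```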
[jira] [Updated] (HIVE-17928) branch-2.3 does not compile due to using incorrect storage-api version
[ https://issues.apache.org/jira/browse/HIVE-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-17928: --- Resolution: Fixed Fix Version/s: 2.3.2 Status: Resolved (was: Patch Available) Thanks [~prasanth_j]. I committed to branch-2.3 > branch-2.3 does not compile due to using incorrect storage-api version > -- > > Key: HIVE-17928 > URL: https://issues.apache.org/jira/browse/HIVE-17928 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.2 >Reporter: Sergio Peña >Assignee: Sergio Peña > Fix For: 2.3.2 > > Attachments: HIVE-17928.1-branch-2.3.patch, > HIVE-17928.1-branch2.3.patch > > > This is the error when building branch-2.3 > {noformat} > [INFO] Reactor Summary: > [INFO] > [INFO] Hive ... SUCCESS [ 1.401 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.091 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 2.299 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 1.026 > s] > [INFO] Hive Shims . SUCCESS [ 0.560 > s] > [INFO] Hive Common FAILURE [ 0.090 > s] > [INFO] Hive Service RPC ... SKIPPED > [INFO] Hive Serde . SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Spark Remote Client SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. 
SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 8.991 s > [INFO] Finished at: 2017-10-28T12:33:51-05:00 > [INFO] Final Memory: 67M/975M > [INFO] > > [ERROR] Failed to execute goal on project hive-common: Could not resolve > dependencies for project org.apache.hive:hive-common:jar:2.3.2-SNAPSHOT: > Could not find artifact org.apache.hive:hive-storage-api:jar:2.3.2-SNAPSHOT > -> [Help 1] > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Attachment: HIVE-17902.02.patch > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, > HIVE-17902.02.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.12.patch patch 12 should fix all the llap errors > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, > HIVE-17458.11.patch, HIVE-17458.12.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17902: Attachment: HIVE-17902.02.patch Rebasing and making sure that default is not a reserved keyword. [~harishjp] w.r.t. the non-reserved keyword list that you pointed out, I think it makes sense to add all the newly added keywords to that collection as part of the next patch... I added "default" as a keyword, reserved by default, and everything broke :) > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, > HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225804#comment-16225804 ] Sergey Shelukhin commented on HIVE-17458: - [~ekoifman] can you post a RB? > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, > HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, > HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, > HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, > HIVE-17458.11.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17928) branch-2.3 does not compile due to using incorrect storage-api version
[ https://issues.apache.org/jira/browse/HIVE-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225800#comment-16225800 ] Prasanth Jayachandran commented on HIVE-17928: -- +1 > branch-2.3 does not compile due to using incorrect storage-api version > -- > > Key: HIVE-17928 > URL: https://issues.apache.org/jira/browse/HIVE-17928 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.2 >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-17928.1-branch-2.3.patch, > HIVE-17928.1-branch2.3.patch > > > This is the error when building branch-2.3 > {noformat} > [INFO] Reactor Summary: > [INFO] > [INFO] Hive ... SUCCESS [ 1.401 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.091 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 2.299 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 1.026 > s] > [INFO] Hive Shims . SUCCESS [ 0.560 > s] > [INFO] Hive Common FAILURE [ 0.090 > s] > [INFO] Hive Service RPC ... SKIPPED > [INFO] Hive Serde . SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Spark Remote Client SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Llap External Client .. 
SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 8.991 s > [INFO] Finished at: 2017-10-28T12:33:51-05:00 > [INFO] Final Memory: 67M/975M > [INFO] > > [ERROR] Failed to execute goal on project hive-common: Could not resolve > dependencies for project org.apache.hive:hive-common:jar:2.3.2-SNAPSHOT: > Could not find artifact org.apache.hive:hive-storage-api:jar:2.3.2-SNAPSHOT > -> [Help 1] > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1
[ https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225795#comment-16225795 ] Sergey Shelukhin commented on HIVE-17902: - Adding default as keyword appears to break everything... hmm > add a notions of default pool and unmanaged mapping part 1 > -- > > Key: HIVE-17902 > URL: https://issues.apache.org/jira/browse/HIVE-17902 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17902.01.patch, HIVE-17902.patch > > > This is needed to map queries between WM and non-WM execution -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225773#comment-16225773 ] Sergey Shelukhin commented on HIVE-17940: - +1... not sure if tests are set up for branch-1 or how stale they are, might be worth it to wait for HiveQA > IllegalArgumentException when reading last row-group in an ORC stripe > - > > Key: HIVE-17940 > URL: https://issues.apache.org/jira/browse/HIVE-17940 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.3.0, 1.2.2 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17940.1-branch-1.2.patch, > HIVE-17940.1-branch-1.patch > > > (This is a backport of HIVE-10024 to {{branch-1.2}}, and {{branch-1}}.) > When the last row-group in an ORC stripe contains fewer records than > specified in {{$\{orc.row.index.stride\}}}, and if a column value is sparse > (i.e. mostly nulls), then one sees the following failure when reading the ORC > stripe: > {noformat} > java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA > to 130 is outside of the data > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for > column 82 kind DATA to 130 is outside of the data > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) > ... 14 more > {noformat} > [~sershe] had a fix for this in HIVE-10024, in {{branch-2}}. After running > into this in production with {{branch-1}}+, we find that the fix for > HIVE-10024 sorts this out in {{branch-1}} as well. > This is a fairly rare case, but it leads to bad reads on valid ORC files. I > will back-port this shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
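The geometry behind the bug can be sketched briefly: a stripe is indexed in groups of {{orc.row.index.stride}} rows, and the last group is usually shorter; a reader that assumes every group is stride-sized can seek past the end of a sparse column's DATA stream. A minimal illustration of the row-group arithmetic — not the actual fix:

```java
// Sketch of ORC row-group geometry: the last row-group of a stripe
// holds stripeRows % stride rows (when the row count is not a multiple
// of the stride), which is the case that triggered the bad seek.
public class RowGroupSketch {
    public static int numRowGroups(long stripeRows, int stride) {
        return (int) ((stripeRows + stride - 1) / stride); // ceil division
    }

    public static long lastGroupSize(long stripeRows, int stride) {
        long rem = stripeRows % stride;
        return rem == 0 ? stride : rem;
    }

    public static void main(String[] args) {
        // 25,000 rows with the default stride of 10,000:
        System.out.println(numRowGroups(25_000, 10_000));  // 3 row-groups
        System.out.println(lastGroupSize(25_000, 10_000)); // last holds 5000
    }
}
```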
[jira] [Updated] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17841: Attachment: HIVE-17841.05.patch Rebased on top of some recent changes on master > implement applying the resource plan > > > Key: HIVE-17841 > URL: https://issues.apache.org/jira/browse/HIVE-17841 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, > HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, > HIVE-17841.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17940: Attachment: HIVE-17940.1-branch-1.2.patch > IllegalArgumentException when reading last row-group in an ORC stripe > - > > Key: HIVE-17940 > URL: https://issues.apache.org/jira/browse/HIVE-17940 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.3.0, 1.2.2 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17940.1-branch-1.2.patch, > HIVE-17940.1-branch-1.patch > > > (This is a backport of HIVE-10024 to {{branch-1.2}}, and {{branch-1}}.) > When the last row-group in an ORC stripe contains fewer records than > specified in {{$\{orc.row.index.stride\}}}, and if a column value is sparse > (i.e. mostly nulls), then one sees the following failure when reading the ORC > stripe: > {noformat} > java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA > to 130 is outside of the data > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for > column 82 kind DATA to 130 is outside of the data > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) > ... 14 more > {noformat} > [~sershe] had a fix for this in HIVE-10024, in {{branch-2}}. After running > into this in production with {{branch-1}}+, we find that the fix for > HIVE-10024 sorts this out in {{branch-1}} as well. > This is a fairly rare case, but it leads to bad reads on valid ORC files. I > will back-port this shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
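The failure mode the HIVE-17940 report describes (a short final row group combined with a sparse, mostly-null column) can be sketched numerically. The following is an illustrative model only, with made-up numbers; it is not Hive's actual ORC reader code:

```java
// Illustrative sketch of the row-group arithmetic behind the
// "Seek in Stream ... is outside of the data" failure. Not Hive code.
public class RowGroupSketch {
    // Number of rows in the last row group of a stripe, given the index stride.
    static long rowsInLastGroup(long stripeRows, long stride) {
        long rem = stripeRows % stride;
        return rem == 0 ? stride : rem;
    }

    // A seek is valid only if the target offset lies within the stream's data.
    static boolean seekIsValid(long offset, long dataLength) {
        return offset >= 0 && offset < dataLength;
    }

    public static void main(String[] args) {
        // A stripe of 10,500 rows with the default 10,000-row stride leaves a
        // short final row group of 500 rows; for a sparse column, the index
        // entry for that group can point past the end of the DATA stream.
        System.out.println(rowsInLastGroup(10_500, 10_000)); // 500
        // Hypothetical lengths echoing the trace: seek to 130 in a stream
        // holding only 100 bytes of data is out of bounds.
        System.out.println(seekIsValid(130, 100)); // false
    }
}
```

The fix in HIVE-10024 makes the reader tolerate the short final group rather than trusting an index position that lies beyond the written data.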
[jira] [Updated] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17940: Status: Patch Available (was: Open) > IllegalArgumentException when reading last row-group in an ORC stripe > - > > Key: HIVE-17940 > URL: https://issues.apache.org/jira/browse/HIVE-17940 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.2, 1.3.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17940.1-branch-1.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17940: Attachment: HIVE-17940.1-branch-1.patch > IllegalArgumentException when reading last row-group in an ORC stripe > - > > Key: HIVE-17940 > URL: https://issues.apache.org/jira/browse/HIVE-17940 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.3.0, 1.2.2 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17940.1-branch-1.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17942) HiveAlterHandler not using conf from threadlocal
[ https://issues.apache.org/jira/browse/HIVE-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Janaki Lahorani reassigned HIVE-17942: -- Assignee: Janaki Lahorani > HiveAlterHandler not using conf from threadlocal > > > Key: HIVE-17942 > URL: https://issues.apache.org/jira/browse/HIVE-17942 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani > > When HiveAlterHandler looks up the conf, it does not get the one from the > thread-local, so thread-local changes are not visible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
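The HIVE-17942 report can be illustrated generically. The sketch below models the thread-local configuration pattern it refers to; class, method, and key names are hypothetical, not Hive's actual HiveAlterHandler or HiveConf code:

```java
// Generic illustration: per-thread config overrides are visible only when
// lookups consult the thread-local copy, not just the shared conf.
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalConfSketch {
    // Shared, process-wide configuration.
    private static final Map<String, String> GLOBAL = new HashMap<>();
    static {
        GLOBAL.put("hive.setting", "default");
    }

    // Per-thread copy, seeded from the global conf on first access.
    private static final ThreadLocal<Map<String, String>> LOCAL =
            ThreadLocal.withInitial(() -> new HashMap<>(GLOBAL));

    // Session-scoped override, visible only on the current thread.
    static void setLocal(String key, String value) {
        LOCAL.get().put(key, value);
    }

    // Correct lookup: consult the thread-local copy first.
    static String get(String key) {
        return LOCAL.get().getOrDefault(key, GLOBAL.get(key));
    }

    // The buggy pattern the report describes: reading only the shared conf
    // misses overrides made on the current thread.
    static String getFromGlobalOnly(String key) {
        return GLOBAL.get(key);
    }

    public static void main(String[] args) {
        setLocal("hive.setting", "override");
        System.out.println(get("hive.setting"));               // override
        System.out.println(getFromGlobalOnly("hive.setting")); // default
    }
}
```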
[jira] [Updated] (HIVE-17766) Support non-equi LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17766: --- Attachment: HIVE-17766.05.patch > Support non-equi LEFT SEMI JOIN > --- > > Key: HIVE-17766 > URL: https://issues.apache.org/jira/browse/HIVE-17766 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17766.01.patch, HIVE-17766.02.patch, > HIVE-17766.03.patch, HIVE-17766.04.patch, HIVE-17766.05.patch, > HIVE-17766.patch > > > Currently we get an error like {noformat}Non equality condition not supported > in Semi-Join{noformat} > This is required to generate a better plan for EXISTS/IN correlated subqueries, > where such queries are transformed into a LEFT SEMI JOIN. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17924) Restore SerDe by reverting HIVE-15167 to unbreak API compatibility
[ https://issues.apache.org/jira/browse/HIVE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225752#comment-16225752 ] Sergey Shelukhin commented on HIVE-17924: - Btw, that patch added a method to SerDe with a default implementation; to duplicate this functionality one would either need to add a method to the interface, which is actually more disruptive, and to more people (right now, anyone who is using AbstractSerDe is not affected); or add a bunch of instanceof checks everywhere SerDe, rather than AbstractSerDe, is used, which is an ugly hack that amply justifies the API change as described above (deprecated for years over 2 major versions, with a one-line fix for some users) > Restore SerDe by reverting HIVE-15167 to unbreak API compatibility > -- > > Key: HIVE-17924 > URL: https://issues.apache.org/jira/browse/HIVE-17924 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.0, 2.3.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > > HIVE-15167 broke compatibility badly for very little gain and caused a lot of > pain for our users. We should revert it and restore the SerDe interface. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
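The compatibility trade-off discussed in the comment above can be sketched in plain Java: adding a method to an interface with a *default* implementation lets existing implementors compile unchanged, with no instanceof checks at call sites. Interface and method names below are hypothetical, not Hive's actual SerDe API:

```java
// Sketch only: a default method added to an interface after it shipped.
interface Serializer {
    String serialize(Object row);

    // Capability added later; the default keeps old implementors
    // source- and binary-compatible.
    default boolean shouldStoreFieldsInMetastore() {
        return false;
    }
}

class LegacySerializer implements Serializer {
    // Written before the new method existed; still compiles and runs.
    @Override
    public String serialize(Object row) {
        return String.valueOf(row);
    }
}

public class DefaultMethodSketch {
    public static void main(String[] args) {
        Serializer s = new LegacySerializer();
        System.out.println(s.serialize(42));                  // 42
        System.out.println(s.shouldStoreFieldsInMetastore()); // false
    }
}
```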
[jira] [Commented] (HIVE-17924) Restore SerDe by reverting HIVE-15167 to unbreak API compatibility
[ https://issues.apache.org/jira/browse/HIVE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225746#comment-16225746 ] Sergey Shelukhin commented on HIVE-17924: - It's an interface that serves no useful purpose... APIs change, esp. between major versions (this was deprecated in 0.NN and removed in 2.3). The code change for users is expected over such a long timeframe and is a trivial one-line change. I don't see a specific use case for restoring it. > Restore SerDe by reverting HIVE-15167 to unbreak API compatibility > -- > > Key: HIVE-17924 > URL: https://issues.apache.org/jira/browse/HIVE-17924 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.0, 2.3.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > > HIVE-15167 broke compatibility badly for very little gain and caused a lot of > pain for our users. We should revert it and restore the SerDe interface. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1
[ https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225725#comment-16225725 ] Prasanth Jayachandran commented on HIVE-15016: -- All TestAcidOnTez tests are failing. cc/ [~ekoifman] > Run tests with Hadoop 3.0.0-beta1 > - > > Key: HIVE-15016 > URL: https://issues.apache.org/jira/browse/HIVE-15016 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 3.0.0 >Reporter: Sergio Peña >Assignee: Aihua Xu > Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, > HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, > HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.patch, > Hadoop3Upstream.patch > > > Hadoop 3.0.0-alpha1 was released back on Sep/16 to allow other components to run > tests against this new version before GA. > We should start running tests with Hive to validate compatibility against > Hadoop 3.0. > NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 > GA is released. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17924) Restore SerDe by reverting HIVE-15167 to unbreak API compatibility
[ https://issues.apache.org/jira/browse/HIVE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225726#comment-16225726 ] Owen O'Malley commented on HIVE-17924: -- The reason for keeping it is that removing it is painful for our users. For an API-breaking change there should be a very good reason. I believe the right fix is removing the deprecation. > Restore SerDe by reverting HIVE-15167 to unbreak API compatibility > -- > > Key: HIVE-17924 > URL: https://issues.apache.org/jira/browse/HIVE-17924 > Project: Hive > Issue Type: Bug >Affects Versions: 2.3.0, 2.3.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > > HIVE-15167 broke compatibility badly for very little gain and caused a lot of > pain for our users. We should revert it and restore the SerDe interface. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17940: Description: (This is a backport of HIVE-10024 to {{branch-1.2}}, and {{branch-1}}.) When the last row-group in an ORC stripe contains fewer records than specified in {{$\{orc.row.index.stride\}}}, and if a column value is sparse (i.e. mostly nulls), then one sees the following failure when reading the ORC stripe: {noformat} java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more {noformat} [~sershe] had a fix for this in HIVE-10024, in {{branch-2}}. After running into this in production with {{branch-1}}+, we find that the fix for HIVE-10024 sorts this out in {{branch-1}} as well. This is a fairly rare case, but it leads to bad reads on valid ORC files. I will back-port this shortly. was: (This is a backport of HIVE-10024 to {{branch-1.2}}, and {{branch-1}}.) When the last row-group in an ORC stripe contains fewer records than specified in {{\$\{orc.row.index.stride\}}}, and if a column value is sparse (i.e. mostly nulls), then one sees the following failure when reading the ORC stripe: {noformat} java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more {noformat} [~sershe] had a fix for this in HIVE-10024, in {{branch-2}}. After running into this in production with {{branch-1}}+, we find that the fix for HIVE-10024 sorts this out in {{branch-1}} as well. This is a fairly rare case, but it leads to bad reads on valid ORC files. I will back-port this shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225708#comment-16225708 ] Andrew Sherman commented on HIVE-17826: --- Thank you for the review and push > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Fix For: 3.0.0 > > Attachments: HIVE-17826.1.patch > > > We are seeing the error from HS2 process stdout. > {noformat} > 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > for appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing > Appender query-file-appender > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error > writing to RandomAccessFile > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103) > at > org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136) > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105) > at > org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89) > at > 
org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362) > at > org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79) > at > org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385) > at > org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:28) > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > 
at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.IOException: Stream Closed > at java.io.RandomAccessFile.writeBytes(Native Method) > at java.io.RandomAccessFile.write(RandomAccessFile.java:525) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:111) > ... 25 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe
[ https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17940: --- > IllegalArgumentException when reading last row-group in an ORC stripe > - > > Key: HIVE-17940 > URL: https://issues.apache.org/jira/browse/HIVE-17940 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.2, 1.3.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-17826: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks [~asherman] for the work. > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Fix For: 3.0.0 > > Attachments: HIVE-17826.1.patch -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17933) make antlr output directory to use a top-level sourceset
[ https://issues.apache.org/jira/browse/HIVE-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17933: Issue Type: Sub-task (was: Improvement) Parent: HIVE-17159 > make antlr output directory to use a top-level sourceset > > > Key: HIVE-17933 > URL: https://issues.apache.org/jira/browse/HIVE-17933 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17933.02.patch > > > By default ANTLR generates to: > ${project.build.directory}/generated-sources/antlr3 > but currently standalone-metastore adds > ${project.build.directory}/generated-sources to the sources list, because it > also contains protobuf output. > Not every IDE is picky about this, but Eclipse does show some errors > because of this -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17933) make antlr output directory to use a top-level sourceset
[ https://issues.apache.org/jira/browse/HIVE-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17933: Attachment: (was: HIVE-17933.01.patch) > make antlr output directory to use a top-level sourceset > > > Key: HIVE-17933 > URL: https://issues.apache.org/jira/browse/HIVE-17933 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17933.02.patch > > > By default ANTLR generates to: > ${project.build.directory}/generated-sources/antlr3 > but currently standalone-metastore adds > ${project.build.directory}/generated-sources to the sources list, because it > also contains protobuf output. > Not every IDE is picky about this, but Eclipse does show some errors > because of this -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17933) make antlr output directory to use a top-level sourceset
[ https://issues.apache.org/jira/browse/HIVE-17933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17933: Attachment: HIVE-17933.02.patch #2) * fix outputdir * also move version scripts output to src/gen/version instead of src/gen; src/gen/org caused a similar issue > make antlr output directory to use a top-level sourceset > > > Key: HIVE-17933 > URL: https://issues.apache.org/jira/browse/HIVE-17933 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17933.02.patch > > > By default ANTLR generates to: > ${project.build.directory}/generated-sources/antlr3 > but currently standalone-metastore adds > ${project.build.directory}/generated-sources to the sources list, because it > also contains protobuf output. > Not every IDE is picky about this, but Eclipse does show some errors > because of this -- This message was sent by Atlassian JIRA (v6.4.14#64029)
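For readers unfamiliar with the sourceset issue being discussed, the kind of change involved can be sketched as a Maven build fragment that registers only the ANTLR output directory as a source root, instead of the whole generated-sources directory. This is a hypothetical illustration using the standard build-helper-maven-plugin `add-source` goal; it is not taken from the attached patch.

```xml
<!-- Illustrative sketch: add only the ANTLR output directory as a source
     root, so the IDE does not see overlapping roots under generated-sources.
     Plugin coordinates/phase follow standard build-helper usage. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>add-antlr-sources</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <sources>
          <source>${project.build.directory}/generated-sources/antlr3</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>
```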
[jira] [Commented] (HIVE-17783) Hybrid Grace Hash Join has performance degradation for N-way join using Hive on Tez
[ https://issues.apache.org/jira/browse/HIVE-17783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225678#comment-16225678 ] Wei Zheng commented on HIVE-17783: -- [~Ferd] Sorry for the late reply. Yes, the spilling part is the bottleneck and there's no easy way to get around it. In your case, for the N-way joins, the optimizer's stats estimation may not be accurate, which makes the situation worse. Ultimately, the way to solve this problem is to have a reliable memory manager which can provide memory usage/quota at any moment. Right now we're following a conservative approach, which is to use a soft (possibly inaccurate) memory limit. That way we can avoid unnecessary spilling if there is enough memory for loading the hashtable. > Hybrid Grace Hash Join has performance degradation for N-way join using Hive > on Tez > --- > > Key: HIVE-17783 > URL: https://issues.apache.org/jira/browse/HIVE-17783 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 > Environment: 8*Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz > 1 master + 7 workers > TPC-DS at 3TB data scales > Hive version : 2.2.0 >Reporter: Ferdinand Xu > Attachments: Hybrid_Grace_Hash_Join.xlsx, screenshot-1.png > > > Most configurations use default values. The benchmark tests enabling versus > disabling hybrid grace hash join using TPC-DS queries at the 3TB > data scale. Many queries involving N-way joins showed performance degradation > across three test runs. Detailed results are attached. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
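The "soft memory limit" approach described in the comment can be sketched as a simple spill decision. This is an illustration only; the class and method names are hypothetical and do not reflect Hive's actual memory-manager API.

```java
// Illustrative sketch of a "soft" (possibly inaccurate) memory limit: spill a
// hash-table partition only when the usage estimate would exceed the limit,
// so the table stays in memory whenever there appears to be room for it.
class SoftMemoryLimit {
    private final long softLimitBytes;

    SoftMemoryLimit(long softLimitBytes) {
        this.softLimitBytes = softLimitBytes;
    }

    // estimatedUsedBytes may be inaccurate; the check is deliberately
    // conservative rather than exact.
    boolean shouldSpill(long estimatedUsedBytes, long incomingBytes) {
        return estimatedUsedBytes + incomingBytes > softLimitBytes;
    }

    public static void main(String[] args) {
        SoftMemoryLimit m = new SoftMemoryLimit(1024);
        System.out.println(m.shouldSpill(512, 256)); // fits under the limit: false
        System.out.println(m.shouldSpill(900, 256)); // would exceed it: true
    }
}
```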
[jira] [Commented] (HIVE-16736) General Improvements to BufferedRows
[ https://issues.apache.org/jira/browse/HIVE-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225677#comment-16225677 ] Hive QA commented on HIVE-16736: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894766/HIVE-16736.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11327 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=155) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=172) org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver (batchId=92) org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7561/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7561/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7561/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12894766 - PreCommit-HIVE-Build > General Improvements to BufferedRows > > > Key: HIVE-16736 > URL: https://issues.apache.org/jira/browse/HIVE-16736 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-16736.1.patch, HIVE-16736.1.patch > > > General improvements for {{BufferedRows.java}}: use {{ArrayList}} instead of > {{LinkedList}} to conserve memory for large data sets, avoid looping > through the entire data set twice in the {{normalizeWidths}} method, and some > simplifications. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
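The two ideas in the description — backing the buffer with an {{ArrayList}} and computing column widths in a single pass — can be sketched roughly as below. This is a minimal illustration, not the actual patch; the row representation is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BufferedRowsSketch {
    // ArrayList stores elements contiguously, avoiding LinkedList's
    // per-node object overhead for large result sets.
    private final List<String[]> rows = new ArrayList<>();

    void add(String[] row) {
        rows.add(row);
    }

    // Single pass over the buffered rows: track the max width per column
    // while iterating once, rather than looping over the data set twice.
    int[] columnWidths(int numColumns) {
        int[] widths = new int[numColumns];
        for (String[] row : rows) {
            for (int i = 0; i < numColumns && i < row.length; i++) {
                widths[i] = Math.max(widths[i], row[i].length());
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        BufferedRowsSketch b = new BufferedRowsSketch();
        b.add(new String[]{"id", "name"});
        b.add(new String[]{"1", "alice"});
        int[] w = b.columnWidths(2);
        System.out.println(w[0] + "," + w[1]); // prints "2,5"
    }
}
```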
[jira] [Updated] (HIVE-17918) NPE during semijoin reduction optimization when LLAP caching disabled
[ https://issues.apache.org/jira/browse/HIVE-17918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17918: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > NPE during semijoin reduction optimization when LLAP caching disabled > - > > Key: HIVE-17918 > URL: https://issues.apache.org/jira/browse/HIVE-17918 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Fix For: 3.0.0 > > Attachments: HIVE-17918.1.patch, HIVE-17918.2.patch > > > DynamicValue (used by semijoin reduction optimization) relies on the > ObjectCache. If LLAP cache is disabled then the DynamicValue is broken in > LLAP: > {noformat} > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:283) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:237) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:254) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:928) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) > ... 
18 more > Caused by: java.lang.IllegalStateException: Failed to retrieve dynamic value > for RS_25_household_demographics_hd_demo_sk_min > at > org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongColumnBetweenDynamicValue.evaluate(FilterLongColumnBetweenDynamicValue.java:80) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:39) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.FilterExprAndExpr.evaluate(FilterExprAndExpr.java:41) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:112) > at > org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:959) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:907) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:137) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:828) > ... 19 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:61) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:50) > at > org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieve(ObjectCacheWrapper.java:40) > at
[jira] [Commented] (HIVE-10024) LLAP: q file test is broken again
[ https://issues.apache.org/jira/browse/HIVE-10024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225670#comment-16225670 ] Mithun Radhakrishnan commented on HIVE-10024: - bq. this was a branch fix so I didn't think it affected a released version. Right, that makes sense. We figured as much from the {{Fix version/s}}. I thought I'd include the stack-trace here, for anyone who might have run into a similar problem. bq. Feel free to backport (on separate jira given its age ;)) Roger that. I'll port this back to {{branch-1}}, and {{branch-1.2}}. Thanks for this fix, [~sershe]! > LLAP: q file test is broken again > - > > Key: HIVE-10024 > URL: https://issues.apache.org/jira/browse/HIVE-10024 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17766) Support non-equi LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225668#comment-16225668 ] Ashutosh Chauhan commented on HIVE-17766: - +1 pending tests > Support non-equi LEFT SEMI JOIN > --- > > Key: HIVE-17766 > URL: https://issues.apache.org/jira/browse/HIVE-17766 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17766.01.patch, HIVE-17766.02.patch, > HIVE-17766.03.patch, HIVE-17766.04.patch, HIVE-17766.patch > > > Currently we get an error like {noformat}Non equality condition not supported > in Semi-Join{noformat} > This is required to generate better plan for EXISTS/IN correlated subquery > where such queries are transformed into LEFT SEMI JOIN. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
[ https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-17911: --- Status: Patch Available (was: Open) > org.apache.hadoop.hive.metastore.ObjectStore - Tune Up > -- > > Key: HIVE-17911 > URL: https://issues.apache.org/jira/browse/HIVE-17911 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17911.1.patch, HIVE-17911.2.patch > > > # Remove unused variables > # Add logging parameterization > # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection > emptiness checks (and always include a null check) > # Minor tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
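For context, {{CollectionUtils.isEmpty}}/{{isNotEmpty}} (Apache Commons Collections) are null-safe emptiness checks. The sketch below reimplements their behavior in plain Java to show what the call sites gain; it is illustrative only.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class EmptyCheckSketch {
    // Equivalent behavior to CollectionUtils.isEmpty: null-safe, so callers
    // no longer repeat "c != null && !c.isEmpty()" at every call site.
    static boolean isEmpty(Collection<?> c) {
        return c == null || c.isEmpty();
    }

    static boolean isNotEmpty(Collection<?> c) {
        return !isEmpty(c);
    }

    public static void main(String[] args) {
        List<String> none = null;
        List<String> some = Collections.singletonList("x");
        System.out.println(isEmpty(none));    // true: null is treated as empty
        System.out.println(isNotEmpty(some)); // true
    }
}
```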
[jira] [Updated] (HIVE-17834) Fix flaky triggers test
[ https://issues.apache.org/jira/browse/HIVE-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17834: - Attachment: HIVE-17834.5.patch The previous patches added a sleep to map tasks to give some time for publishing SHUFFLE_BYTES and validating the triggers. It looks like TaskCounters are published only after task completion. There was not much time between the last 2 reducers for trigger validation, hence the flakiness. This patch introduces some more reduce stages and slows down the final aggregation, so that the previous reduce stage's SHUFFLE_BYTES counter gets published and validated before the final aggregation completes. Tested this patch several times on Mac and CentOS (relatively slow) and it seems to work fine this time. Will try a couple more times on precommit before committing this patch. > Fix flaky triggers test > --- > > Key: HIVE-17834 > URL: https://issues.apache.org/jira/browse/HIVE-17834 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-17834.1.patch, HIVE-17834.2.patch, > HIVE-17834.3.patch, HIVE-17834.4.patch, HIVE-17834.4.patch, HIVE-17834.5.patch > > > https://issues.apache.org/jira/browse/HIVE-12631?focusedCommentId=16209803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16209803 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests
[ https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17908: -- Attachment: HIVE-17908.2.patch Pre-commit tests never ran. Re-attaching patch. > LLAP External client not correctly handling killTask for pending requests > - > > Key: HIVE-17908 > URL: https://issues.apache.org/jira/browse/HIVE-17908 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch > > > Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP > external client. > HIVE-17393 fixed some of these errors; however, the error also occurs because > the client does not correctly handle the killTask notification when the > request has been accepted but is still waiting for the first task heartbeat. In this > situation the client should retry the request, similar to what the LLAP AM > does. The current logic ignores the killTask in this situation, which results > in a heartbeat timeout, since no heartbeats are sent by LLAP after the > killTask notification. 
> {noformat} > 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, > cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received > reader event error: Timed out waiting for heartbeat for task ID > attempt_7739111832518812959_0005_0_00_10_0 > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at > 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: > LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0): > Error while attempting to read chunk length > at > org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read(BufferedInputStream.java:265) > at java.io.FilterInputStream.read(FilterInputStream.java:83) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142) > ... 22 more > Caused by: java.net.SocketException: Socket closed > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
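The fix described above — retry the request when a killTask arrives before the first heartbeat, instead of ignoring it — can be sketched as a tiny state machine. This is a hedged illustration; the states and method names are hypothetical, not the actual LLAP client code.

```java
// Sketch of the described behavior: a request is PENDING until the first
// task heartbeat arrives. A killTask received while still PENDING means LLAP
// will send no further heartbeats, so the client resubmits rather than
// waiting for a heartbeat timeout.
class KillTaskHandlingSketch {
    enum State { PENDING, RUNNING }

    State state = State.PENDING;
    int submissions = 1; // the initial request

    void onHeartbeat() {
        if (state == State.PENDING) {
            state = State.RUNNING;
        }
    }

    void onKillTask() {
        if (state == State.PENDING) {
            // No heartbeat yet: retry the request, as the LLAP AM would.
            submissions++;
        }
        // If already RUNNING, handle the kill normally (not shown).
    }

    public static void main(String[] args) {
        KillTaskHandlingSketch c = new KillTaskHandlingSketch();
        c.onKillTask();                    // kill before first heartbeat: retried
        System.out.println(c.submissions); // 2
        c.onHeartbeat();
        c.onKillTask();                    // kill after heartbeat: no retry
        System.out.println(c.submissions); // still 2
    }
}
```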
[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests
[ https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17908: -- Status: Patch Available (was: Open) > LLAP External client not correctly handling killTask for pending requests > - > > Key: HIVE-17908 > URL: https://issues.apache.org/jira/browse/HIVE-17908 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch > > > Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP > external client. > HIVE-17393 fixed some of these errors; however, the error also occurs because > the client does not correctly handle the killTask notification when the > request has been accepted but is still waiting for the first task heartbeat. In this > situation the client should retry the request, similar to what the LLAP AM > does. The current logic ignores the killTask in this situation, which results > in a heartbeat timeout, since no heartbeats are sent by LLAP after the > killTask notification. 
> {noformat} > 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, > cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received > reader event error: Timed out waiting for heartbeat for task ID > attempt_7739111832518812959_0005_0_00_10_0 > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at > 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: > LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0): > Error while attempting to read chunk length > at > org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read(BufferedInputStream.java:265) > at java.io.FilterInputStream.read(FilterInputStream.java:83) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142) > ... 22 more > Caused by: java.net.SocketException: Socket closed > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests
[ https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17908: -- Status: Open (was: Patch Available) > LLAP External client not correctly handling killTask for pending requests > - > > Key: HIVE-17908 > URL: https://issues.apache.org/jira/browse/HIVE-17908 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch > > > Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP > external client. > HIVE-17393 fixed some of these errors; however, the error also occurs because > the client does not correctly handle the killTask notification when the > request has been accepted but is still waiting for the first task heartbeat. In this > situation the client should retry the request, similar to what the LLAP AM > does. The current logic ignores the killTask in this situation, which results > in a heartbeat timeout, since no heartbeats are sent by LLAP after the > killTask notification. 
> {noformat} > 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, > cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received > reader event error: Timed out waiting for heartbeat for task ID > attempt_7739111832518812959_0005_0_00_10_0 > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121) > at > org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266) > at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211) > at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73) > at > org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at > 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: > LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0): > Error while attempting to read chunk length > at > org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read(BufferedInputStream.java:265) > at java.io.FilterInputStream.read(FilterInputStream.java:83) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267) > at > org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142) > ... 22 more > Caused by: java.net.SocketException: Socket closed > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17874: --- Attachment: HIVE-17874.07-branch-2.patch attaching branch-2 patch. > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 3.0.0 > > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch, HIVE-17874.03.patch, HIVE-17874.04.patch, > HIVE-17874.05.patch, HIVE-17874.06.patch, HIVE-17874.07-branch-2.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with an > {{unsupported type exception}}, even though the vectorized reader doesn't really > need to read the complex types into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17635) Add unit tests to CompactionTxnHandler and use PreparedStatements for queries
[ https://issues.apache.org/jira/browse/HIVE-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225602#comment-16225602 ] Sahil Takiar commented on HIVE-17635: - +1 > Add unit tests to CompactionTxnHandler and use PreparedStatements for queries > - > > Key: HIVE-17635 > URL: https://issues.apache.org/jira/browse/HIVE-17635 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-17635.1.patch, HIVE-17635.2.patch, > HIVE-17635.3.patch, HIVE-17635.4.patch, HIVE-17635.6.patch > > > It is better for jdbc code that runs against the HMS database to use > PreparedStatements. Convert CompactionTxnHandler queries to use > PreparedStatement and add tests to TestCompactionTxnHandler to test these > queries, and improve code coverage. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
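The kind of change described — converting string-concatenated SQL to PreparedStatements — often requires building a placeholder list for dynamic IN clauses, since the number of bound values varies per call. The sketch below shows that pattern; the table and column names are hypothetical, not the actual metastore schema.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PreparedInClause {
    // Build "... WHERE cq_id IN (?, ?, ?)" with one placeholder per value.
    // Values are later bound via PreparedStatement.setLong(i, v), never
    // concatenated into the SQL string.
    static String buildQuery(List<Long> ids) {
        String placeholders = String.join(", ", Collections.nCopies(ids.size(), "?"));
        return "SELECT cq_id, cq_state FROM COMPACTION_QUEUE WHERE cq_id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // prints "SELECT cq_id, cq_state FROM COMPACTION_QUEUE WHERE cq_id IN (?, ?, ?)"
        System.out.println(buildQuery(Arrays.asList(1L, 2L, 3L)));
    }
}
```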
[jira] [Updated] (HIVE-17750) add a flag to automatically create most tables as MM
[ https://issues.apache.org/jira/browse/HIVE-17750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17750: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the reviews! > add a flag to automatically create most tables as MM > - > > Key: HIVE-17750 > URL: https://issues.apache.org/jira/browse/HIVE-17750 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-17750.01.patch, HIVE-17750.patch > > > After the merge we are going to do another round of gap identification, similar > to HIVE-14990. However, the approach used there is a huge PITA. From the > perspective of spurious error elimination, it'd be much better to make tables > MM by default at create time, not pretend they are MM at check time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17748) ReplCopyTask doesn't support multi-file CopyWork
[ https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17748: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > ReplCopyTask doesn't support multi-file CopyWork > > > Key: HIVE-17748 > URL: https://issues.apache.org/jira/browse/HIVE-17748 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-17748.01.patch, HIVE-17748.patch > > > has > {noformat} > Path fromPath = work.getFromPaths()[0]; > toPath = work.getToPaths()[0]; > {noformat} > should this throw if from/to paths have > 1 element? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
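The entry above asks whether ReplCopyTask should throw when CopyWork carries more than one from/to path. A hypothetical sketch of such a guard (the method and class names here are illustrative; the actual ReplCopyTask/CopyWork API may differ):

```java
// Hypothetical guard for the single-path assumption quoted above
// ("Path fromPath = work.getFromPaths()[0];"); not Hive's actual code.
public class CopyPathGuard {
    // Fails fast instead of silently dropping paths beyond index 0.
    static String firstAndOnly(String[] paths, String label) {
        if (paths.length != 1) {
            throw new IllegalStateException(
                label + " expected exactly one path, got " + paths.length);
        }
        return paths[0];
    }

    public static void main(String[] args) {
        System.out.println(firstAndOnly(new String[]{"/warehouse/t1"}, "fromPaths"));
        try {
            firstAndOnly(new String[]{"/a", "/b"}, "toPaths");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```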
[jira] [Updated] (HIVE-17858) MM - some union cases are broken
[ https://issues.apache.org/jira/browse/HIVE-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17858: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > MM - some union cases are broken > > > Key: HIVE-17858 > URL: https://issues.apache.org/jira/browse/HIVE-17858 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: mm-gap-1 > Fix For: 3.0.0 > > Attachments: HIVE-17858.01.patch, HIVE-17858.02.patch, > HIVE-17858.patch > > > mm_all test no longer runs on LLAP; if it's executed in LLAP, one can see > that some union cases no longer work. > Queries on partunion_mm, skew_dp_union_mm produce no results. > I'm not sure what part of "integration" broke it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0
[ https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17698: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > FileSinkDesk.getMergeInputDirName() uses stmtId=0 > - > > Key: HIVE-17698 > URL: https://issues.apache.org/jira/browse/HIVE-17698 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-17698.01.patch, HIVE-17698.02.patch, > HIVE-17698.patch, HIVE-17698.patch > > > this is certainly wrong for multi statement txn but may also affect writes > from Union All queries if these are made to follow full Acid convention > _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17884) Implement create, alter and drop workload management triggers
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17884: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the patch and reviews! > Implement create, alter and drop workload management triggers > - > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Fix For: 3.0.0 > > Attachments: HIVE-17884.01.patch, HIVE-17884.02.patch, > HIVE-17884.03.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files
[ https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17939 started by Deepak Jaiswal. - > Bucket map join not being selected when bucketed tables is missing bucket > files > --- > > Key: HIVE-17939 > URL: https://issues.apache.org/jira/browse/HIVE-17939 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > Looks like the following logic kicks in during > OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents > the table from being considered a proper bucketed table: > // The number of files for the table should be same as number of > // buckets. > if (fileNames.size() != 0 && fileNames.size() != numBuckets) { > return false; > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
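The file-count condition quoted above can be seen in isolation with a standalone sketch (the method name `isBucketCountConsistent` is made up; only the `if` condition is taken from the entry). With 4 declared buckets but only 3 bucket files on disk, the check fails and bucket map join is not selected:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the check quoted from
// OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable();
// method and class names here are hypothetical.
public class BucketCheckSketch {
    // Returns false when the table has files but their count does not
    // match the declared bucket count -- exactly the quoted condition.
    static boolean isBucketCountConsistent(List<String> fileNames, int numBuckets) {
        if (fileNames.size() != 0 && fileNames.size() != numBuckets) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Declared 4 buckets, but one bucket file was never written:
        List<String> files = Arrays.asList("000000_0", "000001_0", "000002_0");
        System.out.println(isBucketCountConsistent(files, 4)); // false
        System.out.println(isBucketCountConsistent(files, 3)); // true
        // An empty table passes, since size() == 0 short-circuits the check.
        System.out.println(isBucketCountConsistent(Collections.emptyList(), 4)); // true
    }
}
```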
[jira] [Updated] (HIVE-17938) Enable parallel query compilation in HS2
[ https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-17938: - Attachment: HIVE-17938.2.patch 2.patch - fixed description > Enable parallel query compilation in HS2 > > > Key: HIVE-17938 > URL: https://issues.apache.org/jira/browse/HIVE-17938 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-17938.1.patch, HIVE-17938.2.patch > > > This (hive.driver.parallel.compilation) has been enabled in many production > environments for a while (Hortonworks customers), and it has been stable. > Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files
[ https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-17939: - > Bucket map join not being selected when bucketed tables is missing bucket > files > --- > > Key: HIVE-17939 > URL: https://issues.apache.org/jira/browse/HIVE-17939 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > Looks like the following logic kicks in during > OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents > the table from being considered a proper bucketed table: > // The number of files for the table should be same as number of > // buckets. > if (fileNames.size() != 0 && fileNames.size() != numBuckets) { > return false; > } -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
[ https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225573#comment-16225573 ] Deepak Jaiswal commented on HIVE-17936: --- [~jdere] [~ashutoshc] can you please review? > Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin > branches > > > Key: HIVE-17936 > URL: https://issues.apache.org/jira/browse/HIVE-17936 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-17936.1.patch > > > In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not > have equality as there is a chance that the values are same on both sides and > the branch is still marked as good when it shouldn't be. > Add a configurable factor to see how useful this is if nDVs on smaller side > are only slightly less than that on TS side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
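The description above proposes dropping equality from the nDV comparison and adding a configurable factor. A hedged sketch of that decision (the method name and factor semantics are assumptions for illustration, not the actual HIVE-17936 patch):

```java
// Illustrative sketch of the nDV comparison described above; the real
// markSemiJoinForDPP logic in Hive differs, and the names are made up.
public class SemiJoinNdvSketch {
    // Keep the semijoin branch only if the small side's distinct-value
    // count, scaled by a configurable safety factor, is strictly below
    // the table-scan side's -- equality no longer qualifies.
    static boolean keepSemiJoinBranch(long ndvSmallSide, long ndvTsSide, double factor) {
        return ndvSmallSide * factor < ndvTsSide;
    }

    public static void main(String[] args) {
        // Equal nDVs: with the old <= comparison this branch was kept;
        // with strict < it is dropped.
        System.out.println(keepSemiJoinBranch(1000, 1000, 1.0)); // false
        // Only-slightly-smaller side is also dropped once the factor applies.
        System.out.println(keepSemiJoinBranch(950, 1000, 1.1));  // false
        // A genuinely selective small side still keeps the branch.
        System.out.println(keepSemiJoinBranch(500, 1000, 1.1));  // true
    }
}
```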
[jira] [Updated] (HIVE-17884) Implement create, alter and drop workload management triggers
[ https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17884: Summary: Implement create, alter and drop workload management triggers (was: Implement create, alter and drop workload management triggers.) > Implement create, alter and drop workload management triggers > - > > Key: HIVE-17884 > URL: https://issues.apache.org/jira/browse/HIVE-17884 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17884.01.patch, HIVE-17884.02.patch, > HIVE-17884.03.patch > > > Implement triggers for workload management: > The commands to be implemented: > CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action; > condition is a boolean expression: variable operator value types with 'AND' > and 'OR' support. > action is currently: KILL or MOVE TO pool; > ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action; > DROP TRIGGER `plan_name`.`trigger_name`; > Also add WM_TRIGGERS to information schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17938) Enable parallel query compilation in HS2
[ https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225563#comment-16225563 ] Sergey Shelukhin commented on HIVE-17938: - nit: the comment is still saying that the default is false. I think this part about the default value should be removed altogether. Can be fixed on commit; +1 pending tests > Enable parallel query compilation in HS2 > > > Key: HIVE-17938 > URL: https://issues.apache.org/jira/browse/HIVE-17938 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-17938.1.patch > > > This (hive.driver.parallel.compilation) has been enabled in many production > environments for a while (Hortonworks customers), and it has been stable. > Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-10024) LLAP: q file test is broken again
[ https://issues.apache.org/jira/browse/HIVE-10024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225556#comment-16225556 ] Sergey Shelukhin edited comment on HIVE-10024 at 10/30/17 7:03 PM: --- Sorry, this was a branch fix so I didn't think it affected a released version. Feel free to backport (on separate jira given its age ;)) was (Author: sershe): Sorry, this was a branch fix so I didn't think it affected the a released version. Feel free to backport (on separate jira given its age ;)) > LLAP: q file test is broken again > - > > Key: HIVE-10024 > URL: https://issues.apache.org/jira/browse/HIVE-10024 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-10024) LLAP: q file test is broken again
[ https://issues.apache.org/jira/browse/HIVE-10024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225556#comment-16225556 ] Sergey Shelukhin commented on HIVE-10024: - Sorry, this was a branch fix so I didn't think it affected a released version. Feel free to backport (on separate jira given its age ;)) > LLAP: q file test is broken again > - > > Key: HIVE-10024 > URL: https://issues.apache.org/jira/browse/HIVE-10024 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
[ https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-17936: -- Attachment: HIVE-17936.1.patch Added the config and a test for it along with the fix. > Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin > branches > > > Key: HIVE-17936 > URL: https://issues.apache.org/jira/browse/HIVE-17936 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-17936.1.patch > > > In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not > have equality as there is a chance that the values are same on both sides and > the branch is still marked as good when it shouldn't be. > Add a configurable factor to see how useful this is if nDVs on smaller side > are only slightly less than that on TS side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
[ https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-17936: -- Status: Patch Available (was: In Progress) > Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin > branches > > > Key: HIVE-17936 > URL: https://issues.apache.org/jira/browse/HIVE-17936 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not > have equality as there is a chance that the values are same on both sides and > the branch is still marked as good when it shouldn't be. > Add a configurable factor to see how useful this is if nDVs on smaller side > are only slightly less than that on TS side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches
[ https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17936 started by Deepak Jaiswal. - > Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin > branches > > > Key: HIVE-17936 > URL: https://issues.apache.org/jira/browse/HIVE-17936 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > In method markSemiJoinForDPP (HIVE-17399), the nDVs comparison should not > have equality as there is a chance that the values are same on both sides and > the branch is still marked as good when it shouldn't be. > Add a configurable factor to see how useful this is if nDVs on smaller side > are only slightly less than that on TS side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17938) Enable parallel query compilation in HS2
[ https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225544#comment-16225544 ] Thejas M Nair commented on HIVE-17938: -- [~sershe] can you please review ? > Enable parallel query compilation in HS2 > > > Key: HIVE-17938 > URL: https://issues.apache.org/jira/browse/HIVE-17938 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-17938.1.patch > > > This (hive.driver.parallel.compilation) has been enabled in many production > environments for a while (Hortonworks customers), and it has been stable. > Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17938) Enable parallel query compilation in HS2
[ https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-17938: - Attachment: HIVE-17938.1.patch > Enable parallel query compilation in HS2 > > > Key: HIVE-17938 > URL: https://issues.apache.org/jira/browse/HIVE-17938 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-17938.1.patch > > > This (hive.driver.parallel.compilation) has been enabled in many production > environments for a while (Hortonworks customers), and it has been stable. > Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17765) expose Hive keywords
[ https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225541#comment-16225541 ] Sergey Shelukhin commented on HIVE-17765: - The keywords are added to the original get-info API, which can also return the product name, version, and other assorted stuff. It's aimed at JDBC/ODBC driver developers, and is also somewhat standard so it doesn't make sense to document this change without having the whole thing documented. > expose Hive keywords > - > > Key: HIVE-17765 > URL: https://issues.apache.org/jira/browse/HIVE-17765 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, > HIVE-17765.03.patch, HIVE-17765.nogen.patch, HIVE-17765.patch > > > This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on > SQL capabilities of Hive -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17938) Enable parallel query compilation in HS2
[ https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-17938: > Enable parallel query compilation in HS2 > > > Key: HIVE-17938 > URL: https://issues.apache.org/jira/browse/HIVE-17938 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > This (hive.driver.parallel.compilation) has been enabled in many production > environments for a while (Hortonworks customers), and it has been stable. > Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-10024) LLAP: q file test is broken again
[ https://issues.apache.org/jira/browse/HIVE-10024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225540#comment-16225540 ] Mithun Radhakrishnan edited comment on HIVE-10024 at 10/30/17 6:53 PM: --- (I have put this off long enough. Sorry for the delay in updating this JIRA. We ran into this a good while back in production.) The sparse summary and description belie the significance of this fix. It turns out that this wasn't a fix for test-failures at all. Before this fix, in cases where the last row-group for an ORC stripe contained fewer records than {{$\{orc.row.index.stride\}}}, and when predicate pushdown is enabled, one sees the following sort of failure: {noformat} java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at 
java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more {noformat} It's a fairly rare case, but leads to bad reads on valid ORC files. This fix is available in {{branch-2}} and forward, but not in {{branch-1}}. was (Author: mithun): I have put this off long enough. The sparse summary and description belie the significance of this fix. It turns out that this wasn't a fix for test-failures at all. Before this fix, in cases where the last row-group for an ORC stripe contained fewer records than {{$\{orc.row.index.stride\}}}, and when predicate pushdown is enabled, one sees the following sort of failure: {noformat} java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more {noformat}
[jira] [Commented] (HIVE-10024) LLAP: q file test is broken again
[ https://issues.apache.org/jira/browse/HIVE-10024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225540#comment-16225540 ] Mithun Radhakrishnan commented on HIVE-10024: - I have put this off long enough. The sparse summary and description belie the significance of this fix. It turns out that this wasn't a fix for test-failures at all. Before this fix, in cases where the last row-group for an ORC stripe contained fewer records than {{$\{orc.row.index.stride\}}}, and when predicate pushdown is enabled, one sees the following sort of failure: {noformat} java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1738) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.lang.IllegalArgumentException: Seek in Stream for column 82 kind DATA to 130 is outside of the data at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:322) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more {noformat} It's a fairly rare case, but leads to bad reads on valid ORC files. This fix is available in {{branch-2}} and forward, but not in {{branch-1}}. > LLAP: q file test is broken again > - > > Key: HIVE-10024 > URL: https://issues.apache.org/jira/browse/HIVE-10024 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
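The trigger condition above — a final ORC row group holding fewer rows than {{orc.row.index.stride}} — can be made concrete with a little arithmetic. A sketch (the 10,000 stride matches ORC's documented default; the class and method names are illustrative):

```java
// Sketch: when does an ORC stripe end with a short final row group?
// The bug described above bit only when the remainder is non-zero
// (row count not an exact multiple of the index stride) and predicate
// pushdown seeks into that short group.
public class RowGroupSketch {
    // Returns 0 when every row group is full-sized; otherwise the
    // number of rows in the short final group.
    static long lastRowGroupSize(long rowsInStripe, long rowIndexStride) {
        return rowsInStripe % rowIndexStride;
    }

    public static void main(String[] args) {
        long stride = 10_000; // ORC's default orc.row.index.stride
        System.out.println(lastRowGroupSize(25_000, stride)); // 5000: short final group, bug reachable
        System.out.println(lastRowGroupSize(30_000, stride)); // 0: no short group, bug not triggered
    }
}
```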
[jira] [Updated] (HIVE-17766) Support non-equi LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17766: --- Attachment: HIVE-17766.04.patch > Support non-equi LEFT SEMI JOIN > --- > > Key: HIVE-17766 > URL: https://issues.apache.org/jira/browse/HIVE-17766 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17766.01.patch, HIVE-17766.02.patch, > HIVE-17766.03.patch, HIVE-17766.04.patch, HIVE-17766.patch > > > Currently we get an error like {noformat}Non equality condition not supported > in Semi-Join{noformat} > This is required to generate better plan for EXISTS/IN correlated subquery > where such queries are transformed into LEFT SEMI JOIN. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
[ https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-17911: --- Attachment: HIVE-17911.2.patch > org.apache.hadoop.hive.metastore.ObjectStore - Tune Up > -- > > Key: HIVE-17911 > URL: https://issues.apache.org/jira/browse/HIVE-17911 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17911.1.patch, HIVE-17911.2.patch > > > # Remove unused variables > # Add logging parameterization > # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection > empty check (and always use null check) > # Minor tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
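Item 3 of the tune-up list above refers to Apache Commons' {{CollectionUtils.isEmpty}}/{{isNotEmpty}}. A standalone sketch of the equivalent null-safe check (plain JDK only, so it runs without the Commons dependency; this is not Hive's actual ObjectStore code):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Sketch of the null-safe empty check that CollectionUtils.isEmpty
// provides, showing why "always use null check" is unified into one call.
public class EmptyCheckSketch {
    static boolean isEmpty(Collection<?> c) {
        return c == null || c.isEmpty();
    }

    public static void main(String[] args) {
        List<String> none = null;
        // The naive none.isEmpty() would throw NullPointerException here;
        // the null-safe form handles null and empty uniformly.
        System.out.println(isEmpty(none));                           // true
        System.out.println(isEmpty(Collections.emptyList()));        // true
        System.out.println(isEmpty(Collections.singletonList("x"))); // false
    }
}
```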
[jira] [Commented] (HIVE-14247) Disable parallel query execution within a session
[ https://issues.apache.org/jira/browse/HIVE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225538#comment-16225538 ] Thejas M Nair commented on HIVE-14247: -- It's been a while since ptest ran on this, running it once more before commit. > Disable parallel query execution within a session > - > > Key: HIVE-14247 > URL: https://issues.apache.org/jira/browse/HIVE-14247 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14247.1.patch, HIVE-14247.1.patch > > > HIVE-11402 leaves the parallel compilation enabled within a session. > This is a patch for those who want it to be disabled by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-14247) Disable parallel query execution within a session
[ https://issues.apache.org/jira/browse/HIVE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-14247: - Attachment: HIVE-14247.1.patch > Disable parallel query execution within a session > - > > Key: HIVE-14247 > URL: https://issues.apache.org/jira/browse/HIVE-14247 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14247.1.patch, HIVE-14247.1.patch > > > HIVE-11402 leaves the parallel compilation enabled within a session. > This is a patch for those who want it to be disabled by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17911) org.apache.hadoop.hive.metastore.ObjectStore - Tune Up
[ https://issues.apache.org/jira/browse/HIVE-17911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HIVE-17911: --- Status: Open (was: Patch Available) > org.apache.hadoop.hive.metastore.ObjectStore - Tune Up > -- > > Key: HIVE-17911 > URL: https://issues.apache.org/jira/browse/HIVE-17911 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HIVE-17911.1.patch, HIVE-17911.2.patch > > > # Remove unused variables > # Add logging parameterization > # Use CollectionUtils.isEmpty/isNotEmpty to simplify and unify collection > empty check (and always use null check) > # Minor tweaks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17937) llap_acid_test is flaky
[ https://issues.apache.org/jira/browse/HIVE-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17937: --- > llap_acid_test is flaky > --- > > Key: HIVE-17937 > URL: https://issues.apache.org/jira/browse/HIVE-17937 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Teddy Choi > > See for example > https://builds.apache.org/job/PreCommit-HIVE-Build/7521/testReport/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_llap_acid_fast_/history/ > (the history link is the same from any build number with a test run, just > replace 7521 if this one expires). > Looks like results change, which may not be good. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17912) org.apache.hadoop.hive.metastore.security.DBTokenStore - Parameterize Logging
[ https://issues.apache.org/jira/browse/HIVE-17912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225528#comment-16225528 ] BELUGA BEHR commented on HIVE-17912: Unrelated failures > org.apache.hadoop.hive.metastore.security.DBTokenStore - Parameterize Logging > - > > Key: HIVE-17912 > URL: https://issues.apache.org/jira/browse/HIVE-17912 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-17912.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)