[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627582#comment-14627582 ]

Lefty Leverenz commented on HIVE-10114:

Very nice doc, [~gopalv]. I removed the TODOC1.2 label.

Here's a link to the doc:

* [Configuration Properties -- hive.exec.orc.split.strategy | https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27842758#ConfigurationProperties-hive.exec.orc.split.strategy]

Split strategies for ORC

Key: HIVE-10114
URL: https://issues.apache.org/jira/browse/HIVE-10114
Project: Hive
Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Fix For: 1.2.0
Attachments: HIVE-10114.1.patch, HIVE-10114.2.patch, HIVE-10114.3.patch, HIVE-10114.4.patch, HIVE-10114.5.patch

ORC split generation does not have clearly defined strategies for different scenarios (many small ORC files, few small ORC files, many large files, etc.). A few strategies, such as storing the file footer in the ORC split or making an entire file a single ORC split, already exist. This JIRA aims to make split generation simpler, to support different strategies for various use cases (BI, ETL, ACID, etc.), and to lay the foundation for HIVE-7428.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
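The scenarios named in the issue description (many small files, few small files, many large files) could map onto strategies roughly as in the sketch below. This is an illustrative Python sketch only, not Hive's implementation: `SplitStrategy`, `choose_split_strategy`, and the parameters `max_split_bytes` and `many_files_threshold` are hypothetical names, and the decision rule and thresholds are assumptions. It also assumes that BI means "treat each file as one split without reading footers" and ETL means "read footers before generating splits".

```python
from enum import Enum


class SplitStrategy(Enum):
    BI = "BI"    # assumed: one split per file, no footer reads
    ETL = "ETL"  # assumed: read file footers, then generate splits


def choose_split_strategy(file_sizes, max_split_bytes, many_files_threshold):
    """Pick a strategy from the file-size profile (hypothetical rule)."""
    if not file_sizes:
        # Nothing to read, so avoid any footer work.
        return SplitStrategy.BI
    average = sum(file_sizes) / len(file_sizes)
    if average > max_split_bytes:
        # Large files: footer reads pay off because files must be subdivided.
        return SplitStrategy.ETL
    if len(file_sizes) > many_files_threshold:
        # Many small files: reading every footer would dominate planning time.
        return SplitStrategy.BI
    # Few small files: footer reads are cheap.
    return SplitStrategy.ETL


MB = 1024 * 1024
print(choose_split_strategy([4 * MB] * 500, 256 * MB, 100).value)    # many small files
print(choose_split_strategy([4 * MB] * 3, 256 * MB, 100).value)      # few small files
print(choose_split_strategy([1024 * MB] * 50, 256 * MB, 100).value)  # many large files
```

Keeping the choice in one function that looks only at file sizes and counts is a design assumption made here for readability; the actual patch may structure the decision differently.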
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625880#comment-14625880 ]

Gopal V commented on HIVE-10114:

[~leftylev]: doc'd on the cwiki, please confirm remove TODOC.
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483855#comment-14483855 ]

Lefty Leverenz commented on HIVE-10114:

No doc needed? This adds *hive.exec.orc.split.strategy* to HiveConf.java but the description says it is not a user level config. Is it an admin level config or just for internal use?
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394180#comment-14394180 ]

Hive QA commented on HIVE-10114:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12709048/HIVE-10114.5.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8693 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3260/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3260/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3260/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12709048 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392768#comment-14392768 ]

Hive QA commented on HIVE-10114:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12708835/HIVE-10114.4.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8693 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3250/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3250/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3250/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12708835 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390124#comment-14390124 ]

Hive QA commented on HIVE-10114:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12708528/HIVE-10114.3.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8692 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3231/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3231/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3231/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12708528 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10114) Split strategies for ORC
[ https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385188#comment-14385188 ]

Hive QA commented on HIVE-10114:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707872/HIVE-10114.2.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8679 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_where_partitioned
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_where_partitioned
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3191/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3191/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3191/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707872 - PreCommit-HIVE-TRUNK-Build