[jira] [Commented] (HIVE-5179) Wincompat : change script tests from bash to sh
[ https://issues.apache.org/jira/browse/HIVE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754425#comment-13754425 ] Hive QA commented on HIVE-5179: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600669/HIVE-5179.patch {color:green}SUCCESS:{color} +1 2902 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/567/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/567/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Wincompat : change script tests from bash to sh --- Key: HIVE-5179 URL: https://issues.apache.org/jira/browse/HIVE-5179 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5179.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4601) WebHCat needs to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4601: Summary: WebHCat needs to support proxy users (was: WebHCat need to support proxy users) WebHCat needs to support proxy users Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the doAs argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but does not work for Templeton, as Templeton does not support proxy users. Hence, the request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
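The use case above boils down to one extra query parameter on each WebHCat (Templeton) REST call. Here is a minimal sketch of that call pattern in Java, assuming WebHCat's default port 50111 and the doAs query parameter this issue introduces; the host, gateway user, and end-user names are illustrative placeholders:
{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHCatDoAsCall {
    public static void main(String[] args) throws Exception {
        // The gateway authenticates as itself (user.name) and asserts the
        // end user's identity via doAs, mirroring what already works for
        // WebHDFS and Oozie.
        URL url = new URL("http://webhcat-host:50111/templeton/v1/ddl/database"
                + "?user.name=gateway&doAs=enduser");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // JSON listing of databases
        }
        in.close();
    }
}
{code}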
[jira] [Updated] (HIVE-4601) WebHCat need to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4601: Summary: WebHCat need to support proxy users (was: WebHCat, Templeton need to support proxy users) WebHCat need to support proxy users --- Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the doAs argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but does not work for Templeton, as Templeton does not support proxy users. Hence, the request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4601) WebHCat needs to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4601: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Eugene for the contribution! WebHCat needs to support proxy users Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the doAs argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but does not work for Templeton, as Templeton does not support proxy users. Hence, the request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4617) ExecuteStatementAsync call to run a query in non-blocking mode
[ https://issues.apache.org/jira/browse/HIVE-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754449#comment-13754449 ] Thejas M Nair commented on HIVE-4617: - [~cwsteinbach] Thanks for your review comments. I agree with Vaibhav, I think it makes sense to address them in this jira. ExecuteStatementAsync call to run a query in non-blocking mode -- Key: HIVE-4617 URL: https://issues.apache.org/jira/browse/HIVE-4617 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Jaideep Dhok Assignee: Vaibhav Gumashta Attachments: HIVE-4617.D12417.1.patch, HIVE-4617.D12417.2.patch, HIVE-4617.D12417.3.patch, HIVE-4617.D12417.4.patch, HIVE-4617.D12417.5.patch, HIVE-4617.D12417.6.patch, HIVE-4617.D12507.1.patch, HIVE-4617.D12507Test.1.patch Provide a way to run queries asynchronously. The current executeStatement call blocks until the query run is complete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
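For context, the non-blocking mode under discussion amounts to a submit-then-poll flow at the Thrift level. A rough sketch, assuming the runAsync flag this work adds to TExecuteStatementReq and the existing TGetOperationStatus call; session setup and error handling are omitted:
{code:java}
import org.apache.hive.service.cli.thrift.*;

public class AsyncExecuteSketch {
    // Submit the statement without blocking; keep the handle for polling.
    static TOperationHandle submit(TCLIService.Iface client,
                                   TSessionHandle session,
                                   String sql) throws Exception {
        TExecuteStatementReq req = new TExecuteStatementReq(session, sql);
        req.setRunAsync(true); // the flag under review: return immediately
        return client.ExecuteStatement(req).getOperationHandle();
    }

    // Poll until the server reports a terminal state.
    static boolean isFinished(TCLIService.Iface client,
                              TOperationHandle op) throws Exception {
        TGetOperationStatusReq req = new TGetOperationStatusReq(op);
        TGetOperationStatusResp resp = client.GetOperationStatus(req);
        return resp.getOperationState() == TOperationState.FINISHED_STATE;
    }
}
{code}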
[jira] [Updated] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4460: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks Eugene for the contribution! Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.12.0 Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4460.2.patch, HIVE-4460.3.patch, HIVE-4460.4.patch, HIVE-4460.patch Original Estimate: 72h Time Spent: 40h 40m Remaining Estimate: 31h 20m HCatalog artifacts are only published for the Hadoop 1.x version. As more projects add HCatalog integration, HCatalog artifacts are needed for the Hadoop versions supported by those products, so that automated builds that target different Hadoop releases can be built successfully. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds with Hadoop 1.x and 2.x releases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5184) Load filesystem, ugi, metastore client at tez session startup
Gunther Hagleitner created HIVE-5184: Summary: Load filesystem, ugi, metastore client at tez session startup Key: HIVE-5184 URL: https://issues.apache.org/jira/browse/HIVE-5184 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
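The warm-up being described is eager initialization: touch the expensive pieces (filesystem, UGI, metastore client) when the session starts so the first query does not pay their startup cost. A minimal illustrative sketch, not the attached patch:
{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.security.UserGroupInformation;

public class TezSessionWarmup {
    private HiveMetaStoreClient metastoreClient;

    // Called once at session startup so later queries find everything hot.
    public void warmUp(HiveConf conf) throws Exception {
        FileSystem.get(conf);                  // populate the FileSystem cache
        UserGroupInformation.getCurrentUser(); // force UGI/login initialization
        metastoreClient = new HiveMetaStoreClient(conf); // open the connection now
    }
}
{code}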
[jira] [Updated] (HIVE-5184) Load filesystem, ugi, metastore client at tez session startup
[ https://issues.apache.org/jira/browse/HIVE-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5184: - Description: Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. NO PRECOMMIT TESTS (this is wip for the tez branch) was:Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. Load filesystem, ugi, metastore client at tez session startup - Key: HIVE-5184 URL: https://issues.apache.org/jira/browse/HIVE-5184 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5184) Load filesystem, ugi, metastore client at tez session startup
[ https://issues.apache.org/jira/browse/HIVE-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-5184: - Attachment: HIVE-5184.1.patch Load filesystem, ugi, metastore client at tez session startup - Key: HIVE-5184 URL: https://issues.apache.org/jira/browse/HIVE-5184 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-5184.1.patch Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-5184) Load filesystem, ugi, metastore client at tez session startup
[ https://issues.apache.org/jira/browse/HIVE-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-5184. -- Resolution: Fixed Committed to branch. Load filesystem, ugi, metastore client at tez session startup - Key: HIVE-5184 URL: https://issues.apache.org/jira/browse/HIVE-5184 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-5184.1.patch Make sure the session is ready to go when we connect. That way once the session/connection is open, we're ready to go. NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-5183) Tez EdgeProperty class has changed
[ https://issues.apache.org/jira/browse/HIVE-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-5183. -- Resolution: Fixed Committed to branch. Tez EdgeProperty class has changed -- Key: HIVE-5183 URL: https://issues.apache.org/jira/browse/HIVE-5183 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch Attachments: HIVE-5183.1.patch Tez has changed the names of its EdgeProperties. Need to update the code to use the new names. NO PRECOMMIT TESTS (this is wip for the tez branch) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754474#comment-13754474 ] Sean Busbey commented on HIVE-4789: --- Yeah, sure. I have the changes isolated and am waiting for a full unit test run so I can figure out whether I need to update any .out files. Not sure yet how to phrase the follow-on ticket title, since I don't have a failing test yet, though I think I just need to get the SimpleFetchOptimizer to go into aggressive mode. FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt HIVE-3953 fixed using partitioned avro tables for anything that used the MapOperator, but those that rely on FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4844) Add char/varchar data types
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754496#comment-13754496 ] Hive QA commented on HIVE-4844: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600709/HIVE-4844.11.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 2918 tests executed *Failed tests:* {noformat} org.apache.hcatalog.pig.TestHCatStorerMulti.testStoreBasicTable org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/569/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/569/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. Add char/varchar data types --- Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, screenshot.png Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4959) Vectorized plan generation should be added as an optimization transform.
[ https://issues.apache.org/jira/browse/HIVE-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754499#comment-13754499 ] Hive QA commented on HIVE-4959: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600713/HIVE-4959.3.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/570/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/570/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-570/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java' Reverted 'jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/JdbcColumn.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/Utils.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveMetaDataResultSet.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveResultSetMetaData.java' Reverted 'jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveQueryResultSet.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/Utils.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/HiveResultSetMetaData.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java' Reverted 'metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java' Reverted 'data/files/datatypes.txt' Reverted 'contrib/src/java/org/apache/hadoop/hive/contrib/util/typedbytes/TypedBytesRecordReader.java' Reverted 'service/src/java/org/apache/hive/service/cli/TypeDescriptor.java' Reverted 'service/src/java/org/apache/hive/service/cli/Type.java' Reverted 'service/src/java/org/apache/hive/service/cli/ColumnValue.java' Reverted 'service/src/java/org/apache/hive/service/cli/ColumnDescriptor.java' Reverted 'service/src/gen/thrift/gen-py/TCLIService/ttypes.py' Reverted 'service/src/gen/thrift/gen-py/TCLIService/constants.py' Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_types.cpp' Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_types.h' Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_constants.cpp' Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_constants.h' Reverted 'service/src/gen/thrift/gen-rb/t_c_l_i_service_constants.rb' Reverted 'service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/ThriftHive.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeDesc.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java' Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIServiceConstants.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStructTypeEntry.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TPrimitiveTypeEntry.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TUnionTypeEntry.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionReq.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatus.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TColumn.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TExecuteStatementReq.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTableSchema.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTablesReq.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeId.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TOpenSessionResp.java' Reverted 'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRow.java' Reverted 'service/if/TCLIService.thrift' Reverted
[jira] [Commented] (HIVE-3976) Support specifying scale and precision with Hive decimal type
[ https://issues.apache.org/jira/browse/HIVE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754523#comment-13754523 ] Jason Dere commented on HIVE-3976: -- One note about propagating precision/scale throughout the expressions, especially if we want to have them available through jdbc/odbc. All of the add/sub/mult/div operations are implemented as old-style UDFs, which is a bit problematic. The old-style UDFs use reflection to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. The way this is being done, we cannot customize the precision/scale of the TypeInfo representing the result - the resulting TypeInfo would just be the default decimal type with no parameters, and whatever default precision/scale information comes with that type. So if you want the type metadata to have the correctly set precision/scale, all of the arithmetic operators would need to be redone as GenericUDFs, which allow you to customize the return type ObjectInspector during the initialize() method. I had to do the same thing with a few string UDFs to get the varchar length reported back correctly in the TypeInfos. Support specifying scale and precision with Hive decimal type - Key: HIVE-3976 URL: https://issues.apache.org/jira/browse/HIVE-3976 Project: Hive Issue Type: Improvement Components: Query Processor, Types Reporter: Mark Grover Assignee: Xuefu Zhang Attachments: remove_prec_scale.diff HIVE-2693 introduced support for Decimal datatype in Hive. However, the current implementation has unlimited precision and provides no way to specify precision and scale when creating the table. For example, MySQL allows users to specify scale and precision of the decimal datatype when creating the table: {code} CREATE TABLE numbers (a DECIMAL(20,2)); {code} Hive should support something similar too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
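To make the contrast concrete, here is a rough sketch of the GenericUDF pattern described above, where initialize() builds a return-type ObjectInspector carrying explicit precision/scale. The class name is made up, the (20, 2) values are placeholders, and the getDecimalTypeInfo factory stands for the parameterized decimal TypeInfo API this issue proposes:
{code:java}
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

public class GenericUDFDecimalAdd extends GenericUDF {
    @Override
    public ObjectInspector initialize(ObjectInspector[] args)
            throws UDFArgumentException {
        // Unlike reflection on evaluate()'s return type, this hook lets the
        // UDF derive and report the result's precision/scale; (20, 2) is a
        // placeholder for arithmetic derived from the argument types.
        return PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(
                TypeInfoFactory.getDecimalTypeInfo(20, 2));
    }

    @Override
    public Object evaluate(DeferredObject[] args) throws HiveException {
        return null; // arithmetic elided; only the typing pattern matters here
    }

    @Override
    public String getDisplayString(String[] children) {
        return "decimal_add(" + children[0] + ", " + children[1] + ")";
    }
}
{code}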
[jira] [Commented] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows
[ https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754555#comment-13754555 ] Hive QA commented on HIVE-5176: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600728/HIVE-5176.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 2902 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2_hadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape2 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/571/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/571/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. Wincompat : Changes for allowing various path compatibilities with Windows -- Key: HIVE-5176 URL: https://issues.apache.org/jira/browse/HIVE-5176 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5176.patch We need to make certain changes across the board to allow us to read/parse Windows paths. Some are escaping changes; some involve being stricter about how we read paths (through URL.encode/decode, etc.). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
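As an illustration of the path-compatibility class of problem (this is not code from the patch): a raw Windows path contains characters that naive URI handling mangles, while an encode/decode round-trip preserves them. The example path is made up:
{code:java}
import java.net.URLDecoder;
import java.net.URLEncoder;

public class WindowsPathRoundTrip {
    public static void main(String[] args) throws Exception {
        // Drive letter, backslashes, and a space: all need escaping before
        // the path can travel through URI-based plumbing.
        String winPath = "C:\\Users\\hive\\warehouse\\part=a b";
        String encoded = URLEncoder.encode(winPath, "UTF-8");
        System.out.println(encoded); // C%3A%5CUsers%5Chive%5Cwarehouse%5Cpart%3Da+b
        System.out.println(URLDecoder.decode(encoded, "UTF-8")); // original back
    }
}
{code}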
[jira] [Commented] (HIVE-5177) Wincompat : Retrying handler related changes
[ https://issues.apache.org/jira/browse/HIVE-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754620#comment-13754620 ] Hive QA commented on HIVE-5177: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600735/HIVE-5177.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2902 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/572/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/572/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. Wincompat : Retrying handler related changes Key: HIVE-5177 URL: https://issues.apache.org/jira/browse/HIVE-5177 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5177.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5029) direct SQL perf optimization cannot be tested well
[ https://issues.apache.org/jira/browse/HIVE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5029: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Sergey! direct SQL perf optimization cannot be tested well -- Key: HIVE-5029 URL: https://issues.apache.org/jira/browse/HIVE-5029 Project: Hive Issue Type: Test Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.12.0 Attachments: HIVE-5029.D12483.1.patch, HIVE-5029.D12483.2.patch, HIVE-5029.patch, HIVE-5029.patch HIVE-4051 introduced a perf optimization that involves getting partitions directly via SQL in the metastore. Given that SQL queries might not work on all datastores (and will not work on non-SQL ones), a JDO fallback is in place. Given that the perf improvement is very large for short queries, it's on by default. However, there's a problem with tests with regard to that. If the SQL code is broken, tests may fall back to JDO and pass. If the JDO code is broken, SQL might allow tests to pass. We are going to disable SQL by default until the testing problem is resolved. There are several possible solutions: 1) Separate build for this setting. Seems like overkill... 2) Enable by default; disable by default in tests, create a clone of TestCliDriver with a subset of queries that will exercise the SQL path. 3) Have some sort of test hook inside the metastore that will run both ORM and SQL and compare. 3') Or make a subclass of ObjectStore that will do that. ObjectStore is already pluggable. 4) Write unit tests for one of the modes (JDO, as non-default?) and declare that they are sufficient; disable fallback in tests. 3' seems like the easiest. For now we will disable SQL by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
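Option 3' boils down to a generic run-both-and-compare pattern. A minimal, self-contained sketch of the idea follows; the callables stand in for the metastore's direct-SQL and JDO retrieval paths, and none of these names come from the committed patch:
{code:java}
import java.util.List;
import java.util.concurrent.Callable;

public class VerifyingFetch {
    // Evaluate both code paths and fail loudly on any mismatch, so a
    // regression in either path breaks tests instead of silently falling
    // back to the other one.
    public static <T> List<T> fetchAndVerify(Callable<List<T>> directSql,
                                             Callable<List<T>> jdo)
            throws Exception {
        List<T> sqlResult = directSql.call();
        List<T> jdoResult = jdo.call();
        if (!sqlResult.equals(jdoResult)) {
            throw new AssertionError("direct SQL and JDO disagree: "
                    + sqlResult + " vs " + jdoResult);
        }
        return sqlResult;
    }
}
{code}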
[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754651#comment-13754651 ] Hudson commented on HIVE-5091: -- ABORTED: Integrated in Hive-trunk-hadoop2 #389 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/389/]) HIVE-5091: ORC files should have an option to pad stripes to the HDFS block boundaries (Owen O'Malley via Gunther Hagleitner) (gunther: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518830) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java * /hive/trunk/ql/src/test/resources/orc-file-dump.out ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5091.D12249.1.patch, HIVE-5091.D12249.2.patch, HIVE-5091.D12249.3.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
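The padding decision described here is simple offset arithmetic. An illustrative sketch (not the committed writer code): if the estimated stripe would straddle the next HDFS block boundary, first pad the file out to that boundary so the whole stripe stays block-local:
{code:java}
public class StripePadding {
    // Returns how many bytes of padding to write before the next stripe.
    static long padToBlockBoundary(long currentOffset, long stripeSize,
                                   long blockSize) {
        long remainingInBlock = blockSize - (currentOffset % blockSize);
        // Pad only if the stripe would not fit in the current block and we
        // are not already sitting exactly on a block boundary.
        if (remainingInBlock < stripeSize && remainingInBlock < blockSize) {
            return remainingInBlock;
        }
        return 0;
    }

    public static void main(String[] args) {
        // 200MB already written, 256MB stripes, 256MB blocks:
        // 56MB of zero padding aligns the next stripe with a block boundary.
        System.out.println(padToBlockBoundary(200L << 20, 256L << 20, 256L << 20));
    }
}
{code}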
[jira] [Commented] (HIVE-4964) Cleanup PTF code: remove code dealing with non standard sql behavior we had originally introduced
[ https://issues.apache.org/jira/browse/HIVE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754650#comment-13754650 ] Hudson commented on HIVE-4964: -- ABORTED: Integrated in Hive-trunk-hadoop2 #389 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/389/]) HIVE-4964 : Cleanup PTF code: remove code dealing with non standard sql behavior we had originally introduced (Harish Butani via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518680) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDesc.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDeserializer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java Cleanup PTF code: remove code dealing with non standard sql behavior we had originally introduced --- Key: HIVE-4964 URL: https://issues.apache.org/jira/browse/HIVE-4964 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4964.D11985.1.patch, HIVE-4964.D11985.2.patch, HIVE-4964.D12585.1.patch There are still pieces of code that deal with: - supporting select expressions with Windowing - supporting a filter with windowing Need to do this before introducing Perf. improvements. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4914) filtering via partition name should be done inside metastore server (implementation)
[ https://issues.apache.org/jira/browse/HIVE-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754685#comment-13754685 ] Phabricator commented on HIVE-4914: --- ashutoshc has requested changes to the revision HIVE-4914 [jira] filtering via partition name should be done inside metastore server (implementation). Comments. INLINE COMMENTS metastore/if/hive_metastore.thrift:282 Can you add a comment here defining this boolean? metastore/if/hive_metastore.thrift:510 Instead of a list of parameters, can you define a struct which is passed in as an argument. That way, if we need to add another parameter for this function in the future, it will still be back-compat. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:252 Consider using MetaStoreUtils::newInstance() for this. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:249 Can you add a comment here saying this class is created via reflection to avoid a circular dependency on the ql package? metastore/if/hive_metastore.thrift:281 SetPartition instead of listPartition ? ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:569 Why not enhance the existing deserializeExpressions() to allow it to throw an exception? Or, at least reuse the common code. metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:494 Add a description of @param FilterBuilder in the javadoc here. metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:445 Update javadoc with new param. metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:418 Update javadoc. metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:113 ExpressionTree is getting too large. Better to put this class and FilterBuilder in another file? metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1887 With query.setRange() this is no longer required. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1944 Didn't get this. With this patch, the Hive client won't do any work, right? metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1971 This TODO is important to resolve. Can you follow up on this? metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1965 Is this just for tests? Or is it needed? Either way, can you add a comment for it. metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:5756 This class is getting too large. It may be a good idea to put some of the helper inner classes and methods in the MetaStoreUtils class. metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:496 This class is also growing in size. Probably put it in a separate file along with TreeVisitor. REVISION DETAIL https://reviews.facebook.net/D12561 BRANCH HIVE-4914-no-gen ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe filtering via partition name should be done inside metastore server (implementation) Key: HIVE-4914 URL: https://issues.apache.org/jira/browse/HIVE-4914 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4914.01.patch, HIVE-4914.D12561.1.patch, HIVE-4914-only-no-gen.patch, HIVE-4914-only.patch, HIVE-4914.patch, HIVE-4914.patch Currently, if the filter pushdown is impossible (which is most cases), the client gets all partition names from metastore, filters them, and asks for partitions by names for the filtered set. 
Metastore server code should do that instead; it should check if pushdown is possible and do it if so; otherwise it should do name-based filtering. This saves the round trip that ships all partition names from the server to the client, and also removes the need for pushdown viability checking on both sides. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
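The struct-versus-parameter-list review comment above is a general evolvability point: a Thrift struct (or any request object) can grow optional fields without breaking existing callers, while adding a positional parameter changes the method signature. A hypothetical Java rendering of the idea (none of these names come from hive_metastore.thrift):
{code:java}
import java.util.List;

// Hypothetical request object: new optional fields can be appended later
// (e.g. maxParts arriving after the fact) without touching existing callers.
class PartitionsByExprRequest {
    String dbName;
    String tableName;
    byte[] serializedExpr;
    short maxParts = -1; // later addition; old callers never set it

    PartitionsByExprRequest(String dbName, String tableName, byte[] expr) {
        this.dbName = dbName;
        this.tableName = tableName;
        this.serializedExpr = expr;
    }
}

interface MetastoreClientSketch {
    // Evolvable: grow the request object instead of the parameter list.
    List<String> getPartitionNamesByExpr(PartitionsByExprRequest req);
}
{code}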
Re: Review Request 12100: Patch to fix HIVE-4789
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12100/ --- (Updated Aug. 30, 2013, 1:46 p.m.) Review request for hive, Ashutosh Chauhan, Jakob Homan, and Mark Wagner. Changes --- Per feedback, breaking MetaStoreUtils changes into another ticket. Repository: hive Description --- HIVE-3953 fixed using partitioned avro tables for anything that used the MapOperator, but those that rely on FetchOperator still fail with the same error. e.g. SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; Diffs (updated) - trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 1518830 trunk/ql/src/test/queries/clientpositive/avro_partitioned.q 1518830 trunk/ql/src/test/results/clientpositive/avro_partitioned.q.out 1518830 Diff: https://reviews.apache.org/r/12100/diff/ Testing --- reran avro partition unit tests and partition_wise_fileformat*.q Thanks, Sean Busbey
[jira] [Updated] (HIVE-5172) TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server
[ https://issues.apache.org/jira/browse/HIVE-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] agate updated HIVE-5172: Affects Version/s: 0.10.0 0.11.0 TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server - Key: HIVE-5172 URL: https://issues.apache.org/jira/browse/HIVE-5172 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.0, 0.10.0, 0.11.0 Reporter: agate Attachments: HIVE-5172.1.patch.txt We are running into a frequent problem using HCatalog 0.4.1 (Hive Metastore Server 0.9) where we get connection reset or connection timeout errors on the client and a NullPointerException in TUGIBasedProcessor on the server. {code} hive client logs: = org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2136) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2122) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:286) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:197) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:157) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:830) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:954) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7524) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) 
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 31 more {code} {code} hive metastore server logs: === 2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message. java.lang.NullPointerException at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:79) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {code} After adding some extra debug log messages in TUGIBasedProcessor, we noticed that the TUGIContainingTransport is null, which
[jira] [Updated] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HIVE-4789: -- Attachment: HIVE-4789.3.patch.txt Modified patch with just the FetchOperator changes. [reviewboard updated|https://reviews.apache.org/r/12100/] FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt, HIVE-4789.3.patch.txt HIVE-3953 fixed using partitioned avro tables for anything that used the MapOperator, but those that rely on FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754700#comment-13754700 ] Hudson commented on HIVE-4460: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #77 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/77/]) Add missing files from - HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518911) * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/shims/HCatHadoopShims.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hcatalog/shims/HCatHadoopShims20S.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hcatalog/shims/HCatHadoopShims23.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred/WebHCatJTShim23.java HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518897) * /hive/trunk/hcatalog/build-support/ant/deploy.xml * /hive/trunk/hcatalog/build.properties * /hive/trunk/hcatalog/build.xml * /hive/trunk/hcatalog/core/build.xml * /hive/trunk/hcatalog/core/src/main/java/org/apache/hadoop/mapred/HCatMapRedUtil.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatInputFormatReader.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatOutputFormatWriter.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/FileOutputCommitterContainer.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/MultiOutputFormat.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/Security.java * /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/rcfile/TestRCFileMapReduceInputFormat.java * /hive/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java * /hive/trunk/hcatalog/webhcat/svr/pom.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/DeleteDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ListDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/StatusDelegator.java * /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.12.0 Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4460.2.patch, HIVE-4460.3.patch, HIVE-4460.4.patch, HIVE-4460.patch Original Estimate: 72h Time Spent: 40h 40m Remaining Estimate: 31h 20m HCatalog artifacts are only published for the Hadoop 1.x version. 
As more projects add HCatalog integration, HCatalog artifacts are needed for the Hadoop versions supported by those products, so that automated builds that target different Hadoop releases can be built successfully. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds with Hadoop 1.x and 2.x releases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4601) WebHCat needs to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754702#comment-13754702 ] Hudson commented on HIVE-4601: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #77 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/77/]) Add missing files from - HIVE-4601 : WebHCat needs to support proxy users (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518912) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/doas.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ProxyUserSupport.java HIVE-4601 : WebHCat needs to support proxy users (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518894) * /hive/trunk/hcatalog/src/docs/src/documentation/content/xdocs/configuration.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/README.txt * /hive/trunk/hcatalog/src/test/e2e/templeton/build.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm * /hive/trunk/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/AppConfig.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/LauncherDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Server.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/UgiFactory.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/HDFSStorage.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/NotFoundException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/TempletonUtils.java WebHCat needs to support proxy users Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the doAs argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but does not work for Templeton, as Templeton does not support proxy users. Hence, the request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754701#comment-13754701 ] Hudson commented on HIVE-5091: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #77 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/77/]) HIVE-5091: ORC files should have an option to pad stripes to the HDFS block boundaries (Owen O'Malley via Gunther Hagleitner) (gunther: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518830) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java * /hive/trunk/ql/src/test/resources/orc-file-dump.out ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5091.D12249.1.patch, HIVE-5091.D12249.2.patch, HIVE-5091.D12249.3.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4601) WebHCat needs to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754708#comment-13754708 ] Hudson commented on HIVE-4601: -- FAILURE: Integrated in Hive-trunk-hadoop2 #390 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/390/]) Add missing files from - HIVE-4601 : WebHCat needs to support proxy users (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518912) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/doas.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ProxyUserSupport.java HIVE-4601 : WebHCat needs to support proxy users (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518894) * /hive/trunk/hcatalog/src/docs/src/documentation/content/xdocs/configuration.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/README.txt * /hive/trunk/hcatalog/src/test/e2e/templeton/build.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm * /hive/trunk/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/AppConfig.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/LauncherDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Server.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/UgiFactory.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/HDFSStorage.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/NotFoundException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/TempletonUtils.java WebHCat needs to support proxy users Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to a secure Hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with HTTP Basic and would assert the end user identity as the doAs argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie, but does not work for Templeton, as Templeton does not support proxy users. Hence, the request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754707#comment-13754707 ] Hudson commented on HIVE-4460: -- FAILURE: Integrated in Hive-trunk-hadoop2 #390 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/390/]) Add missing files from - HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518911) * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/shims/HCatHadoopShims.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hcatalog/shims/HCatHadoopShims20S.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hcatalog/shims/HCatHadoopShims23.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred/WebHCatJTShim23.java HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518897) * /hive/trunk/hcatalog/build-support/ant/deploy.xml * /hive/trunk/hcatalog/build.properties * /hive/trunk/hcatalog/build.xml * /hive/trunk/hcatalog/core/build.xml * /hive/trunk/hcatalog/core/src/main/java/org/apache/hadoop/mapred/HCatMapRedUtil.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatInputFormatReader.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatOutputFormatWriter.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/FileOutputCommitterContainer.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/MultiOutputFormat.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/Security.java * /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/rcfile/TestRCFileMapReduceInputFormat.java * /hive/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java * /hive/trunk/hcatalog/webhcat/svr/pom.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/DeleteDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ListDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/StatusDelegator.java * /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.12.0 Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4460.2.patch, HIVE-4460.3.patch, HIVE-4460.4.patch, HIVE-4460.patch Original Estimate: 72h Time Spent: 40h 40m Remaining Estimate: 31h 20m HCatalog artifacts are only published for the Hadoop 1.x version. 
As more projects add HCatalog integration, HCatalog artifacts need to be published for the Hadoop versions supported by the product so that automated builds targeting different Hadoop releases can succeed. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds against both Hadoop 1.x and 2.x releases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5029) direct SQL perf optimization cannot be tested well
[ https://issues.apache.org/jira/browse/HIVE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754706#comment-13754706 ] Hudson commented on HIVE-5029: -- FAILURE: Integrated in Hive-trunk-hadoop2 #390 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/390/]) HIVE-5029 : direct SQL perf optimization cannot be tested well (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518953) * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java * /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java direct SQL perf optimization cannot be tested well -- Key: HIVE-5029 URL: https://issues.apache.org/jira/browse/HIVE-5029 Project: Hive Issue Type: Test Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.12.0 Attachments: HIVE-5029.D12483.1.patch, HIVE-5029.D12483.2.patch, HIVE-5029.patch, HIVE-5029.patch HIVE-4051 introduced a perf optimization that involves getting partitions directly via SQL in the metastore. Given that SQL queries might not work on all datastores (and will not work on non-SQL ones), a JDO fallback is in place. Given that the perf improvement is very large for short queries, it's on by default. However, there's a problem with tests with regard to that. If the SQL code is broken, tests may fall back to JDO and pass. If the JDO code is broken, SQL might allow tests to pass. We are going to disable SQL by default before the testing problem is resolved. There are several possible solutions: 1) Separate build for this setting. Seems like overkill... 2) Enable by default; disable by default in tests, create a clone of TestCliDriver with a subset of queries that will exercise the SQL path. 3) Have some sort of test hook inside the metastore that will run both ORM and SQL and compare. 3') Or make a subclass of ObjectStore that will do that. ObjectStore is already pluggable. 4) Write unit tests for one of the modes (JDO, as non-default?) and declare that they are sufficient; disable fallback in tests. 3' seems like the easiest. For now we will disable SQL by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
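Option 3' is the approach the commit above appears to adopt, given the new VerifyingObjectStore.java in its file list. A minimal sketch of the run-both-and-compare idea, with hypothetical names rather than Hive's actual API: {code:java}
import java.util.List;

// Sketch only, not the actual VerifyingObjectStore: run the direct-SQL path
// and the JDO path, compare the results, and fail loudly on any divergence,
// so a breakage in either path fails the test instead of silently passing.
final class VerifyingStoreSketch {
  interface Fetch<T> {
    List<T> run() throws Exception;
  }

  static <T> List<T> fetchAndVerify(Fetch<T> directSql, Fetch<T> jdo)
      throws Exception {
    List<T> sqlResults = directSql.run(); // fast path under test
    List<T> jdoResults = jdo.run();       // ORM fallback path
    if (!sqlResults.equals(jdoResults)) {
      throw new IllegalStateException("direct SQL and JDO results differ: "
          + sqlResults.size() + " vs " + jdoResults.size() + " rows");
    }
    return sqlResults;
  }
}
{code}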
[jira] [Updated] (HIVE-5185) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless
[ https://issues.apache.org/jira/browse/HIVE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5185: --- Description: The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. was: The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5185) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless
[ https://issues.apache.org/jira/browse/HIVE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5185: --- Description: The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {/code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {/code} was: The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {/code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {/code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5185) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless
[ https://issues.apache.org/jira/browse/HIVE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5185: --- Description: The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} DROP TABLE part; -- data setup CREATE TABLE part( p_partkey INT, p_name STRING, p_mfgr STRING, p_brand STRING, p_type STRING, p_size INT, p_container STRING, p_retailprice DOUBLE, p_comment STRING ); select p_name from (select p_name from part distribute by 1 sort by 1) p distribute by 1 sort by 1 ; {code} was: The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {\code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {\code} test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} DROP TABLE part; -- data setup CREATE TABLE part( p_partkey INT, p_name STRING, p_mfgr STRING, p_brand STRING, p_type STRING, p_size INT, p_container STRING, p_retailprice DOUBLE, p_comment STRING ); select p_name from (select p_name from part distribute by 1 sort by 1) p distribute by 1 sort by 1 ; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5185) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless
[ https://issues.apache.org/jira/browse/HIVE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5185: --- Description: The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {\code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {\code} was: The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {/code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {/code} test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file reduce_deduplicate_exclude_gby.q contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {\code} Since the table is not populated, no results will appear in the .out file. The same thing happens in reducesink-dedup.q {code:sql} {\code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5186) Remove JoinReducerProc from ReduceSinkDeDuplication
Yin Huai created HIVE-5186: -- Summary: Remove JoinReducerProc from ReduceSinkDeDuplication Key: HIVE-5186 URL: https://issues.apache.org/jira/browse/HIVE-5186 Project: Hive Issue Type: Improvement Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The Correlation Optimizer will take care of patterns involving the JoinOperator. We can remove JoinReducerProc from ReduceSinkDeDuplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754755#comment-13754755 ] Hudson commented on HIVE-5091: -- SUCCESS: Integrated in Hive-trunk-h0.21 #2298 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2298/]) HIVE-5091: ORC files should have an option to pad stripes to the HDFS block boundaries (Owen O'Malley via Gunther Hagleitner) (gunther: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518830) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java * /hive/trunk/ql/src/test/resources/orc-file-dump.out ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5091.D12249.1.patch, HIVE-5091.D12249.2.patch, HIVE-5091.D12249.3.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
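The padding arithmetic itself is simple. A rough sketch under stated assumptions (the writer knows its current file offset and the HDFS block size; class and method names are hypothetical, and the real WriterImpl also limits how much padding it is willing to waste): {code:java}
// Simplified sketch of stripe padding, not the actual WriterImpl logic.
final class StripePaddingSketch {
  // Bytes of zero padding to emit so the next stripe does not straddle an
  // HDFS block boundary, letting one datanode serve the whole stripe locally.
  static long paddingBefore(long currentOffset, long stripeSize, long blockSize) {
    long remainingInBlock = blockSize - (currentOffset % blockSize);
    if (stripeSize <= remainingInBlock) {
      return 0; // the stripe fits in the current block; no padding needed
    }
    return remainingInBlock; // pad to the next block boundary
  }
}
{code} For example, with 256 MB blocks and a writer at offset 250 MB, a 100 MB stripe would be preceded by 6 MB of padding so that it starts on the next block boundary.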
[jira] [Created] (HIVE-5185) test query file reduce_deduplicate_exclude_gby.q is useless
Yin Huai created HIVE-5185: -- Summary: test query file reduce_deduplicate_exclude_gby.q is useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5186) Remove JoinReducerProc from ReduceSinkDeDuplication
[ https://issues.apache.org/jira/browse/HIVE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5186: --- Attachment: HIVE-5186.1.patch.txt Remove JoinReducerProc from ReduceSinkDeDuplication --- Key: HIVE-5186 URL: https://issues.apache.org/jira/browse/HIVE-5186 Project: Hive Issue Type: Improvement Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Attachments: HIVE-5186.1.patch.txt The Correlation Optimizer will take care of patterns involving the JoinOperator. We can remove JoinReducerProc from ReduceSinkDeDuplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 13862: [HIVE-5149] ReduceSinkDeDuplication can pick the wrong partitioning columns
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13862/ --- (Updated Aug. 30, 2013, 3:29 p.m.) Review request for hive. Summary (updated) - [HIVE-5149] ReduceSinkDeDuplication can pick the wrong partitioning columns Bugs: HIVE-5149 https://issues.apache.org/jira/browse/HIVE-5149 Repository: hive-git Description --- https://mail-archives.apache.org/mod_mbox/hive-user/201308.mbox/%3CCAG6Lhyex5XPwszpihKqkPRpzri2k=m4qgc+cpar5yvr8sjt...@mail.gmail.com%3E Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java c380a2d ql/src/test/results/clientpositive/groupby2_map_skew.q.out da7a128 ql/src/test/results/clientpositive/groupby_cube1.q.out a52f4eb ql/src/test/results/clientpositive/groupby_rollup1.q.out f120471 ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 3297ebb Diff: https://reviews.apache.org/r/13862/diff/ Testing --- Thanks, Yin Huai
[jira] [Updated] (HIVE-5186) Remove JoinReducerProc from ReduceSinkDeDuplication
[ https://issues.apache.org/jira/browse/HIVE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5186: --- Status: Patch Available (was: Open) Remove JoinReducerProc from ReduceSinkDeDuplication --- Key: HIVE-5186 URL: https://issues.apache.org/jira/browse/HIVE-5186 Project: Hive Issue Type: Improvement Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Attachments: HIVE-5186.1.patch.txt The Correlation Optimizer will take care of patterns involving the JoinOperator. We can remove JoinReducerProc from ReduceSinkDeDuplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754910#comment-13754910 ] Hive QA commented on HIVE-4789: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600784/HIVE-4789.3.patch.txt {color:green}SUCCESS:{color} +1 2902 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/573/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/573/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt, HIVE-4789.3.patch.txt HIVE-3953 fixed the use of partitioned Avro tables for anything that uses the MapOperator, but code paths that rely on the FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5187) Enhance explain to indicate vectorized execution of operators.
Jitendra Nath Pandey created HIVE-5187: -- Summary: Enhance explain to indicate vectorized execution of operators. Key: HIVE-5187 URL: https://issues.apache.org/jira/browse/HIVE-5187 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Explain should be able to indicate whether an operator will be executed in vectorized mode or not. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request 13910: [HIVE-5186] Remove JoinReducerProc from ReduceSinkDeDuplication
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13910/ --- Review request for hive. Bugs: HIVE-5186 https://issues.apache.org/jira/browse/HIVE-5186 Repository: hive-git Description --- The Correlation Optimizer will take care of patterns involving the JoinOperator. We can remove JoinReducerProc from ReduceSinkDeDuplication. Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java c380a2d ql/src/test/queries/clientpositive/reduce_deduplicate_extended.q a5e9cdf ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 3297ebb Diff: https://reviews.apache.org/r/13910/diff/ Testing --- Thanks, Yin Huai
[jira] [Commented] (HIVE-5029) direct SQL perf optimization cannot be tested well
[ https://issues.apache.org/jira/browse/HIVE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754792#comment-13754792 ] Hudson commented on HIVE-5029: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #145 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/145/]) HIVE-5029 : direct SQL perf optimization cannot be tested well (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518953) * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java * /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java direct SQL perf optimization cannot be tested well -- Key: HIVE-5029 URL: https://issues.apache.org/jira/browse/HIVE-5029 Project: Hive Issue Type: Test Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.12.0 Attachments: HIVE-5029.D12483.1.patch, HIVE-5029.D12483.2.patch, HIVE-5029.patch, HIVE-5029.patch HIVE-4051 introduced a perf optimization that involves getting partitions directly via SQL in the metastore. Given that SQL queries might not work on all datastores (and will not work on non-SQL ones), a JDO fallback is in place. Given that the perf improvement is very large for short queries, it's on by default. However, there's a problem with tests with regard to that. If the SQL code is broken, tests may fall back to JDO and pass. If the JDO code is broken, SQL might allow tests to pass. We are going to disable SQL by default before the testing problem is resolved. There are several possible solutions: 1) Separate build for this setting. Seems like overkill... 2) Enable by default; disable by default in tests, create a clone of TestCliDriver with a subset of queries that will exercise the SQL path. 3) Have some sort of test hook inside the metastore that will run both ORM and SQL and compare. 3') Or make a subclass of ObjectStore that will do that. ObjectStore is already pluggable. 4) Write unit tests for one of the modes (JDO, as non-default?) and declare that they are sufficient; disable fallback in tests. 3' seems like the easiest. For now we will disable SQL by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754793#comment-13754793 ] Hudson commented on HIVE-4460: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #145 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/145/]) Add missing files from - HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518911) * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/shims/HCatHadoopShims.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/20/java/org/apache/hcatalog/shims/HCatHadoopShims20S.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hadoop/mapred/TempletonJobTracker.java * /hive/trunk/hcatalog/shims/src/23/java/org/apache/hcatalog/shims/HCatHadoopShims23.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/mapred/WebHCatJTShim23.java HIVE-4460 : Publish HCatalog artifacts for Hadoop 2.x (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518897) * /hive/trunk/hcatalog/build-support/ant/deploy.xml * /hive/trunk/hcatalog/build.properties * /hive/trunk/hcatalog/build.xml * /hive/trunk/hcatalog/core/build.xml * /hive/trunk/hcatalog/core/src/main/java/org/apache/hadoop/mapred/HCatMapRedUtil.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatInputFormatReader.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/transfer/impl/HCatOutputFormatWriter.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/FileOutputCommitterContainer.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/MultiOutputFormat.java * /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/Security.java * /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/rcfile/TestRCFileMapReduceInputFormat.java * /hive/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java * /hive/trunk/hcatalog/webhcat/svr/pom.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/DeleteDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ListDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/StatusDelegator.java * /hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java * /hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java * /hive/trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.12.0 Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4460.2.patch, HIVE-4460.3.patch, HIVE-4460.4.patch, HIVE-4460.patch Original Estimate: 72h Time Spent: 40h 40m Remaining Estimate: 31h 20m HCatalog artifacts are only published for Hadoop 1.x version. 
As more projects add HCatalog integration, HCatalog artifacts need to be published for the Hadoop versions supported by the product so that automated builds targeting different Hadoop releases can succeed. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds against both Hadoop 1.x and 2.x releases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4601) WebHCat needs to support proxy users
[ https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754795#comment-13754795 ] Hudson commented on HIVE-4601: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #145 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/145/]) Add missing files from - HIVE-4601 : WebHCat needs to support proxy users (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518912) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/doas.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/ProxyUserSupport.java HIVE-4601 : WebHCat needs to support proxy users (Eugene Koifman via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518894) * /hive/trunk/hcatalog/src/docs/src/documentation/content/xdocs/configuration.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/README.txt * /hive/trunk/hcatalog/src/test/e2e/templeton/build.xml * /hive/trunk/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm * /hive/trunk/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/AppConfig.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/LauncherDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Server.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/UgiFactory.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/HDFSStorage.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/NotFoundException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/TempletonUtils.java WebHCat needs to support proxy users Key: HIVE-4601 URL: https://issues.apache.org/jira/browse/HIVE-4601 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0 Reporter: Dilli Arumugam Assignee: Eugene Koifman Labels: proxy, templeton Fix For: 0.12.0 Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.4.patch, HIVE-4601.5.patch, HIVE-4601.patch We have a use case where a Gateway would provide unified and controlled access to secure hadoop cluster. The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton with SPNego. The Gateway would authenticate the end user with http basic and would assert the end user identity as douser argument in the calls to downstream WebHDFS, Oozie and Templeton. This works fine with WebHDFS and Oozie. But, does not work for Templeton as Templeton does not support proxy users. Hence, request to add this improvement to Templeton. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4789) FetchOperator fails on partitioned Avro data
[ https://issues.apache.org/jira/browse/HIVE-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754920#comment-13754920 ] Ashutosh Chauhan commented on HIVE-4789: +1 FetchOperator fails on partitioned Avro data Key: HIVE-4789 URL: https://issues.apache.org/jira/browse/HIVE-4789 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Attachments: HIVE-4789.1.patch.txt, HIVE-4789.2.patch.txt, HIVE-4789.3.patch.txt HIVE-3953 fixed the use of partitioned Avro tables for anything that uses the MapOperator, but code paths that rely on the FetchOperator still fail with the same error. e.g. {code} SELECT * FROM partitioned_avro LIMIT 5; SELECT * FROM partitioned_avro WHERE partition_col=value; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4575) In place filtering in Not Filter doesn't handle nulls correctly.
[ https://issues.apache.org/jira/browse/HIVE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey resolved HIVE-4575. Resolution: Later The not expression is currently not supported in the vectorized path. This expression will need a bit of re-implementation, at which point this jira will be addressed too. In place filtering in Not Filter doesn't handle nulls correctly. Key: HIVE-4575 URL: https://issues.apache.org/jira/browse/HIVE-4575 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey The FilterNotExpr evaluates the child expression and takes the complement of the selected vector. Since the child expression filters out null values, the complement includes the nulls in the output. This is incorrect because not(null) = null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
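The three-valued logic problem is easy to see in a sketch. Assuming a simplified batch layout (a selected-rows array plus parallel value/isNull arrays; these are not Hive's real vectorization classes), a null-correct NOT filter has to select only the rows where the child expression was definitively FALSE, rather than complementing the child's selected set: {code:java}
// Simplified sketch, not Hive's FilterNotExpr. selected[0..size) holds the
// row indices still in play; value/isNull hold the child expression's result.
final class NotFilterSketch {
  static int filterNot(int[] selected, int size, boolean[] value, boolean[] isNull) {
    int newSize = 0;
    for (int i = 0; i < size; i++) {
      int row = selected[i];
      // not(null) = null, and null is not TRUE, so null rows must be dropped.
      // Complementing the child's selected set would wrongly keep them.
      if (!isNull[row] && !value[row]) {
        selected[newSize++] = row;
      }
    }
    return newSize;
  }
}
{code}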
[jira] [Updated] (HIVE-5178) Wincompat : QTestUtil changes
[ https://issues.apache.org/jira/browse/HIVE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5178: --- Status: Patch Available (was: Open) Wincompat : QTestUtil changes - Key: HIVE-5178 URL: https://issues.apache.org/jira/browse/HIVE-5178 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5178.2.patch, HIVE-5178.patch Miscellaneous QTestUtil changes are needed to make tests work under windows: a) Aux jars needed to be set up for minimr b) Ignore empty test lines if windows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5178) Wincompat : QTestUtil changes
[ https://issues.apache.org/jira/browse/HIVE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5178: --- Attachment: HIVE-5178.2.patch Updated patch. Wincompat : QTestUtil changes - Key: HIVE-5178 URL: https://issues.apache.org/jira/browse/HIVE-5178 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5178.2.patch, HIVE-5178.patch Miscellaneous QTestUtil changes are needed to make tests work under windows: a) Aux jars needed to be set up for minimr b) Ignore empty test lines if windows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5178) Wincompat : QTestUtil changes
[ https://issues.apache.org/jira/browse/HIVE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5178: --- Status: Open (was: Patch Available) (canceling patch, there was a typo in this patch) Wincompat : QTestUtil changes - Key: HIVE-5178 URL: https://issues.apache.org/jira/browse/HIVE-5178 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5178.2.patch, HIVE-5178.patch Miscellaneous QTestUtil changes are needed to make tests work under windows: a) Aux jars needed to be set up for minimr b) Ignore empty test lines if windows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5158) allow getting all partitions for table to also use direct SQL path
[ https://issues.apache.org/jira/browse/HIVE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754963#comment-13754963 ] Edward Capriolo commented on HIVE-5158: --- A thing to keep in mind is that some tables with a huge number of partitions will OOM a client if you attempt to fetch them all at once. So some support for paging is required. allow getting all partitions for table to also use direct SQL path -- Key: HIVE-5158 URL: https://issues.apache.org/jira/browse/HIVE-5158 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5158.D12573.1.patch, HIVE-5158.D12573.2.patch, HIVE-5158.D12573.3.patch, HIVE-5158.D12573.4.patch While testing some queries I noticed that getPartitions can be very slow (which happens e.g. in non-strict mode with no partition column filter); with a table with many partitions it can take 10-12s easily. SQL perf path can also be used for this path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
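One way such paging could look on the client side is to list the (comparatively small) partition names first and then fetch the full Partition objects in bounded chunks. A hedged sketch, assuming the listPartitionNames/getPartitionsByNames calls behave as in the metastore client API of this era (treat the exact signatures as assumptions): {code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

// Sketch of client-side batching so that each metastore RPC stays bounded.
final class PartitionBatchingSketch {
  static List<Partition> fetchInBatches(HiveMetaStoreClient client, String db,
      String table, int batchSize) throws Exception {
    List<String> names = client.listPartitionNames(db, table, (short) -1);
    List<Partition> result = new ArrayList<Partition>();
    for (int i = 0; i < names.size(); i += batchSize) {
      List<String> chunk =
          names.subList(i, Math.min(i + batchSize, names.size()));
      result.addAll(client.getPartitionsByNames(db, table, chunk));
    }
    // A caller that must avoid client-side OOM would process each chunk as it
    // arrives instead of accumulating the full list as done here.
    return result;
  }
}
{code}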
[jira] [Commented] (HIVE-5022) Decimal Arithmetic generates NULL value
[ https://issues.apache.org/jira/browse/HIVE-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754950#comment-13754950 ] Jason Dere commented on HIVE-5022: -- Here are the actual SQL reference rules regarding exact precision arithmetic (6.12 if you're looking at SQL92, and it looks like later references look the same): 1) If the data type of both operands of a dyadic arithmetic operator is exact numeric, then the data type of the result is exact numeric, with precision and scale determined as follows: a) Let S1 and S2 be the scale of the first and second operands respectively. b) The precision of the result of addition and subtraction is implementation-defined, and the scale is the maximum of S1 and S2. c) The precision of the result of multiplication is implementation-defined, and the scale is S1 + S2. d) The precision and scale of the result of division is implementation-defined. I'd agree with what hagleitn said about the resulting precision/scale of division operations; Hive is allowed to define what precision/scale it returns on division, and it probably should not be allowed to take up the entire precision. Tinkering with MySQL a bit, it looks like it follows the multiplication scale rules until it hits the max scale of 30, and then any further multiplications continue to have scale 30. Not quite sure what rules it's using for division scale, but it also does not exceed their max scale of 30. Decimal Arithmetic generates NULL value --- Key: HIVE-5022 URL: https://issues.apache.org/jira/browse/HIVE-5022 Project: Hive Issue Type: Bug Components: Types Affects Versions: 0.11.0 Environment: Hortonworks 1.3 running Hive 0.11.0.1.3.0.0-107 Reporter: Kevin Soo Hoo Assignee: Teddy Choi Attachments: HIVE-5022.1.patch.txt, HIVE-5022.2.patch.txt, HIVE-5022.3.patch.txt When a decimal division is the first operation, the quotient cannot be multiplied in a subsequent calculation. Instead, a NULL is returned. The following yield NULL results: select (cast (4.53 as decimal) / cast(25.86 as decimal)) * cast(0.087 as decimal) from tablename limit 1; select cast (4.53 as decimal) / cast(25.86 as decimal) * cast(0.087 as decimal) from tablename limit 1; If we move the multiplication operation to be first, then it will successfully calculate the result. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
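The multiplication rule (result scale is S1 + S2) and the reason division must be implementation-defined can both be illustrated with Java's BigDecimal, which follows the same conventions; this is an illustration only, not Hive's decimal code: {code:java}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ScaleRulesSketch {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("4.53");  // scale 2
    BigDecimal b = new BigDecimal("25.86"); // scale 2
    BigDecimal c = new BigDecimal("0.087"); // scale 3

    // Multiplication: the result scale is S1 + S2, here 2 + 3 = 5.
    System.out.println(a.multiply(c)); // prints 0.39411

    // Division: a.divide(b) with no explicit scale throws ArithmeticException
    // because the quotient does not terminate; an implementation has to pick
    // a precision/scale for division results.
    System.out.println(a.divide(b, 10, RoundingMode.HALF_UP)); // 0.1751740139
  }
}
{code}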
[jira] [Commented] (HIVE-5091) ORC files should have an option to pad stripes to the HDFS block boundaries
[ https://issues.apache.org/jira/browse/HIVE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754794#comment-13754794 ] Hudson commented on HIVE-5091: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #145 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/145/]) HIVE-5091: ORC files should have an option to pad stripes to the HDFS block boundaries (Owen O'Malley via Gunther Hagleitner) (gunther: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1518830) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestNewIntegerEncoding.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java * /hive/trunk/ql/src/test/resources/orc-file-dump.out ORC files should have an option to pad stripes to the HDFS block boundaries --- Key: HIVE-5091 URL: https://issues.apache.org/jira/browse/HIVE-5091 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.12.0 Attachments: HIVE-5091.D12249.1.patch, HIVE-5091.D12249.2.patch, HIVE-5091.D12249.3.patch With ORC stripes being large, if a stripe straddles an HDFS block, the locality of read is suboptimal. It would be good to add padding to ensure that stripes don't straddle HDFS blocks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5158) allow getting all partitions for table to also use direct SQL path
[ https://issues.apache.org/jira/browse/HIVE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5158: --- Status: Patch Available (was: Open) allow getting all partitions for table to also use direct SQL path -- Key: HIVE-5158 URL: https://issues.apache.org/jira/browse/HIVE-5158 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5158.D12573.1.patch, HIVE-5158.D12573.2.patch, HIVE-5158.D12573.3.patch, HIVE-5158.D12573.4.patch While testing some queries I noticed that getPartitions can be very slow (which happens e.g. in non-strict mode with no partition column filter); with a table with many partitions it can take 10-12s easily. SQL perf path can also be used for this path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows
[ https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5176: --- Status: Open (was: Patch Available) Investigating failure found by the precommit test Wincompat : Changes for allowing various path compatibilities with Windows -- Key: HIVE-5176 URL: https://issues.apache.org/jira/browse/HIVE-5176 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5176.patch We need to make certain changes across the board to allow us to read/parse windows paths. Some are escaping changes, some are being strict about how we read paths (through URL.encode/decode, etc) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5188) MR job launched through WebHCat fails to find additional jars in classpath
Deepesh Khandelwal created HIVE-5188: Summary: MR job launched through WebHCat fails to find additional jars in classpath Key: HIVE-5188 URL: https://issues.apache.org/jira/browse/HIVE-5188 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Deepesh Khandelwal When running an MR job using a jar (compiled with external dependencies, e.g. HCatInputFormat, HCatOutputFormat) through WebHCat, the job fails, complaining about missing HCat classes. I did pass those through the libjars argument, but it seems that we run the hadoop jar command locally on the tasktracker node, so it doesn't have the additional classes in the HADOOP_CLASSPATH. To get around the problem, I had to explicitly add the additional jar dependencies in hadoop-env.sh. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated Aug. 30, 2013, 6:49 p.m.) Review request for hive, Ashutosh Chauhan and Jakob Homan. Changes --- Updated with Jakob's comments Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. At first, we added a unique ID for each record reader, which is then included in every AvroGenericRecordWritable. Then, we introduced two new data structures (one HashSet and one HashMap) to store intermediate data and avoid duplicate checks. The HashSet contains the IDs of all the record readers that don't need any re-encoding. On the other hand, the HashMap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows a nearly 40% reduction in Avro record reading time. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java ed2a9af serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java e994411 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 3828940 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
On Aug. 26, 2013, 5:35 a.m., Jakob Homan wrote: serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java, line 529 https://reviews.apache.org/r/12480/diff/3/?file=338097#file338097line529 Weird spacing... 2x below as well. Done On Aug. 26, 2013, 5:35 a.m., Jakob Homan wrote: serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java, line 49 https://reviews.apache.org/r/12480/diff/3/?file=338099#file338099line49 And this would indicate a bug. Done. On Aug. 26, 2013, 5:35 a.m., Jakob Homan wrote: serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java, line 38 https://reviews.apache.org/r/12480/diff/3/?file=338099#file338099line38 These should never be null, not even in testing. It's better to change the tests to correctly populate the data structure. AvroGenericRecordWritable needs the record reader ID. Since we are instantiating one here, we will need to set it w/o any checking. Removed unnecessary null checks. - Mohammad --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/#review25537 --- On Aug. 30, 2013, 6:49 p.m., Mohammad Islam wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated Aug. 30, 2013, 6:49 p.m.) Review request for hive, Ashutosh Chauhan and Jakob Homan. Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. At first, we added a unique ID for each record reader, which is then included in every AvroGenericRecordWritable. Then, we introduced two new data structures (one HashSet and one HashMap) to store intermediate data and avoid duplicate checks. The HashSet contains the IDs of all the record readers that don't need any re-encoding. On the other hand, the HashMap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows a nearly 40% reduction in Avro record reading time. Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java ed2a9af serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java e994411 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 3828940 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
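A condensed sketch of the caching scheme the patch describes, with hypothetical names and written in modern Java for brevity (the real code lives in AvroDeserializer and AvroGenericRecordWritable): {code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.function.Supplier;

// Sketch only: a per-record-reader ID replaces repeated Schema.equals()
// calls, and re-encoders are built once per reader and then reused.
final class ReEncoderCacheSketch {
  interface ReEncoder {
    Object reEncode(Object record);
  }

  // Reader IDs whose writer schema is already known to match the reader schema.
  private final Set<UUID> schemasMatch = new HashSet<>();
  // Reader IDs mapped to an already-built (expensive) re-encoder.
  private final Map<UUID, ReEncoder> reEncoders = new HashMap<>();

  void markSchemasMatch(UUID readerId) {
    schemasMatch.add(readerId);
  }

  Object maybeReEncode(UUID readerId, Object record, Supplier<ReEncoder> factory) {
    if (schemasMatch.contains(readerId)) {
      return record; // schemas match, so the record passes through untouched
    }
    return reEncoders.computeIfAbsent(readerId, id -> factory.get())
        .reEncode(record);
  }
}
{code}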
[jira] [Commented] (HIVE-5186) Remove JoinReducerProc from ReduceSinkDeDuplication
[ https://issues.apache.org/jira/browse/HIVE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755011#comment-13755011 ] Hive QA commented on HIVE-5186: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600798/HIVE-5186.1.patch.txt {color:green}SUCCESS:{color} +1 2902 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/574/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/574/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Remove JoinReducerProc from ReduceSinkDeDuplication --- Key: HIVE-5186 URL: https://issues.apache.org/jira/browse/HIVE-5186 Project: Hive Issue Type: Improvement Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Attachments: HIVE-5186.1.patch.txt The Correlation Optimizer will take care of patterns involving the JoinOperator. We can remove JoinReducerProc from ReduceSinkDeDuplication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4844) Add varchar data type
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4844: - Description: Add new varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Char type will be added as another task. was: Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Add varchar data type - Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, screenshot.png Add new varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Char type will be added as another task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4844) Add varchar data type
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4844: - Summary: Add varchar data type (was: Add char/varchar data types) Add varchar data type - Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, screenshot.png Add new char/varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5158) allow getting all partitions for table to also use direct SQL path
[ https://issues.apache.org/jira/browse/HIVE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755009#comment-13755009 ] Sergey Shelukhin commented on HIVE-5158: The existing logic to get all partitions doesn't actually appear to have any paging... it has a max parameter but no offset. It does make sense to have it though. Let me take a look, probably in a separate jira. Might require an API change allow getting all partitions for table to also use direct SQL path -- Key: HIVE-5158 URL: https://issues.apache.org/jira/browse/HIVE-5158 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5158.D12573.1.patch, HIVE-5158.D12573.2.patch, HIVE-5158.D12573.3.patch, HIVE-5158.D12573.4.patch While testing some queries I noticed that getPartitions can be very slow (which happens e.g. in non-strict mode with no partition column filter); with a table with many partitions it can take 10-12s easily. SQL perf path can also be used for this path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5133) webhcat jobs that need to access metastore fails in secure mode
[ https://issues.apache.org/jira/browse/HIVE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755052#comment-13755052 ] Deepesh Khandelwal commented on HIVE-5133: -- The current patch does fix the Pig case, but the MR job would still fail to find the HCat classes in the classpath; see HIVE-5188. webhcat jobs that need to access metastore fails in secure mode --- Key: HIVE-5133 URL: https://issues.apache.org/jira/browse/HIVE-5133 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5133.1.patch Webhcat job submission requests result in the pig/hive/mr job being run from a map task that it launches. In secure mode, for the pig/hive/mr job that is run to be authorized to perform actions on the metastore, it has to have the delegation tokens from the hive metastore. In the case of a pig/MR job, this is needed if hcatalog is being used in the script/job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5177) Wincompat : Retrying handler related changes
[ https://issues.apache.org/jira/browse/HIVE-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-5177: --- Status: Open (was: Patch Available) Investigating failure found by the precommit test Wincompat : Retrying handler related changes Key: HIVE-5177 URL: https://issues.apache.org/jira/browse/HIVE-5177 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5177.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5133) webhcat jobs that need to access metastore fails in secure mode
[ https://issues.apache.org/jira/browse/HIVE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated HIVE-5133: - Attachment: HIVE-5133.1.test.patch Attaching a supplementary patch containing the E2E test for running a Pig job using HCatLoader/HCatStorer through WebHCat. webhcat jobs that need to access metastore fails in secure mode --- Key: HIVE-5133 URL: https://issues.apache.org/jira/browse/HIVE-5133 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5133.1.patch, HIVE-5133.1.test.patch Webhcat job submission requests result in the pig/hive/mr job being run from a map task that it launches. In secure mode, for the pig/hive/mr job that is run to be authorized to perform actions on metastore, it has to have the delegation tokens from the hive metastore. In case of pig/MR job this is needed if hcatalog is being used in the script/job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5158) allow getting all partitions for table to also use direct SQL path
[ https://issues.apache.org/jira/browse/HIVE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755025#comment-13755025 ] Edward Capriolo commented on HIVE-5158: --- {quote} The existing logic to get all partitions doesn't actually appear to have any. {quote} The partitions are ordered so you can start at an existing one and read N partitions. I am not sure it is related to your issue but I just wanted to remind everyone that if you are doing two levels of partitioning the number adds up very fast and can OOM the client or the Thrift server. allow getting all partitions for table to also use direct SQL path -- Key: HIVE-5158 URL: https://issues.apache.org/jira/browse/HIVE-5158 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5158.D12573.1.patch, HIVE-5158.D12573.2.patch, HIVE-5158.D12573.3.patch, HIVE-5158.D12573.4.patch While testing some queries I noticed that getPartitions can be very slow (which happens e.g. in non-strict mode with no partition column filter); with a table with many partitions it can take 10-12s easily. SQL perf path can also be used for this path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5189) make batching in partition retrieval in metastore applicable to more methods
Sergey Shelukhin created HIVE-5189: -- Summary: make batching in partition retrieval in metastore applicable to more methods Key: HIVE-5189 URL: https://issues.apache.org/jira/browse/HIVE-5189 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin As indicated in HIVE-5158, Metastore can OOM if retrieving a large number of partitions. For client-side partition filtering, the client applies batching (which avoids this) by sending parts of the filtered name list in separate requests, according to configuration. The batching is not used on the filter pushdown path, or when retrieving all partitions (e.g. when the pruner expression is not useful in non-strict mode). HIVE-4914 and pushdown improvements will make this problem somewhat worse by allowing more requests to go to the server. There needs to be some batching scheme (ideally a somewhat generic one) that is applicable to all these paths. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
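For illustration, here is a minimal sketch of what such a batching scheme could look like on the client side, assuming the existing IMetaStoreClient.getPartitionsByNames call; the helper class and batch-size parameter are illustrative, not part of any patch on this issue:

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;
import org.apache.thrift.TException;

public class PartitionBatching {
  // Fetch partitions in fixed-size chunks of names, so that no single
  // metastore call has to materialize the whole result set at once.
  public static List<Partition> getInBatches(IMetaStoreClient client, String db,
      String table, List<String> partNames, int batchSize) throws TException {
    List<Partition> result = new ArrayList<Partition>(partNames.size());
    for (int i = 0; i < partNames.size(); i += batchSize) {
      int end = Math.min(i + batchSize, partNames.size());
      result.addAll(client.getPartitionsByNames(db, table, partNames.subList(i, end)));
    }
    return result;
  }
}
{code}

A generic scheme along these lines, also usable on the server for the pushdown and get-all paths, is what the issue asks for.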
[jira] [Updated] (HIVE-5189) make batching in partition retrieval in metastore applicable to more methods
[ https://issues.apache.org/jira/browse/HIVE-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-5189: --- Component/s: Metastore make batching in partition retrieval in metastore applicable to more methods Key: HIVE-5189 URL: https://issues.apache.org/jira/browse/HIVE-5189 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin As indicated in HIVE-5158, Metastore can OOM if retrieving a large number of partitions. For client-side partition filtering, the client applies batching (which avoids this) by sending parts of the filtered name list in separate requests, according to configuration. The batching is not used on the filter pushdown path, or when retrieving all partitions (e.g. when the pruner expression is not useful in non-strict mode). HIVE-4914 and pushdown improvements will make this problem somewhat worse by allowing more requests to go to the server. There needs to be some batching scheme (ideally a somewhat generic one) that is applicable to all these paths. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4617) ExecuteStatementAsync call to run a query in non-blocking mode
[ https://issues.apache.org/jira/browse/HIVE-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755071#comment-13755071 ] Thejas M Nair commented on HIVE-4617: - [~vaibhavgumashta] The new phabricator link is missing your changes to hive-default.xml.template. Can you make sure that there are no other missing files/changes? ExecuteStatementAsync call to run a query in non-blocking mode -- Key: HIVE-4617 URL: https://issues.apache.org/jira/browse/HIVE-4617 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Jaideep Dhok Assignee: Vaibhav Gumashta Attachments: HIVE-4617.D12417.1.patch, HIVE-4617.D12417.2.patch, HIVE-4617.D12417.3.patch, HIVE-4617.D12417.4.patch, HIVE-4617.D12417.5.patch, HIVE-4617.D12417.6.patch, HIVE-4617.D12507.1.patch, HIVE-4617.D12507Test.1.patch Provide a way to run queries asynchronously. The current executeStatement call blocks until the query run is complete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5029) direct SQL perf optimization cannot be tested well
[ https://issues.apache.org/jira/browse/HIVE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755074#comment-13755074 ] Hudson commented on HIVE-5029: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #78 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/78/]) HIVE-5029 : direct SQL perf optimization cannot be tested well (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1518953) * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java * /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java * /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/VerifyingObjectStore.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java direct SQL perf optimization cannot be tested well -- Key: HIVE-5029 URL: https://issues.apache.org/jira/browse/HIVE-5029 Project: Hive Issue Type: Test Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.12.0 Attachments: HIVE-5029.D12483.1.patch, HIVE-5029.D12483.2.patch, HIVE-5029.patch, HIVE-5029.patch HIVE-4051 introduced a perf optimization that involves getting partitions directly via SQL in metastore. Given that SQL queries might not work on all datastores (and will not work on non-SQL ones), a JDO fallback is in place. Given that the perf improvement is very large for short queries, it's on by default. However, there's a problem with tests in that regard. If the SQL code is broken, tests may fall back to JDO and pass. If the JDO code is broken, SQL might allow tests to pass. We are going to disable SQL by default until the testing problem is resolved. There are several possible solutions: 1) Separate build for this setting. Seems like overkill... 2) Enable by default; disable by default in tests, create a clone of TestCliDriver with a subset of queries that will exercise the SQL path. 3) Have some sort of test hook inside metastore that will run both ORM and SQL and compare. 3') Or make a subclass of ObjectStore that will do that. ObjectStore is already pluggable. 4) Write unit tests for one of the modes (JDO, as non-default?) and declare that they are sufficient; disable fallback in tests. 3' seems like the easiest. For now we will disable SQL by default. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
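As a rough illustration of option 3', here is a standalone sketch of the run-both-and-compare idea; this is not the committed VerifyingObjectStore code, and the names are invented:

{code:java}
import java.util.List;

public final class VerifyingFetch {
  // Run the same retrieval through both paths and fail loudly on divergence,
  // so that neither a broken direct-SQL path nor a broken JDO path can
  // silently pass tests by falling back to the other.
  public static <T> List<T> verified(List<T> viaDirectSql, List<T> viaJdo) {
    if (!viaDirectSql.equals(viaJdo)) {
      throw new IllegalStateException("direct SQL and JDO results diverge: "
          + viaDirectSql.size() + " vs " + viaJdo.size() + " elements");
    }
    return viaDirectSql;
  }
}
{code}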
[jira] [Updated] (HIVE-5185) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless
[ https://issues.apache.org/jira/browse/HIVE-5185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-5185: --- Summary: test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless (was: test query file reduce_deduplicate_exclude_gby.q is useless) test query files reduce_deduplicate_exclude_gby.q and reducesink_dedup.q are useless Key: HIVE-5185 URL: https://issues.apache.org/jira/browse/HIVE-5185 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor The file contains {code:sql} create table t1( key_int1 int, key_int2 int, key_string1 string, key_string2 string); set hive.optimize.reducededuplication=false; set hive.map.aggr=false; select Q1.key_int1, sum(Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1; drop table t1; {code} Since the table is not populated, no results will appear in the .out file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755059#comment-13755059 ] Mohammad Kamrul Islam commented on HIVE-1511: - Sure. Made some more progress. Will upload a new WIP patch soon. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Assignee: Mohammad Kamrul Islam Attachments: generated_plan.xml, HIVE-1511.4.patch, HIVE-1511.5.patch, HIVE-1511.6.patch, HIVE-1511.7.patch, HIVE-1511.8.patch, HIVE-1511.patch, HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, HIVE-1511-wip.patch, KryoHiveTest.java, run.sh As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755143#comment-13755143 ] Brock Noland commented on HIVE-1511: FWIW the changes to Kryo also fixed my JDK7 problems. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Assignee: Mohammad Kamrul Islam Attachments: generated_plan.xml, HIVE-1511.4.patch, HIVE-1511.5.patch, HIVE-1511.6.patch, HIVE-1511.7.patch, HIVE-1511.8.patch, HIVE-1511.patch, HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, HIVE-1511-wip.patch, KryoHiveTest.java, run.sh As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
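For readers following along, a minimal sketch of the Kryo-based direction being explored here (binary serialization of the plan object graph instead of reflective XML serialization), assuming the Kryo 2.x API; this is illustrative only, not the attached KryoHiveTest.java:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class PlanRoundTrip {
  // Serialize and deserialize an arbitrary plan-like object graph; for large
  // operator trees this is typically much faster and more compact than
  // java.beans.XMLEncoder-style serialization.
  public static <T> T roundTrip(Kryo kryo, T plan, Class<T> clazz) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Output out = new Output(bos);
    kryo.writeObject(out, plan);
    out.close();
    Input in = new Input(new ByteArrayInputStream(bos.toByteArray()));
    T copy = kryo.readObject(in, clazz);
    in.close();
    return copy;
  }
}
{code}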
[jira] [Commented] (HIVE-5158) allow getting all partitions for table to also use direct SQL path
[ https://issues.apache.org/jira/browse/HIVE-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755144#comment-13755144 ] Hive QA commented on HIVE-5158: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600721/HIVE-5158.D12573.4.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/575/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/575/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-575/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1519088. At revision 1519088. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. allow getting all partitions for table to also use direct SQL path -- Key: HIVE-5158 URL: https://issues.apache.org/jira/browse/HIVE-5158 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-5158.D12573.1.patch, HIVE-5158.D12573.2.patch, HIVE-5158.D12573.3.patch, HIVE-5158.D12573.4.patch While testing some queries I noticed that getPartitions can be very slow (which happens e.g. in non-strict mode with no partition column filter); with a table with many partitions it can take 10-12s easily. SQL perf path can also be used for this path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5190) partition retrieval via JDO is inconsistent between by-name and by-filter method
Sergey Shelukhin created HIVE-5190: -- Summary: partition retrieval via JDO is inconsistent between by-name and by-filter method Key: HIVE-5190 URL: https://issues.apache.org/jira/browse/HIVE-5190 Project: Hive Issue Type: Bug Components: Metastore Reporter: Sergey Shelukhin Priority: Minor When we get partitions by name, we call retrieveAll, forcing the retrieval of all fields of the partition. Retrieving by filter has: {code} // pm.retrieveAll(mparts); // retrieveAll is pessimistic. some fields may not be needed {code} The code moved around recently but it has been there for a while. This may cause inconsistent results from the two methods in terms of which fields are set. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
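To make the asymmetry concrete, here is a minimal sketch using the javax.jdo API; the queries and helper class are stand-ins for the metastore's actual code, not excerpts from it:

{code:java}
import java.util.List;
import javax.jdo.PersistenceManager;
import javax.jdo.Query;
import org.apache.hadoop.hive.metastore.model.MPartition;

public class RetrieveAsymmetry {
  // By-name path: retrieveAll forces every field of each result to be loaded.
  @SuppressWarnings("unchecked")
  public static List<MPartition> byNames(PersistenceManager pm, Query q, List<String> names) {
    List<MPartition> parts = (List<MPartition>) q.execute(names);
    pm.retrieveAll(parts); // pessimistic, but all fields are guaranteed set
    return parts;
  }

  // By-filter path: no retrieveAll, so fields outside the default fetch group
  // may come back unset, which is the inconsistency this issue describes.
  @SuppressWarnings("unchecked")
  public static List<MPartition> byFilter(PersistenceManager pm, Query q, String filter) {
    return (List<MPartition>) q.execute(filter);
  }
}
{code}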
[jira] [Commented] (HIVE-4844) Add varchar data type
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755136#comment-13755136 ] Jason Dere commented on HIVE-4844: -- All three failures mentioned above (TestHCatStorerMulti, TestNegativeMinimrCliDriver, TestHCatLoaderComplexSchema) appear to pass when I run these tests myself. Add varchar data type - Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, screenshot.png Add new varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Char type will be added as another task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4914) filtering via partition name should be done inside metastore server (implementation)
[ https://issues.apache.org/jira/browse/HIVE-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755156#comment-13755156 ] Phabricator commented on HIVE-4914: --- sershe has commented on the revision HIVE-4914 [jira] filtering via partition name should be done inside metastore server (implementation). INLINE COMMENTS metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:496 TreeVisitor is part of the tree, and this is logically the part of metastoresql... I think it makes sense to not expose it, it's a private class metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1887 that's in separate patch that is not in yet. will adjust either way when resolving conflict metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1944 the old methods are still there; old versions of hive client, or some other clients, might call them. I will add this to the comment metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1965 this is old JDO code, it just moved metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:1971 filed HIVE-5190 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java:5756 I think this should be separate JIRA to refactor. I can move JDO partition retrieval into separate class, but ideally it should be a refactoring patch without code changes, otherwise it's hard to understand what broke if something does metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:113 This class needs access to internals, and both are very small by themselves... REVISION DETAIL https://reviews.facebook.net/D12561 BRANCH HIVE-4914-no-gen ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe filtering via partition name should be done inside metastore server (implementation) Key: HIVE-4914 URL: https://issues.apache.org/jira/browse/HIVE-4914 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4914.01.patch, HIVE-4914.D12561.1.patch, HIVE-4914-only-no-gen.patch, HIVE-4914-only.patch, HIVE-4914.patch, HIVE-4914.patch, HIVE-4914.patch Currently, if the filter pushdown is impossible (which is most cases), the client gets all partition names from metastore, filters them, and asks for partitions by names for the filtered set. Metastore server code should do that instead; it should check if pushdown is possible and do it if so; otherwise it should do name-based filtering. Saves the roundtrip with all partition names from the server to client, and also removes the need to have pushdown viability checking on both sides. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5159) Change the kind fields in ORC's proto file to optional
[ https://issues.apache.org/jira/browse/HIVE-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755114#comment-13755114 ] Jason Dere commented on HIVE-5159: -- Just want to get a bit of clarification on this issue, since it goes against the "required is forever" guideline described in https://developers.google.com/protocol-buffers/docs/proto#simple. If I understand correctly, the issue is when an older client receives a newer Type value containing a Kind enum value that didn't exist at the time the older client was built. In this situation the Kind value is set as null. Is the point of setting Kind as an optional field of Type so that you would have the ability to call hasKind() on the Type and do appropriate error handling? Looking at OrcProto.java it appears that there is such a Type.hasKind() method already, so this use case is already covered. Or is there another use case that you are thinking of here? Change the kind fields in ORC's proto file to optional -- Key: HIVE-5159 URL: https://issues.apache.org/jira/browse/HIVE-5159 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Jason Dere Java's protobuf-generated code uses a null value to represent enum values that were added after the reader was compiled. To reflect that reality, the enum values should always be marked as optional. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
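For clarity, a small sketch of the defensive check being discussed, against the generated OrcProto classes (the source above confirms Type.hasKind() exists); the error handling shown is illustrative:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hive.ql.io.orc.OrcProto;

public class TypeKindCheck {
  // With 'kind' declared optional, an enum value added after this reader was
  // compiled deserializes as "unset" instead of failing inside protobuf, and
  // the reader can surface a meaningful error.
  public static OrcProto.Type.Kind kindOf(OrcProto.Type type) throws IOException {
    if (!type.hasKind()) {
      throw new IOException("ORC type written by a newer writer: unknown kind");
    }
    return type.getKind();
  }
}
{code}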
[jira] [Assigned] (HIVE-5182) log more stuff via PerfLogger
[ https://issues.apache.org/jira/browse/HIVE-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-5182: -- Assignee: Sergey Shelukhin log more stuff via PerfLogger - Key: HIVE-5182 URL: https://issues.apache.org/jira/browse/HIVE-5182 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin PerfLogger output is useful in understanding perf. There are large gaps in it, however, and it's not clear what is going on during these. Some sections are large and have no breakdown. It would be nice to add more stuff. At this point I'm not certain where exactly, whoever makes the patch (me?) will just need to look at the above gaps and fill them in. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5191) Add char data type
Jason Dere created HIVE-5191: Summary: Add char data type Key: HIVE-5191 URL: https://issues.apache.org/jira/browse/HIVE-5191 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Separate task for char type, since HIVE-4844 only adds varchar -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5187) Enhance explain to indicate vectorized execution of operators.
[ https://issues.apache.org/jira/browse/HIVE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5187: --- Status: Patch Available (was: Open) Enhance explain to indicate vectorized execution of operators. -- Key: HIVE-5187 URL: https://issues.apache.org/jira/browse/HIVE-5187 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5187.1.patch Explain should be able to indicate whether an operator will be executed in vectorized mode or not. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5187) Enhance explain to indicate vectorized execution of operators.
[ https://issues.apache.org/jira/browse/HIVE-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5187: --- Attachment: HIVE-5187.1.patch Enhance explain to indicate vectorized execution of operators. -- Key: HIVE-5187 URL: https://issues.apache.org/jira/browse/HIVE-5187 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5187.1.patch Explain should be able to indicate whether an operator will be executed in vectorized mode or not. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5192) concat()/concat_ws() can use common implementation
Jason Dere created HIVE-5192: Summary: concat()/concat_ws() can use common implementation Key: HIVE-5192 URL: https://issues.apache.org/jira/browse/HIVE-5192 Project: Hive Issue Type: New Feature Reporter: Jason Dere One of the review comments from HIVE-4844 mentioned that concat/concat_ws can probably share a common implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5192) concat()/concat_ws() can use common implementation
[ https://issues.apache.org/jira/browse/HIVE-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5192: - Issue Type: Improvement (was: New Feature) concat()/concat_ws() can use common implementation -- Key: HIVE-5192 URL: https://issues.apache.org/jira/browse/HIVE-5192 Project: Hive Issue Type: Improvement Reporter: Jason Dere One of the review comments from HIVE-4844 mentioned that concat/concat_ws can probably share a common implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5172) TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server
[ https://issues.apache.org/jira/browse/HIVE-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] agate updated HIVE-5172: Status: Patch Available (was: Open) Please review this patch. It's a minor change, as noted in the description above: it checks transMap for a cached TUGIContainingTransport; if none is found, it creates a new object, adds it to the cache, and returns the new object directly (it does not look the object up in the cache again after the put, since with garbage-collectible entries it might no longer be there). TUGIContainingTransport returning null transport, causing intermittent SocketTimeoutException on hive client and NullPointerException in TUGIBasedProcessor on the server - Key: HIVE-5172 URL: https://issues.apache.org/jira/browse/HIVE-5172 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0, 0.10.0, 0.9.0 Reporter: agate Attachments: HIVE-5172.1.patch.txt We are running into a frequent problem using HCatalog 0.4.1 (Hive Metastore Server 0.9) where we get connection reset or connection timeout errors on the client and a NullPointerException in TUGIBasedProcessor on the server. {code} hive client logs: = org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:2136) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:2122) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openStore(HiveMetaStoreClient.java:286) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:197) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:157) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:830) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:954) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7524) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ... 31 more {code} {code} hive metastore server logs: === 2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message. java.lang.NullPointerException at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:79) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176) at
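For reference, a minimal sketch of the get-or-create pattern the patch description implies; the weak-valued Guava map, the wrap() helper, and the class names are assumptions for illustration, not the actual TUGIContainingTransport internals:

{code:java}
import java.util.concurrent.ConcurrentMap;
import com.google.common.collect.MapMaker;
import org.apache.thrift.transport.TTransport;

public class TransportCache {
  // Weak values let idle wrappers be garbage collected, which is exactly why
  // re-reading the map after a put can observe null.
  private final ConcurrentMap<TTransport, TTransport> transMap =
      new MapMaker().weakKeys().weakValues().makeMap();

  public TTransport getTransport(TTransport trans) {
    TTransport cached = transMap.get(trans);
    if (cached == null) {
      TTransport created = wrap(trans); // stand-in for new TUGIContainingTransport(trans)
      TTransport raced = transMap.putIfAbsent(trans, created);
      // Return an instance we hold a strong reference to, instead of doing
      // another get() that may return null once the entry is collected.
      cached = (raced != null) ? raced : created;
    }
    return cached;
  }

  private TTransport wrap(TTransport inner) { return inner; } // illustrative only
}
{code}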
[jira] [Updated] (HIVE-5161) Additional SerDe support for varchar type
[ https://issues.apache.org/jira/browse/HIVE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5161: - Attachment: HIVE-5161.2.patch Attaching HIVE-5161.2.patch. This updates a few of the SerDes to use the HiveVarcharWritable's Text member directly, rather than converting to a String before serializing. Additional SerDe support for varchar type - Key: HIVE-5161 URL: https://issues.apache.org/jira/browse/HIVE-5161 Project: Hive Issue Type: Bug Components: Serializers/Deserializers, Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5161.1.patch, HIVE-5161.2.patch Breaking out support for varchar for the various SerDes as an additional task. NO PRECOMMIT TESTS - can't run tests until HIVE-4844 is committed -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Hive 0.12 release
Would like to see HIVE-4844 make it in if possible. Thanks, Jason On Aug 29, 2013, at 10:31 PM, Eugene Koifman ekoif...@hortonworks.com wrote: I think we should make sure that several items under HIVE-4869 get checked in before branching. Eugene On Thu, Aug 29, 2013 at 9:18 PM, Thejas Nair the...@hortonworks.com wrote: It has been more than 3 months since 0.11 was released and we already have 294 jiras in resolved-fixed state for 0.12. This includes several new features such as the date data type, optimizer improvements, ORC format improvements and many bug fixes. There are also many features that look ready to be committed soon, such as the varchar type. I think it is time to start preparing for a 0.12 release by creating a branch later next week and starting to stabilize it. What do people think? As we get closer to branching, we can start discussing any additional features/bug fixes that we should add to the release and start monitoring their progress. Thanks, Thejas
[jira] [Commented] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data
[ https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755256#comment-13755256 ] Sushanth Sowmyan commented on HIVE-4969: Hi, could you please attach a testcase that tests this as well? That way, the tests (including your test) fail without your fix and succeed with it. Also, as a general note, the HBaseHCatStorageHandler is about to be deprecated in favour of Hive's HBaseStorageHandler with HIVE-4331. HCatalog HBaseHCatStorageHandler is not returning all the data -- Key: HIVE-4969 URL: https://issues.apache.org/jira/browse/HIVE-4969 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Venki Korukanti Priority: Critical Fix For: 0.11.1, 0.12.0 Attachments: HIVE-4969-1.patch Repro steps: 1) Create an HCatalog table mapped to an HBase table. hcat -e CREATE TABLE studentHCat(rownum int, name string, age int, gpa float) STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler' TBLPROPERTIES('hbase.table.name' ='studentHBase', 'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa'); 2) Load the following data from Pig. cat student_data 1^Asarah laertes^A23^A2.40 2^Atom allen^A72^A1.57 3^Abob ovid^A61^A2.67 4^Aethan nixon^A38^A2.15 5^Acalvin robinson^A28^A2.53 6^Airene ovid^A65^A2.56 7^Ayuri garcia^A36^A1.65 8^Acalvin nixon^A41^A1.04 9^Ajessica davidson^A48^A2.11 10^Akatie king^A39^A1.05 grunt> A = LOAD 'student_data' AS (rownum:int,name:chararray,age:int,gpa:float); grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer(); 3) Now from HBase do a scan on the studentHBase table hbase(main):026:0> scan 'studentPig', {LIMIT => 5} 4) From Pig, access the data in the table grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader(); grunt> STORE A INTO '/user/root/studentPig'; 5) Verify the output written in StudentPig hadoop fs -cat /user/root/studentPig/part-r-0 1 23 2 72 3 61 4 38 5 28 6 65 7 36 8 41 9 48 10 39 The data returned has only two fields (rownum and age). Problem: While reading the data from the HBase table, HbaseSnapshotRecordReader gets the data row in a Result (org.apache.hadoop.hbase.client.Result) object and processes the KeyValue fields in it. After processing, it creates another Result object out of the processed KeyValue array. The problem here is that the KeyValue array is not sorted: the Result object expects the input KeyValue array to have sorted elements, so when we call Result.getValue() it returns no value for some of the fields, as it does a binary search on an unordered array. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
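A minimal sketch of the fix shape the description points at, using the HBase 0.94-era client API; the helper class and method names are illustrative, not the attached patch:

{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;

public class SortedResultBuilder {
  // Result.getValue() binary-searches its backing KeyValue array, so the
  // array must be sorted before the Result is constructed.
  public static Result build(List<KeyValue> processed) {
    KeyValue[] kvs = processed.toArray(new KeyValue[processed.size()]);
    Arrays.sort(kvs, KeyValue.COMPARATOR);
    return new Result(kvs);
  }
}
{code}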
[jira] [Created] (HIVE-5193) Columnar Pushdown for RC/ORC Filenot happening in HCatLoader
Viraj Bhat created HIVE-5193: Summary: Columnar Pushdown for RC/ORC Filenot happening in HCatLoader Key: HIVE-5193 URL: https://issues.apache.org/jira/browse/HIVE-5193 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.11.0, 0.10.0 Reporter: Viraj Bhat Assignee: Viraj Bhat Fix For: 0.11.1, 0.12.0 Currently the HCatLoader is not taking advantage of ColumnProjectionUtils, which would let it skip columns during read. The information is available in Pig; it just needs to get to the readers. Viraj -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
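For context, a minimal sketch of what pushing the projection down could look like with the Hive-0.11-era ColumnProjectionUtils API; wiring the column ids out of Pig's pushProjection callback is the part the actual patch covers and is only gestured at here:

{code:java}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.serde2.ColumnProjectionUtils;

public class ProjectionPushdown {
  // Record which column ids the columnar readers (RCFile/ORC) should
  // materialize; with no pruning information, fall back to reading all.
  public static void push(Configuration conf, List<Integer> neededColumnIds) {
    if (neededColumnIds == null || neededColumnIds.isEmpty()) {
      ColumnProjectionUtils.setFullyReadColumns(conf);
    } else {
      ColumnProjectionUtils.setReadColumnIDs(conf, neededColumnIds);
    }
  }
}
{code}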
[jira] [Updated] (HIVE-5193) Columnar Pushdown for RC/ORC File not happening in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated HIVE-5193: - Summary: Columnar Pushdown for RC/ORC File not happening in HCatLoader (was: Columnar Pushdown for RC/ORC Filenot happening in HCatLoader ) Columnar Pushdown for RC/ORC File not happening in HCatLoader -- Key: HIVE-5193 URL: https://issues.apache.org/jira/browse/HIVE-5193 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.10.0, 0.11.0 Reporter: Viraj Bhat Assignee: Viraj Bhat Fix For: 0.11.1, 0.12.0 Currently the HCatLoader is not taking advantage of ColumnProjectionUtils, which would let it skip columns during read. The information is available in Pig; it just needs to get to the readers. Viraj -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5192) concat()/concat_ws() can use common implementation
[ https://issues.apache.org/jira/browse/HIVE-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755264#comment-13755264 ] Jason Dere commented on HIVE-5192: -- There are actually some differences in which arguments are allowed by the two functions, so merging concat() and concat_ws() would mean the following changes in behavior for each function: concat() would allow lists of primitive types as arguments; each item of each list would be concatenated to the result string. concat_ws() would allow non-string types as arguments, so concat_ws('|', 1, 2) would be valid. concat()/concat_ws() can use common implementation -- Key: HIVE-5192 URL: https://issues.apache.org/jira/browse/HIVE-5192 Project: Hive Issue Type: Improvement Reporter: Jason Dere One of the review comments from HIVE-4844 mentioned that concat/concat_ws can probably share a common implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5193) Columnar Pushdown for RC/ORC File not happening in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755267#comment-13755267 ] Viraj Bhat commented on HIVE-5193: -- Submitting a patch to address this issue with a testcase. Columnar Pushdown for RC/ORC File not happening in HCatLoader -- Key: HIVE-5193 URL: https://issues.apache.org/jira/browse/HIVE-5193 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.10.0, 0.11.0 Reporter: Viraj Bhat Assignee: Viraj Bhat Fix For: 0.11.1, 0.12.0 Currently the HCatLoader is not taking advantage of ColumnProjectionUtils, which would let it skip columns during read. The information is available in Pig; it just needs to get to the readers. Viraj -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5193) Columnar Pushdown for RC/ORC File not happening in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated HIVE-5193: - Attachment: HIVE-5193.patch Patch for addressing the issue Columnar Pushdown for RC/ORC File not happening in HCatLoader -- Key: HIVE-5193 URL: https://issues.apache.org/jira/browse/HIVE-5193 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.10.0, 0.11.0 Reporter: Viraj Bhat Assignee: Viraj Bhat Fix For: 0.11.1, 0.12.0 Attachments: HIVE-5193.patch Currently the HCatLoader is not taking advantage of ColumnProjectionUtils, which would let it skip columns during read. The information is available in Pig; it just needs to get to the readers. Viraj -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5178) Wincompat : QTestUtil changes
[ https://issues.apache.org/jira/browse/HIVE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755274#comment-13755274 ] Hive QA commented on HIVE-5178: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600817/HIVE-5178.2.patch {color:green}SUCCESS:{color} +1 2902 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/577/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/577/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Wincompat : QTestUtil changes - Key: HIVE-5178 URL: https://issues.apache.org/jira/browse/HIVE-5178 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5178.2.patch, HIVE-5178.patch Miscellaneous QTestUtil changes are needed to make tests work under windows: a) Aux jars needed to be set up for minimr b) Ignore empty test lines on Windows -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5193) Columnar Pushdown for RC/ORC File not happening in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755273#comment-13755273 ] Viraj Bhat commented on HIVE-5193: -- Review board link: https://reviews.facebook.net/D12633 Columnar Pushdown for RC/ORC File not happening in HCatLoader -- Key: HIVE-5193 URL: https://issues.apache.org/jira/browse/HIVE-5193 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.10.0, 0.11.0 Reporter: Viraj Bhat Assignee: Viraj Bhat Fix For: 0.11.1, 0.12.0 Attachments: HIVE-5193.patch Currently the HCatLoader is not taking advantage of ColumnProjectionUtils, which would let it skip columns during read. The information is available in Pig; it just needs to get to the readers. Viraj -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4778) hive.server2.authentication CUSTOM not working
[ https://issues.apache.org/jira/browse/HIVE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755275#comment-13755275 ] Mikhail Antonov commented on HIVE-4778: --- I also encountered the same issue today, and obviously, the simplest workaround is to use the following in hive-site.xml: <property> <!-- <name>hive.server2.custom.authentication.class</name> --> <name>HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS</name> <value>com.zettaset.hive.security.auth.ZtsHiveAuthenticationProvider</value> </property> That works. hive.server2.authentication CUSTOM not working -- Key: HIVE-4778 URL: https://issues.apache.org/jira/browse/HIVE-4778 Project: Hive Issue Type: Bug Components: Authentication Affects Versions: 0.11.0 Environment: CentOS release 6.2 x86_64 java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: Zdenek Ott Assignee: Azrael Attachments: HIVE-4778.D12207.1.patch, HIVE-4778.D12213.1.patch I have created my own class PamAuthenticationProvider that implements the PasswdAuthenticationProvider interface. I have put the jar into the hive lib directory and configured hive-site.xml in the following way: <property> <name>hive.server2.authentication</name> <value>CUSTOM</value> </property> <property> <name>hive.server2.custom.authentication.class</name> <value>com.avast.ff.hive.PamAuthenticationProvider</value> </property> I use SQuirreL and jdbc drivers to connect to hive. During authentication Hive throws the following exception: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128) at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.<init>(CustomAuthenticationProviderImpl.java:20) at org.apache.hive.service.auth.AuthenticationProviderFactory.getAuthenticationProvider(AuthenticationProviderFactory.java:57) at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:61) at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:127) at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:509) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:264) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at java.lang.Class.getConstructor0(Class.java:2706) at java.lang.Class.getDeclaredConstructor(Class.java:1985) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122) ... 12 more I have made a small patch for org.apache.hive.service.auth.CustomAuthenticationProviderImpl that solved my problem, but I'm not sure if it's the best solution.
Here is the patch: --- CustomAuthenticationProviderImpl.java 2013-06-20 14:55:22.473995184 +0200 +++ CustomAuthenticationProviderImpl.java.new 2013-06-20 14:57:36.549012966 +0200 @@ -33,7 +33,7 @@ HiveConf conf = new HiveConf(); this.customHandlerClass = (Class<? extends PasswdAuthenticationProvider>) conf.getClass( - HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.name(), + HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.varname, PasswdAuthenticationProvider.class); this.customProvider = ReflectionUtils.newInstance(this.customHandlerClass, conf); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4778) hive.server2.authentication CUSTOM not working
[ https://issues.apache.org/jira/browse/HIVE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755277#comment-13755277 ] Mikhail Antonov commented on HIVE-4778: --- Also, one of the reasons for people to write custom authenticators is that the default LDAP provider has limitations (bugs?) that prevent it from working with some OpenLDAP servers. Not sure if there's a bug open for that (the usage of uid is hardcoded) hive.server2.authentication CUSTOM not working -- Key: HIVE-4778 URL: https://issues.apache.org/jira/browse/HIVE-4778 Project: Hive Issue Type: Bug Components: Authentication Affects Versions: 0.11.0 Environment: CentOS release 6.2 x86_64 java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: Zdenek Ott Assignee: Azrael Attachments: HIVE-4778.D12207.1.patch, HIVE-4778.D12213.1.patch I have created my own class PamAuthenticationProvider that implements the PasswdAuthenticationProvider interface. I have put the jar into the hive lib directory and configured hive-site.xml in the following way: <property> <name>hive.server2.authentication</name> <value>CUSTOM</value> </property> <property> <name>hive.server2.custom.authentication.class</name> <value>com.avast.ff.hive.PamAuthenticationProvider</value> </property> I use SQuirreL and jdbc drivers to connect to hive. During authentication Hive throws the following exception: java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:128) at org.apache.hive.service.auth.CustomAuthenticationProviderImpl.<init>(CustomAuthenticationProviderImpl.java:20) at org.apache.hive.service.auth.AuthenticationProviderFactory.getAuthenticationProvider(AuthenticationProviderFactory.java:57) at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:61) at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:127) at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:509) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:264) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.NoSuchMethodException: org.apache.hive.service.auth.PasswdAuthenticationProvider.<init>() at java.lang.Class.getConstructor0(Class.java:2706) at java.lang.Class.getDeclaredConstructor(Class.java:1985) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:122) ... 12 more I have made a small patch for org.apache.hive.service.auth.CustomAuthenticationProviderImpl that solved my problem, but I'm not sure if it's the best solution. Here is the patch: --- CustomAuthenticationProviderImpl.java 2013-06-20 14:55:22.473995184 +0200 +++ CustomAuthenticationProviderImpl.java.new 2013-06-20 14:57:36.549012966 +0200 @@ -33,7 +33,7 @@ HiveConf conf = new HiveConf(); this.customHandlerClass = (Class<? extends PasswdAuthenticationProvider>) conf.getClass( - HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.name(), + HiveConf.ConfVars.HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS.varname, PasswdAuthenticationProvider.class); this.customProvider = ReflectionUtils.newInstance(this.customHandlerClass, conf); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
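To round out the thread, a minimal sketch of a custom provider of the kind being discussed; the class name and the credential check are illustrative, only the interface and its Authenticate method come from Hive:

{code:java}
import javax.security.sasl.AuthenticationException;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;

public class MyPamAuthenticationProvider implements PasswdAuthenticationProvider {
  @Override
  public void Authenticate(String user, String password) throws AuthenticationException {
    // Stand-in for a real PAM/LDAP lookup.
    if (!checkCredentials(user, password)) {
      throw new AuthenticationException("Invalid credentials for user " + user);
    }
  }

  private boolean checkCredentials(String user, String password) {
    return user != null && password != null && !password.isEmpty(); // illustrative only
  }
}
{code}

Such a provider is then selected via hive.server2.authentication=CUSTOM and hive.server2.custom.authentication.class in hive-site.xml, as shown in the comments above.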
[jira] [Commented] (HIVE-4844) Add varchar data type
[ https://issues.apache.org/jira/browse/HIVE-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755294#comment-13755294 ] Jason Dere commented on HIVE-4844: -- ashutoshc, like Xuefu, has suggested that this patch be split into different subtasks where appropriate, to make review/tracking easier. I'll take a look at what I can do here. At first glance, looks like this can be done as the following set of changes: 1. getMethodInternal() should prefer evaluate() signatures with more similar arguments 2. Change getCommonClass/implicitConvertible to use PrimitiveCategory rather than TypeInfo 3. Cast operators need to be set with type-specific data prior to initialization 4. varchar work (will be done in this Jira) 5. Thrift/JDBC changes for varchar Add varchar data type - Key: HIVE-4844 URL: https://issues.apache.org/jira/browse/HIVE-4844 Project: Hive Issue Type: New Feature Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4844.10.patch, HIVE-4844.11.patch, HIVE-4844.1.patch.hack, HIVE-4844.2.patch, HIVE-4844.3.patch, HIVE-4844.4.patch, HIVE-4844.5.patch, HIVE-4844.6.patch, HIVE-4844.7.patch, HIVE-4844.8.patch, HIVE-4844.9.patch, screenshot.png Add new varchar data types which have support for more SQL-compliant behavior, such as SQL string comparison semantics, max length, etc. Char type will be added as another task. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5129) Multiple table insert fails on count(distinct)
[ https://issues.apache.org/jira/browse/HIVE-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5129: - Status: Open (was: Patch Available) Multiple table insert fails on count(distinct) -- Key: HIVE-5129 URL: https://issues.apache.org/jira/browse/HIVE-5129 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: aggrTestMultiInsertData1.txt, aggrTestMultiInsertData.txt, aggrTestMultiInsert.q, HIVE-5129.1.patch.txt, HIVE-5129.2.WIP.patch.txt, HIVE-5129.3.patch.txt, HIVE-5129.4.patch.txt Hive fails with a class cast exception on queries of the form: {noformat} from studenttab10k insert overwrite table multi_insert_2_1 select name, avg(age) as avgage group by name insert overwrite table multi_insert_2_2 select name, age, sum(gpa) as sumgpa group by name, age insert overwrite table multi_insert_2_3 select name, count(distinct age) as distage group by name; {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira