[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: 0.10.1 Status: Patch Available (was: Open) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.10.1 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: (was: 0.11.1) (was: 0.10.1) Status: Patch Available (was: Open) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Adachi updated HIVE-2137: - Fix Version/s: 0.12.0 0.11.1 Status: Open (was: Patch Available) JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.10.1, 0.11.1, 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720484#comment-13720484 ] Phabricator commented on HIVE-4927: --- ashutoshc has accepted the revision HIVE-4927 [jira] When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed. +1 REVISION DETAIL https://reviews.facebook.net/D11811 BRANCH HIVE-4927 ARCANIST PROJECT hive To: JIRA, ashutoshc, yhuai When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
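The splice described above ("We should remove the TS2") can be sketched in plain Java. This is a deliberately simplified model, not Hive's actual Operator API: a node is cut out of the tree by reconnecting its parents directly to its children.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (not Hive's real Operator classes): splice a redundant
// operator out of a TS1-MapJoin1-TS2-MapJoin2 chain.
public class SpliceSketch {
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
        void connectTo(Op child) { children.add(child); child.parents.add(this); }
    }

    // Remove 'op' from the tree by linking each of its parents to each of its children.
    static void splice(Op op) {
        for (Op parent : op.parents) {
            parent.children.remove(op);
            parent.children.addAll(op.children);
        }
        for (Op child : op.children) {
            child.parents.remove(op);
            child.parents.addAll(op.parents);
        }
    }

    public static void main(String[] args) {
        Op ts1 = new Op("TS1"), mj1 = new Op("MapJoin1");
        Op ts2 = new Op("TS2"), mj2 = new Op("MapJoin2");
        ts1.connectTo(mj1);
        mj1.connectTo(ts2);
        ts2.connectTo(mj2);
        splice(ts2); // chain becomes TS1-MapJoin1-MapJoin2
        System.out.println(mj1.children.get(0).name); // MapJoin2
    }
}
```

After the splice, MapJoin1 feeds MapJoin2 directly, which is the shape the optimizer expects and avoids the NPE described in the report.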
[jira] [Commented] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720495#comment-13720495 ] Jason Dere commented on HIVE-4055: -- Looks like I had added both date and timestamp support to RegexSerDe. I'll update the diff accordingly. Both Jdbc tests work for me. Well, actually they seem to fail (with or without this diff) if they're the first tests I run, but if I run a qfile before running the Jdbc tests they seem to pass fine. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4055: - Attachment: HIVE-4055.4.patch.txt New patch HIVE-4055.4.patch.txt, with updated unit test output for serde_regex.q add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.4.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720573#comment-13720573 ] Hive QA commented on HIVE-4927: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594308/HIVE-4927.D11811.3.patch {color:green}SUCCESS:{color} +1 2652 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/189/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/189/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720604#comment-13720604 ] Hive QA commented on HIVE-4299: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594323/HIVE-4299.1.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/190/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/190/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 1.0% (10 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata of the table is generated so that the dropped table can be imported again. However, the exported metadata name is 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, users have to rename the metadata file to import the table. h2. 
How to reproduce Set the following setting to hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
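The filename mismatch described above can be reduced to a one-line check. The names below are hypothetical sketches, not Hive's real identifiers: the importer accepts only the literal file name '_metadata', while the export listener writes 'table name.metadata'.

```java
// Hypothetical sketch of the mismatch (illustrative names, not Hive's real
// code): ImportSemanticAnalyzer accepts only the literal file name
// "_metadata", but the export listener writes "<table>.metadata".
public class MetadataNameSketch {
    static final String IMPORTABLE_NAME = "_metadata";

    // Simplified version of the importer's filename check.
    static boolean isImportable(String fileName) {
        return IMPORTABLE_NAME.equals(fileName);
    }

    public static void main(String[] args) {
        // The exported name is rejected, reproducing the SemanticException path:
        System.out.println(isImportable("test_table.metadata")); // false
        // Renaming the file to "_metadata" is the manual workaround:
        System.out.println(isImportable("_metadata")); // true
    }
}
```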
[jira] [Commented] (HIVE-3756) LOAD DATA does not honor permission inheritence
[ https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720661#comment-13720661 ] Chaoyu Tang commented on HIVE-3756: --- [~ashutoshc] [~sushanth] Thanks for the review. LOAD DATA does not honor permission inheritence - Key: HIVE-3756 URL: https://issues.apache.org/jira/browse/HIVE-3756 Project: Hive Issue Type: Bug Components: Authorization, Security Affects Versions: 0.9.0 Reporter: Johndee Burks Assignee: Chaoyu Tang Fix For: 0.12.0 Attachments: HIVE-3756_1.patch, HIVE-3756_2.patch, HIVE-3756.patch When a LOAD DATA operation is performed the resulting data in hdfs for the table does not maintain permission inheritance. This remains true even with hive.warehouse.subdir.inherit.perms set to true. The issue is easily reproducible by creating a table and loading some data into it. After the load is complete just do a dfs -ls -R on the warehouse directory and you will see that the inheritance of permissions worked for the table directory but not for the data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4683) fix coverage org.apache.hadoop.hive.cli
[ https://issues.apache.org/jira/browse/HIVE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HIVE-4683: --- Status: Patch Available (was: Open) fix coverage org.apache.hadoop.hive.cli --- Key: HIVE-4683 URL: https://issues.apache.org/jira/browse/HIVE-4683 Project: Hive Issue Type: Bug Affects Versions: 0.10.1, 0.11.1, 0.12.0 Reporter: Aleksey Gorshkov Assignee: Aleksey Gorshkov Attachments: HIVE-4683-branch-0.10.patch, HIVE-4683-branch-0.10-v1.patch, HIVE-4683-branch-0.11-v1.patch, HIVE-4683-trunk.patch, HIVE-4683-trunk-v1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client encoding. It ignores the server encoding. For example, server = Linux (utf-8), client = Windows (shift-jis: a Japanese charset). This causes character corruption in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify . It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4935) Potential NPE in MetadataOnlyOptimizer
[ https://issues.apache.org/jira/browse/HIVE-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720721#comment-13720721 ] Hive QA commented on HIVE-4935: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594321/HIVE-4935.1.patch {color:green}SUCCESS:{color} +1 2652 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/191/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/191/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Potential NPE in MetadataOnlyOptimizer -- Key: HIVE-4935 URL: https://issues.apache.org/jira/browse/HIVE-4935 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Attachments: HIVE-4935.1.patch, HIVE-4935.1.patch In MetadataOnlyOptimizer.TableScanProcessor.process, it is possible that we consider a TableScanOperator as MayBeMetadataOnly when this TS does not have a conf. In MetadataOnlyOptimizer.MetadataOnlyTaskDispatcher.dispatch(Node, Stack<Node>, Object...), when we convert this TS, we want to get the alias from its conf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
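The NPE pattern described above can be illustrated with a minimal model. The classes below are hypothetical stand-ins, not Hive's real Operator/TableScanDesc types: a table scan's conf may be null, so the optimizer must guard before reading the alias from it.

```java
// Simplified sketch of the HIVE-4935 NPE pattern (illustrative classes, not
// Hive's real ones): a TableScan may carry no conf, so any code that reads
// the alias from conf must null-check first.
public class NullGuardSketch {
    static class ScanDesc {
        final String alias;
        ScanDesc(String alias) { this.alias = alias; }
    }
    static class TableScan {
        ScanDesc conf; // may legitimately be null
    }

    // Unsafe pattern: throws NullPointerException when conf is null.
    static String aliasUnsafe(TableScan ts) {
        return ts.conf.alias;
    }

    // Guarded pattern: tolerate scans that carry no conf.
    static String aliasGuarded(TableScan ts) {
        return (ts.conf == null) ? null : ts.conf.alias;
    }

    public static void main(String[] args) {
        TableScan ts = new TableScan(); // conf never set
        System.out.println(aliasGuarded(ts)); // null, no NPE
    }
}
```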
[jira] [Updated] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-2137: - Description: The JDBC driver for HiveServer1 decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. was: The JDBC driver decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver for HiveServer1 decodes strings using the client side default encoding, which depends on the operating system unless we specify another encoding. It ignores the server side encoding. For example, when the server side operating system and encoding are Linux (utf-8) and the client side operating system and encoding are Windows (shift-jis: a Japanese charset), character corruption happens in the client. 
In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
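The fix direction stated above (always treat the wire format as UTF-8 on the client) can be sketched in plain Java. This is an illustration of the pattern, not the actual HIVE-2137 patch:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch (not the actual HIVE-2137 patch): the bug pattern is
// new String(bytes) / String.getBytes(), which use the JVM's platform default
// charset (e.g. MS932 on a Japanese Windows client), while the server sends UTF-8.
public class CharsetSketch {
    // Buggy pattern: result depends on the client JVM's default charset.
    static String decodeWithDefault(byte[] raw) {
        return new String(raw); // mojibake risk when the default is not UTF-8
    }

    // Fixed pattern: always decode the server's bytes as UTF-8.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "日本語".getBytes(StandardCharsets.UTF_8);
        // Correct regardless of the client operating system:
        System.out.println(decodeAsUtf8(utf8Bytes).equals("日本語")); // true
        // decodeWithDefault(utf8Bytes) only matches when this is UTF-8:
        System.out.println(Charset.defaultCharset());
    }
}
```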
Re: [Discuss] project chop up
Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in. On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual core 2 GB RAM laptop for years now. With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add on more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive. 1) Hive Hbase This should really be its own project; to do this right we really have to have multiple branches since hbase is not backwards compatible. 2) Hive Web Interface Not really a big project, but not really critical; it can just as easily be built separately. 3) hive thrift 1 We have hive thrift 2 now; it is time for the sun to set on hive thrift 1. 4) odbc Not entirely convinced about this one, but it is really not critical to running hive. What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think? I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. 
I see a couple alternatives for this: * Take the module in question to the Apache Incubator * Move the module in question to the Apache Extras * Break up the projects within our own source tree I'd prefer the third option at this point. Brock
Re: Extending Explode
So the explode output should be in the order of the array or the map you are exploding. I think you could use rank or row-sequence to give the exploded array a number. If that does not work, adding a new udf might make more sense than extending in this case. On Friday, July 26, 2013, nikolaus.st...@researchgate.net wrote: Hi, I'd like to make a patch that extends the functionality of explode to include an output column with the position of each item in the original array. I imagine this could be useful to the greater community and am wondering if I should extend the current explode function or if I should write a completely new function. Any thoughts on what will be more useful and more likely to be added to the hive-trunk would be greatly appreciated. Thanks, Niko
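The requested behavior, one output row per array element carrying both the position and the value, can be sketched in plain Java, independent of Hive's UDTF API (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the proposed "explode with position" semantics
// (illustrative, not Hive's GenericUDTF API): emit one (pos, value) row per
// element, preserving array order.
public class PosExplodeSketch {
    static List<String> explodeWithPosition(List<String> items) {
        List<String> rows = new ArrayList<>();
        for (int pos = 0; pos < items.size(); pos++) {
            rows.add(pos + "\t" + items.get(pos)); // one output row per element
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : explodeWithPosition(Arrays.asList("a", "b", "c"))) {
            System.out.println(row);
        }
    }
}
```

This is the behavior a new UDTF (rather than a change to explode itself) could provide, matching the reply's suggestion.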
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720802#comment-13720802 ] Hive QA commented on HIVE-3926: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594320/HIVE-3926.D8121.5.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/192/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/192/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE < 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE < 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4935) Potential NPE in MetadataOnlyOptimizer
[ https://issues.apache.org/jira/browse/HIVE-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4935: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Yin! Potential NPE in MetadataOnlyOptimizer -- Key: HIVE-4935 URL: https://issues.apache.org/jira/browse/HIVE-4935 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4935.1.patch, HIVE-4935.1.patch In MetadataOnlyOptimizer.TableScanProcessor.process, it is possible that we consider a TableScanOperator as MayBeMetadataOnly when this TS does not have a conf. In MetadataOnlyOptimizer.MetadataOnlyTaskDispatcher.dispatch(Node, Stack<Node>, Object...), when we convert this TS, we want to get the alias from its conf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3632: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Xuefu! Upgrade datanucleus to support JDK7 --- Key: HIVE-3632 URL: https://issues.apache.org/jira/browse/HIVE-3632 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.1, 0.10.0, 0.11.0 Reporter: Chris Drome Assignee: Xuefu Zhang Priority: Critical Fix For: 0.12.0 Attachments: HIVE-3632.1.patch, HIVE-3632.2.patch, HIVE-3632.3.patch, HIVE-3632.patch, HIVE-3632.patch.1 I found serious problems with datanucleus code when using JDK7, resulting in some sort of exception being thrown when datanucleus code is entered. I tried source=1.7, target=1.7 with JDK7 as well as source=1.6, target=1.6 with JDK7 and there was no visible difference in that the same unit tests failed. I tried upgrading datanucleus to 3.0.1, as per HIVE-2084.patch, which did not fix the failing tests. I tried upgrading datanucleus to 3.1-release, as per the advice of http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-86, which suggests using ASMv4 will allow datanucleus to work with JDK7. I was not successful with this either. I tried upgrading datanucleus to 3.1.2. I was not successful with this either. Regarding datanucleus support for JDK7+, there is the following JIRA http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-81 which suggests that they don't plan to actively support JDK7+ bytecode any time soon. I also tested the following JVM parameters found on http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/ with no success either. This will become a more serious problem as people move to newer JVMs. If there are others who have solved this issue, please post how this was done. Otherwise, it is a topic that I would like to raise for discussion. 
Test Properties: CLEAR LIBRARY CACHE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
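The blog post the reporter links discusses the JDK7 VerifyError ("Expecting a stackmap frame at branch target") raised by the new split verifier on DataNucleus-enhanced bytecode. As an illustration only (the flag name is general JDK7 knowledge, not quoted in the ticket, and the reporter states the JVM-parameter route did not fix the failing tests), the commonly cited workaround would be passed to Hive via HADOOP_OPTS:

```shell
# Illustrative only: -XX:-UseSplitVerifier disables JDK7's stricter stackmap
# verification and falls back to the JDK6-style verifier. Per the ticket,
# this did NOT resolve the Hive unit-test failures with DataNucleus.
export HADOOP_OPTS="$HADOOP_OPTS -XX:-UseSplitVerifier"
```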
[jira] [Resolved] (HIVE-2084) Upgrade datanucleus from 2.0.3 to a more recent version (3.?)
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2084. Resolution: Fixed Fix Version/s: 0.12.0 This has been fixed via HIVE-3632 Upgrade datanucleus from 2.0.3 to a more recent version (3.?) - Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Sushanth Sowmyan Labels: datanucleus Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2084.D2397.1.patch, HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D5685.1.patch, HIVE-2084.patch It seems datanucleus 2.2.3 does a better job in caching: fetching the same set of partition objects a second time takes about 1/4 of the time of the first fetch, while with 2.0.3 the second execution took almost the same amount of time. We should retest the test case mentioned in HIVE-1853, HIVE-1862. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2473) Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory.
[ https://issues.apache.org/jira/browse/HIVE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2473. Resolution: Fixed Fix Version/s: 0.12.0 This should have been fixed via HIVE-3632 If you can still reproduce, feel free to reopen. Hive throws an NPE when $HADOOP_HOME points to a tarball install directory that contains a build/ subdirectory. --- Key: HIVE-2473 URL: https://issues.apache.org/jira/browse/HIVE-2473 Project: Hive Issue Type: Bug Environment: hadoop-0.20.204.0 Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.12.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2015) Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
[ https://issues.apache.org/jira/browse/HIVE-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-2015. Resolution: Fixed Fix Version/s: 0.12.0 This should have been fixed via HIVE-3632 Feel free to reopen if you can still reproduce. Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages Key: HIVE-2015 URL: https://issues.apache.org/jira/browse/HIVE-2015 Project: Hive Issue Type: Bug Components: Diagnosability, Metastore Reporter: Carl Steinbach Assignee: Zhenxiao Luo Fix For: 0.12.0 Every time I start up the Hive CLI with logging enabled I'm treated to the following ERROR log messages courtesy of DataNucleus: {code} DEBUG metastore.ObjectStore: datanucleus.plugin.pluginRegistryBundleCheck = LOG ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.resources but it cannot be resolved. ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.runtime but it cannot be resolved. ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.text but it cannot be resolved. {code} Here's where this comes from: * The bin/hive scripts cause Hive to inherit Hadoop's classpath. * Hadoop's classpath includes $HADOOP_HOME/lib/core-3.1.1.jar, an Eclipse library. * core-3.1.1.jar includes a plugin.xml file defining an OSGI plugin * At startup, Datanucleus scans the classpath looking for OSGI plugins, and will attempt to initialize any that it finds, including the Eclipse OSGI plugins located in core-3.1.1.jar * Initialization of the OSGI plugin in core-3.1.1.jar fails because of unresolved dependencies. * We see an ERROR message telling us that Datanucleus failed to initialize a plugin that we don't care about in the first place. I can think of two options for solving this problem: # Rewrite the scripts in $HIVE_HOME/bin so that they don't inherit ALL of Hadoop's CLASSPATH. 
# Replace DataNucleus's NonManagedPluginRegistry with our own implementation that does nothing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
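A third, narrower workaround (my assumption, not proposed in the ticket) is to leave the classpath and DataNucleus alone and just raise the logging threshold for the offending logger using log4j's standard per-logger level syntax in hive-log4j.properties:

```properties
# Assumed workaround: suppress only the DataNucleus.Plugin logger so the
# unresolved org.eclipse.* bundle ERRORs are no longer printed at startup.
log4j.logger.DataNucleus.Plugin=FATAL
```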
[jira] [Updated] (HIVE-4878) With Dynamic partitioning, some queries would scan default partition even if query is not using it.
[ https://issues.apache.org/jira/browse/HIVE-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4878: --- Resolution: Fixed Fix Version/s: (was: 0.11.1) 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, John! With Dynamic partitioning, some queries would scan default partition even if query is not using it. --- Key: HIVE-4878 URL: https://issues.apache.org/jira/browse/HIVE-4878 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.0 Attachments: HIVE-4878.patch With Dynamic partitioning, Hive would scan default partitions in some cases even if query excludes it. As part of partition pruning, predicate is narrowed down to those pieces that involve partition columns only. This predicate is then evaluated with partition values to determine, if scan should include those partitions. But in some cases (like when comparing __HIVE_DEFAULT_PARTITION__ to numeric data types) expression evaluation would fail and would return NULL instead of true/false. In such cases the partition is added to unknown partitions which is then subsequently scanned. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
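The pruning failure described above comes down to three-valued logic: comparing the non-numeric __HIVE_DEFAULT_PARTITION__ marker with a numeric literal evaluates to NULL rather than true/false, and a NULL verdict lands the partition in the "unknown" set that gets scanned. A minimal sketch with hypothetical helper names (not Hive's actual pruner code):

```java
// Illustrative sketch of why the default partition survives pruning for a
// predicate like "partcol > 100": evaluation against the default-partition
// marker fails, returns null (unknown), and unknown partitions are scanned.
import java.util.Arrays;
import java.util.List;

public class PrunerSketch {
    static final String DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    /** Three-valued evaluation of "value > 100": TRUE, FALSE, or null (unknown). */
    static Boolean evalGreaterThan100(String partitionValue) {
        try {
            return Integer.parseInt(partitionValue) > 100;
        } catch (NumberFormatException e) {
            return null; // expression evaluation fails -> unknown, not false
        }
    }

    public static void main(String[] args) {
        for (String p : Arrays.asList("50", "200", DEFAULT_PARTITION)) {
            Boolean verdict = evalGreaterThan100(p);
            // A null verdict means the partition joins the unknown set and is scanned.
            boolean scanned = verdict == null || verdict;
            System.out.println(p + " scanned=" + scanned);
        }
    }
}
```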
[jira] [Updated] (HIVE-4927) When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed
[ https://issues.apache.org/jira/browse/HIVE-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4927: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Yin! When we merge two MapJoin MapRedTasks, the TableScanOperator of the second one should be removed Key: HIVE-4927 URL: https://issues.apache.org/jira/browse/HIVE-4927 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Fix For: 0.12.0 Attachments: HIVE-4927.D11811.1.patch, HIVE-4927.D11811.2.patch, HIVE-4927.D11811.3.patch, HIVE-4927.D11811.3.patch {code} set hive.auto.convert.join=true; set hive.auto.convert.join.noconditionaltask=true; EXPLAIN SELECT x1.key AS key FROM src x1 JOIN src1 y1 ON (x1.key = y1.key) JOIN src1 y2 ON (x1.value = y2.value) GROUP BY x1.key; {code} We will get an NPE from MetadataOnlyOptimizer. The reason is that the operator tree of the MapRedTask evaluating two MapJoins is {code} TS1-MapJoin1-TS2-MapJoin2-... {code} We should remove the TS2... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
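The shape of the fix can be sketched as follows (hypothetical names and a string-based operator chain for illustration, not the actual GenMapRedUtils/CommonJoinTaskDispatcher code): when the second map-only task's chain is appended to the first, its leading TableScanOperator must be skipped, because its input is now MapJoin1 rather than a table scan:

```java
// Simplified sketch: merging two map-only MapJoin operator chains while
// dropping the second chain's leading TableScan, so the merged tree is
// TS1 -> MapJoin1 -> MapJoin2 instead of TS1 -> MapJoin1 -> TS2 -> MapJoin2.
import java.util.ArrayList;
import java.util.List;

public class MergeSketch {
    static List<String> merge(List<String> first, List<String> second) {
        List<String> merged = new ArrayList<>(first);
        int start = 0;
        // Skip the interior TableScan; in real code this would check the
        // operator's class, not a name prefix.
        if (!second.isEmpty() && second.get(0).startsWith("TS")) {
            start = 1;
        }
        merged.addAll(second.subList(start, second.size()));
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of("TS1", "MapJoin1"),
                                 List.of("TS2", "MapJoin2")));
        // prints [TS1, MapJoin1, MapJoin2]
    }
}
```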
[jira] [Created] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
Yin Huai created HIVE-4942: -- Summary: Fix eclipse template files to use correct datanucleus libs Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720915#comment-13720915 ] Brock Noland commented on HIVE-4920: I think we should also be able to block the test until prices come back down and then allocate new hosts during the middle of the test. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720916#comment-13720916 ] Yin Huai commented on HIVE-4942: We need HIVE-2739 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.1.patch Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.1.patch HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942.patch) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.patch Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942.1.patch) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: (was: HIVE-4942) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Attachment: HIVE-4942.txt Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4942: --- Status: Patch Available (was: Open) Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720922#comment-13720922 ] Edward Capriolo commented on HIVE-4920: --- Are you trying to build a high frequency trading system or a test suite :) JK. We should be optimal but lets not spend too much time gaming the system. I think we are better off following up with the discussion on list of chopping up the project. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4943) An explode function that includes the item's position in the array
Niko Stahl created HIVE-4943: Summary: An explode function that includes the item's position in the array Key: HIVE-4943 URL: https://issues.apache.org/jira/browse/HIVE-4943 Project: Hive Issue Type: New Feature Reporter: Niko Stahl A function that explodes an array and includes an output column with the position of each item in the original array. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4512) The vectorized plan is not picking right expression class for string concatenation.
[ https://issues.apache.org/jira/browse/HIVE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720945#comment-13720945 ] Hive QA commented on HIVE-4512: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594218/HIVE-4512.2-vectorization.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/193/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/193/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 0.6% (6 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. The vectorized plan is not picking right expression class for string concatenation. --- Key: HIVE-4512 URL: https://issues.apache.org/jira/browse/HIVE-4512 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Eric Hanson Attachments: HIVE-4512.1-vectorization.patch, HIVE-4512.2-vectorization.patch The vectorized plan is not picking right expression class for string concatenation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Extending Explode
FYI, Brickhouse provides a numeric_range UDTF, which explodes integer values, and an array_index UDF, so you could solve your problem by exploding on a numeric range of the size of the array, i.e. select n, array_index(arr, n ) from mytable lateral view numeric_range(0, size(arr) -1 ) n1 as n ; -- jerome On Fri, Jul 26, 2013 at 6:40 AM, Edward Capriolo edlinuxg...@gmail.com wrote: So the explode output should be in the order of the array or the map you are exploding. I think you could use rank or row-sequence to give the exploded array a number. If that does not work, adding a new UDF might make more sense than extending in this case. On Friday, July 26, 2013, nikolaus.st...@researchgate.net wrote: Hi, I'd like to make a patch that extends the functionality of explode to include an output column with the position of each item in the original array. I imagine this could be useful to the greater community and am wondering if I should extend the current explode function or if I should write a completely new function. Any thoughts on what will be more useful and more likely to be added to the hive-trunk would be greatly appreciated. Thanks, Niko
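The proposed semantics, one (position, value) row per array element, can be sketched in plain Java (illustrative only, not a real GenericUDTF implementation; later Hive releases ship this behavior as the built-in posexplode UDTF):

```java
// Sketch of position-aware explode: emit each array element as a
// (pos, val) row, preserving the original array order.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PosExplodeSketch {
    static List<Object[]> posExplode(List<String> arr) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < arr.size(); i++) {
            rows.add(new Object[] { i, arr.get(i) }); // one (pos, val) row per element
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Object[] row : posExplode(Arrays.asList("a", "b", "c"))) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```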
[jira] [Commented] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720968#comment-13720968 ] Ashutosh Chauhan commented on HIVE-4942: +1 Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
+1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to the rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as a set of libraries that let pig and java MR use hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in the same project. On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in. On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual-core 2 GB RAM laptop for years now. With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in Eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add on more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive. 1) Hive Hbase This should really be its own project; to do this right we really have to have multiple branches since hbase is not backwards compatible. 
2) Hive Web Interface Not really a big project, but not really critical; can just as easily be built separately 3) hive thrift 1 We have hive thrift 2 now, it is time for the sun to set on hive-thrift1. 4) odbc Not entirely convinced about this one but it is really not critical to running hive. What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think? I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. I see a couple alternatives for this: * Take the module in question to the Apache Incubator * Move the module in question to the Apache Extras * Break up the projects within our own source tree I'd prefer the third option at this point. Brock
[jira] [Updated] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4929: --- Attachment: HIVE-4929.patch the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
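The intended typing logic can be sketched with a hypothetical helper (not Hive's actual TypeCheckProcFactory code): the literal gets the narrowest type its suffix or value allows, and is widened only when the column it is compared against is genuinely numeric. The fix keeps the literal's own type when the column type is not numeric, e.g. when it erroneously reads as string due to the HIVE-3059 hack; the buggy path instead fell through to double in that case:

```java
// Illustrative sketch of numeric-literal typing. Suffix "L" marks a bigint
// literal (a Hive literal convention); otherwise the narrowest fitting type
// is chosen, then optionally widened to match a truly numeric column type.
public class LiteralTypeSketch {
    static String literalType(String token) {
        if (token.endsWith("L")) return "bigint";   // explicit suffix wins
        if (token.contains(".")) return "double";
        try {
            Integer.parseInt(token);
            return "int";                            // fits in 32 bits
        } catch (NumberFormatException e) {
            return "bigint";
        }
    }

    /** Widen the literal only when the column is genuinely numeric. */
    static String adjustToColumn(String literalType, String columnType) {
        switch (columnType) {
            case "int": case "bigint": case "double":
                return columnType;
            default:
                return literalType; // column reads as "string": keep literal type
        }
    }

    public static void main(String[] args) {
        System.out.println(adjustToColumn(literalType("100"), "bigint")); // bigint
        System.out.println(adjustToColumn(literalType("100"), "string")); // int
    }
}
```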
[jira] [Updated] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-4929: --- Status: Patch Available (was: Open) the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720998#comment-13720998 ] Sergey Shelukhin commented on HIVE-4929: Review uploaded to RB (Phabricator errors out, seemingly due to size) the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721006#comment-13721006 ] Ashutosh Chauhan commented on HIVE-4929: [~sershe] Can you post the RB link here ? the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721029#comment-13721029 ] Sergey Shelukhin commented on HIVE-4929: https://reviews.apache.org/r/12974/ the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721052#comment-13721052 ] Ashutosh Chauhan commented on HIVE-4929: +1 the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12795: [HIVE-4827] Merge a Map-only job to its following MapReduce job with multiple inputs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12795/ --- (Updated July 26, 2013, 6:50 p.m.) Review request for hive. Changes --- update test results Bugs: HIVE-4827 https://issues.apache.org/jira/browse/HIVE-4827 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-4827 Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8f1af64 conf/hive-default.xml.template 69b85dc eclipse-templates/.classpath 44e6c62 eclipse-templates/.classpath._hbase 397918d ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 66b84ff ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java b5a9291 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java 5340d3c ql/src/test/queries/clientpositive/auto_join33.q 5c85842 ql/src/test/queries/clientpositive/correlationoptimizer1.q 2adf855 ql/src/test/queries/clientpositive/correlationoptimizer3.q fcbb764 ql/src/test/queries/clientpositive/correlationoptimizer4.q 0e84cb7 ql/src/test/queries/clientpositive/correlationoptimizer5.q 1900f5d ql/src/test/queries/clientpositive/correlationoptimizer6.q 88d790c ql/src/test/queries/clientpositive/correlationoptimizer7.q 9b18972 ql/src/test/queries/clientpositive/multiMapJoin1.q 86b0586 ql/src/test/queries/clientpositive/multiMapJoin2.q PRE-CREATION ql/src/test/queries/clientpositive/union34.q a88e395 ql/src/test/results/clientpositive/auto_join0.q.out 652cb76 ql/src/test/results/clientpositive/auto_join10.q.out deb8eb5 ql/src/test/results/clientpositive/auto_join11.q.out 939f512 ql/src/test/results/clientpositive/auto_join12.q.out 23ed0fc ql/src/test/results/clientpositive/auto_join13.q.out 7e0f41d ql/src/test/results/clientpositive/auto_join15.q.out aa40cff ql/src/test/results/clientpositive/auto_join16.q.out e8f1435 ql/src/test/results/clientpositive/auto_join2.q.out a11f347 ql/src/test/results/clientpositive/auto_join20.q.out 13722ec 
ql/src/test/results/clientpositive/auto_join21.q.out 79693fe ql/src/test/results/clientpositive/auto_join22.q.out 6f418db ql/src/test/results/clientpositive/auto_join23.q.out 2755ee1 ql/src/test/results/clientpositive/auto_join24.q.out c7e872e ql/src/test/results/clientpositive/auto_join26.q.out 7268755 ql/src/test/results/clientpositive/auto_join28.q.out 407303c ql/src/test/results/clientpositive/auto_join29.q.out dec2187 ql/src/test/results/clientpositive/auto_join32.q.out 312664a ql/src/test/results/clientpositive/auto_join33.q.out 8fc0e84 ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out da375f6 ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 42e25fa ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 2ec3cf3 ql/src/test/results/clientpositive/auto_sortmerge_join_9.q.out 6add99a ql/src/test/results/clientpositive/correlationoptimizer1.q.out db3bd78 ql/src/test/results/clientpositive/correlationoptimizer3.q.out cfa7eff ql/src/test/results/clientpositive/correlationoptimizer4.q.out 285a54f ql/src/test/results/clientpositive/correlationoptimizer6.q.out b0438e6 ql/src/test/results/clientpositive/correlationoptimizer7.q.out f8db2bf ql/src/test/results/clientpositive/join28.q.out 60165e2 ql/src/test/results/clientpositive/join32.q.out af37f54 ql/src/test/results/clientpositive/join33.q.out af37f54 ql/src/test/results/clientpositive/join_star.q.out 797b892 ql/src/test/results/clientpositive/mapjoin_filter_on_outerjoin.q.out ca21c6c ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 2f5f613 ql/src/test/results/clientpositive/mapjoin_subquery.q.out 8243c2c ql/src/test/results/clientpositive/mapjoin_subquery2.q.out 292abe4 ql/src/test/results/clientpositive/mapjoin_test_outer.q.out 37817d9 ql/src/test/results/clientpositive/multiMapJoin1.q.out a3f5c53 ql/src/test/results/clientpositive/multiMapJoin2.q.out PRE-CREATION ql/src/test/results/clientpositive/multi_join_union.q.out 5182bdf 
ql/src/test/results/clientpositive/union34.q.out 166062a Diff: https://reviews.apache.org/r/12795/diff/ Testing --- Running tests. Thanks, Yin Huai
[jira] [Updated] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4827: --- Status: Patch Available (was: Open) Merge a Map-only job to its following MapReduce job with multiple inputs Key: HIVE-4827 URL: https://issues.apache.org/jira/browse/HIVE-4827 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.12.0 Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4827.1.patch, HIVE-4827.2.patch, HIVE-4827.3.patch, HIVE-4827.4.patch When hive.optimize.mapjoin.mapreduce is on, CommonJoinResolver can attach a Map-only job (MapJoin) to its following MapReduce job. But this merge only happens when the MapReduce job has a single input. With Correlation Optimizer (HIVE-2206), it is possible that the MapReduce job can have multiple inputs (for multiple operation paths). It is desired to improve CommonJoinResolver to merge a Map-only job to the corresponding Map task of the MapReduce job. Example: {code:sql} set hive.optimize.correlation=true; set hive.auto.convert.join=true; set hive.optimize.mapjoin.mapreduce=true; SELECT tmp1.key, count(*) FROM (SELECT x1.key1 AS key FROM bigTable1 x1 JOIN smallTable1 y1 ON (x1.key1 = y1.key1) GROUP BY x1.key1) tmp1 JOIN (SELECT x2.key2 AS key FROM bigTable2 x2 JOIN smallTable2 y2 ON (x2.key2 = y2.key2) GROUP BY x2.key2) tmp2 ON (tmp1.key = tmp2.key) GROUP BY tmp1.key; {code} In this query, join operations inside tmp1 and tmp2 will be converted to two MapJoins. With Correlation Optimizer, aggregations in tmp1, tmp2, and join of tmp1 and tmp2, and the last aggregation will be executed in the same MapReduce job (Reduce side). Since this MapReduce job has two inputs, right now, CommonJoinResolver cannot attach two MapJoins to the Map side of a MapReduce job.
Another example: {code:sql} SELECT tmp1.key FROM (SELECT x1.key2 AS key FROM bigTable1 x1 JOIN smallTable1 y1 ON (x1.key1 = y1.key1) UNION ALL SELECT x2.key2 AS key FROM bigTable2 x2 JOIN smallTable2 y2 ON (x2.key1 = y2.key1)) tmp1 {code} For this case, we will have three Map-only jobs (two for MapJoins and one for Union). It will be good to use a single Map-only job to execute this query. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-4827: --- Attachment: HIVE-4827.4.patch
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721060#comment-13721060 ] Phabricator commented on HIVE-3926: --- hagleitn has accepted the revision HIVE-3926 [jira] PPD on virtual column of partitioned table is not working. LGTM REVISION DETAIL https://reviews.facebook.net/D8121 BRANCH HIVE-3926 ARCANIST PROJECT hive To: JIRA, hagleitn, navis Cc: hagleitn PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE > 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE > 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4827) Merge a Map-only job to its following MapReduce job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721062#comment-13721062 ] Yin Huai commented on HIVE-4827: Updated patch has been uploaded to RB. Mark it as PA to trigger precommit tests.
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721070#comment-13721070 ] Gunther Hagleitner commented on HIVE-4470: -- Admin flag sounds good to me too. {quote} By performance penalty, do you mean the increased latency because of MR job launching? {quote} It's much worse than that. There's no way right now to run the local stage of a map join anywhere but on the client machine, which is the HS2 machine in this case. So, you could either disable map joins altogether for HS2 through the admin flag (which means really expensive shuffle joins for everything), or do the work to be able to run the hash table gen in the cluster, which makes this ticket really huge. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the hive server2 node if you have heavy query processing happening on it. Bugs in udfs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if the user tries to set them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
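The mitigation discussed in this issue would amount to pinning the relevant setting on the HS2 host. A minimal hive-site.xml sketch: hive.exec.mode.local.auto is Hive's real configuration property, but applying it this way is an assumption about the deployment, not something the ticket prescribes (and, as noted in a later comment, users can still override HiveConf unless the property is restricted).

```xml
<!-- Sketch: turn off automatic local-mode execution for queries
     submitted through this HiveServer2 instance. -->
<property>
  <name>hive.exec.mode.local.auto</name>
  <value>false</value>
</property>
```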
[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721075#comment-13721075 ] Gunther Hagleitner commented on HIVE-4343: -- I think we should move forward with this and consider HIVE-4470 as orthogonal. People might still want to run local work on the HS2. I agree that this is potentially dangerous and probably not a good default, but on HIVE-4470 the recommendation is to have an admin flag for on/off. IMO, this ticket should still go in. HS2 with kerberos - local task for map join fails Key: HIVE-4343 URL: https://issues.apache.org/jira/browse/HIVE-4343 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4343.1.patch With hive server2 configured with kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4892) PTest2 cleanup after merge
[ https://issues.apache.org/jira/browse/HIVE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721107#comment-13721107 ] Brock Noland commented on HIVE-4892: Hey, This patch contains some deletes and therefore left some empty files when it was applied. We should execute an addendum commit: {noformat} svn rm ./testutils/ptest2/src/test/resources/test-outputs/TEST-SomeTest-truncated.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-skewjoin.q-ab8536a7-1b5c-45ed-ba29-14450f27db8b-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-union_remove_9.q-acb9de8f-1b9c-4874-924c-b2107ca7b07c-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-skewjoin_union_remove_1.q-6fa31776-d2b0-4e13-9761-11f750627ad1-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/test-outputs/TEST-index_auth.q-bucketcontex-ba31fb54-1d7f-4c70-a89d-477b7d155191-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml ./testutils/ptest2/src/test/resources/TEST-SomeTest-failure.xml {noformat} PTest2 cleanup after merge -- Key: HIVE-4892 URL: https://issues.apache.org/jira/browse/HIVE-4892 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4892.patch, HIVE-4892.patch HIVE-4675 was merged but there are still a few minor issues we need to clean up: * README is out of date * Need to limit the number of failed source directories we copy back from the slaves * when looking for TEST-*.xml files we look at both the log directory (good) and the failed source directories (bad), therefore duplicating failures in the jenkins report * We need to process bad hosts in the finally block of PTest.run (HIVE-4882) * Need a mechanism to clean the ivy and maven cache (HIVE-4882) * PTest2 fails to publish a comment to a JIRA sometimes (HIVE-4889) * Now that PTest2 is committed to the 
source tree it's copying in our TEST-SomeTest*.xml files Test Properties: NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-305) Port Hadoop streaming's counters/status reporters to Hive Transforms
[ https://issues.apache.org/jira/browse/HIVE-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721110#comment-13721110 ] Brock Noland commented on HIVE-305: --- +1 Port Hadoop streaming's counters/status reporters to Hive Transforms Key: HIVE-305 URL: https://issues.apache.org/jira/browse/HIVE-305 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Venky Iyer Assignee: Guo Hongjie Attachments: HIVE-305.1.patch, HIVE-305.2.patch, hive-305.3.diff.txt, HIVE-305.patch.txt https://issues.apache.org/jira/browse/HADOOP-1328 Introduced a way for a streaming process to update global counters and status using stderr stream to emit information. Use reporter:counter:group,counter,amount to update a counter. Use reporter:status:message to update status. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
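The stderr protocol described in this issue is easy to emit from any transform script. A minimal Python sketch, assuming the Hadoop-streaming-style line formats quoted above (the helper names are illustrative):

```python
import sys

def emit_counter(group, counter, amount, out=sys.stderr):
    # Matches the documented form: reporter:counter:group,counter,amount
    out.write(f"reporter:counter:{group},{counter},{amount}\n")

def emit_status(message, out=sys.stderr):
    # Matches the documented form: reporter:status:message
    out.write(f"reporter:status:{message}\n")

# A transform script would call these while streaming rows, e.g.:
# emit_counter("MyTransform", "bad_records", 1)
# emit_status("processing input")
```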
[jira] [Commented] (HIVE-2137) JDBC driver doesn't encode string properly.
[ https://issues.apache.org/jira/browse/HIVE-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721104#comment-13721104 ] Hive QA commented on HIVE-2137: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12501499/HIVE-2137.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/194/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/194/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. JDBC driver doesn't encode string properly. --- Key: HIVE-2137 URL: https://issues.apache.org/jira/browse/HIVE-2137 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.9.0 Reporter: Jin Adachi Fix For: 0.12.0 Attachments: HIVE-2137.patch The JDBC driver for HiveServer1 decodes strings with the client-side default encoding, which depends on the operating system unless another encoding is specified. It ignores the server-side encoding. For example, when the server-side operating system and encoding are Linux (UTF-8) and the client-side operating system and encoding are Windows (Shift-JIS, a Japanese charset), character corruption happens in the client. In the current implementation of Hive, UTF-8 appears to be expected on the server side, so the client side should encode/decode strings as UTF-8. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
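The corruption mode described in this issue is easy to reproduce outside Hive. A small Python demonstration (the sample text is illustrative):

```python
# Server side writes UTF-8 bytes; a client that decodes with its OS default
# (e.g. Shift-JIS on Japanese Windows) gets mojibake.
text = "データ"                      # "data" in Japanese
wire = text.encode("utf-8")          # what the server actually sends
garbled = wire.decode("shift_jis", errors="replace")  # client's wrong default
assert garbled != text               # character corruption in the client
fixed = wire.decode("utf-8")         # the fix: always decode as UTF-8
assert fixed == text
```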
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721117#comment-13721117 ] Edward Capriolo commented on HIVE-4470: --- Adding an option is nice, but I do not see how it is enforceable since HiveConf can be changed by the user.
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721125#comment-13721125 ] Jeff Wu commented on HIVE-1545: --- Trying to compile these and load the jar, but the package com.facebook.hive.udf.tests isn't included. Can someone attach that? Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: core.tar.gz, ext.tar.gz, UDFEndsWith.java, UDFFindInString.java, UDFLtrim.java, UDFRtrim.java, udfs.tar.gz, udfs.tar.gz, UDFStartsWith.java, UDFTrim.java Here are some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...) will return the smallest i such that x > b_{i} but <= b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. 
UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
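As an illustration of the semantics documented above, here is a hypothetical Python model of FIND_IN_ARRAY, with None standing in for SQL NULL; this sketches the described behavior, not the actual Java implementation:

```python
def find_in_array(needle, arr):
    # 1-based index of the first match; 0 if absent; NULL-propagating.
    if needle is None or arr is None:
        return None
    for i, value in enumerate(arr, start=1):
        if value == needle:
            return i
    return 0

find_in_array(5, [1, 2, 5])  # → 3, as in the example above
find_in_array(5, [1, 2, 3])  # → 0
```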
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Status: Patch Available (was: Open) Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any actual evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Attachment: HIVE-4928.1.patch.txt Patch changes the parsing of the date literal so the DATELITERAL contains the date string value. This makes it more consistent with the ASTNodes generated for the other type literals.
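The bug and fix described for HIVE-4928 can be illustrated with a toy AST. The node class below is hypothetical; only the token names come from the issue text:

```python
class ASTNode:
    def __init__(self, text, children=()):
        self.text = text            # token name or literal text
        self.children = list(children)

def partition_value(node):
    # Before the patch, the token name itself ("TOK_DATELITERAL") was taken
    # as the partition value. The described fix is to use the literal's
    # string value instead.
    if node.text == "TOK_DATELITERAL":
        return node.children[0].text.strip("'")
    return node.text

date_lit = ASTNode("TOK_DATELITERAL", [ASTNode("'1999-12-31'")])
partition_value(date_lit)       # "1999-12-31" rather than "TOK_DATELITERAL"
partition_value(ASTNode("99"))  # string/numeric literals already worked
```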
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721143#comment-13721143 ] Jonathan Chang commented on HIVE-1545: -- I think they should be migrated to use the equivalent facilities in the PDK?
[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs
[ https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721144#comment-13721144 ] Jonathan Chang commented on HIVE-1545: -- For the time being you can remove those packages (and the corresponding annotations) without affecting the functionality. Add a bunch of UDFs and UDAFs - Key: HIVE-1545 URL: https://issues.apache.org/jira/browse/HIVE-1545 Project: Hive Issue Type: New Feature Components: UDF Reporter: Jonathan Chang Assignee: Jonathan Chang Priority: Minor Attachments: core.tar.gz, ext.tar.gz, UDFEndsWith.java, UDFFindInString.java, UDFLtrim.java, UDFRtrim.java, udfs.tar.gz, udfs.tar.gz, UDFStartsWith.java, UDFTrim.java Here some UD(A)Fs which can be incorporated into the Hive distribution: UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 5, 3) returns 1. UDFBucket - Find the bucket in which the first argument belongs. e.g., BUCKET(x, b_1, b_2, b_3, ...), will return the smallest i such that x b_{i} but = b_{i+1}. Returns 0 if x is smaller than all the buckets. UDFFindInArray - Finds the 1-index of the first element in the array given as the second argument. Returns 0 if not found. Returns NULL if either argument is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, array(1,2,3)) will return 0. UDFGreatCircleDist - Finds the great circle distance (in km) between two lat/long coordinates (in degrees). UDFLDA - Performs LDA inference on a vector given fixed topics. UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 whenever any of its parameters changes. UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 5. UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches in an array. UDFUnescape - Returns the string unescaped (using C/Java style unescaping). UDFWhich - Given a boolean array, return the indices which are TRUE. 
UDFJaccard UDAFCollect - Takes all the values associated with a row and converts it into a list. Make sure to have: set hive.map.aggr = false; UDAFCollectMap - Like collect except that it takes tuples and generates a map. UDAFEntropy - Compute the entropy of a column. UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two columns. UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value of VAL. UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated with the N (passed as the third parameter) largest values of VAL. UDAFHistogram -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
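The intended semantics of a few of the scalar UDFs above are easy to sketch in plain Java. The helper names below are hypothetical, independent of the actual UDF classes attached to the ticket; only the described behavior is taken from the list:

```java
// Sketch of the described semantics for ARGMAX, BUCKET, and FIND_IN_ARRAY.
// Method names are hypothetical, not the Hive UDF classes from HIVE-1545.
public class UdfSemantics {
    // ARGMAX(4, 5, 3) -> 1: the 0-indexed position of the largest argument.
    public static int argMax(double... args) {
        int best = 0;
        for (int i = 1; i < args.length; i++) {
            if (args[i] > args[best]) best = i;
        }
        return best;
    }

    // BUCKET(x, b_1, ..., b_n): smallest i with x > b_i but x <= b_{i+1};
    // returns 0 when x is smaller than all bucket boundaries.
    public static int bucket(double x, double... bounds) {
        int i = 0;
        while (i < bounds.length && x > bounds[i]) i++;
        return i;
    }

    // FIND_IN_ARRAY(5, [1,2,5]) -> 3: 1-indexed position; 0 when absent.
    public static int findInArray(int needle, int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == needle) return i + 1;
        }
        return 0;
    }
}
```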
[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4388: --- Attachment: HIVE-4388-wip.txt Attached is a WIP patch not meant for commit. 450KB of the 550KB is the generated protocol buffers class required for the co-processor in HCat. I verified TestHBaseCliDriver with both hadoop1 and hadoop2, therefore I think we are in a decent spot with regards to 0.96 compatibility. Note that for the hadoop2 build I had to hack together the classpath, as the upstream hadoop2 snapshot is slightly out of date. From here I need to clean the patch up quite a bit (I'm not an ant/ivy expert, so I was hacking away a little) and then do a lot more testing. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721167#comment-13721167 ] Gunther Hagleitner commented on HIVE-4470: -- [~appodictic] Can you explain what you mean by that some more? You mean an admin can set defaults, but we can't make sure someone submitting a query doesn't overwrite it? HiveConf only exists on the server in this case, as does the rest of the planning/submission code. Why wouldn't we be able to limit the user in what they can do? HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721171#comment-13721171 ] Gunther Hagleitner commented on HIVE-3926: -- +1 This looks good. Planning to commit tomorrow. PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, HIVE-3926.D8121.5.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE < 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE < 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721179#comment-13721179 ] Edward Capriolo commented on HIVE-4470: --- In HiveThrift1 I can do: {code} client.execute("SET hive.security.authorization.enabled=false"); client.execute("SELECT * FROM StuffIamNotSupposedtoSee"); {code} Is there some mechanism in hive thrift2 that prevents set commands? HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2702: -- Attachment: HIVE-2702.D11847.1.patch sershe requested code review of HIVE-2702 [jira] listPartitionsByFilter only supports string partitions for equals. Reviewers: JIRA Rebase on top of HIVE-4929. It should still compile/pass server tests, but won't work properly before HIVE-4929 listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D11847 AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? 
https://reviews.facebook.net/herald/transcript/28167/ To: JIRA, sershe listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
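The direction of the fix is to let integral partition-key types through the filter in addition to strings. A simplified stand-in for the ExpressionTree check could look like the following; the exact set of accepted type names is an assumption, not taken from the committed patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified sketch: which partition-key types a JDO filter could accept
// once integral types are supported alongside strings. The type-name set
// here is illustrative, not the committed HIVE-2702 logic.
public class PartitionFilterTypes {
    private static final Set<String> FILTERABLE =
        new HashSet<>(Arrays.asList("string", "tinyint", "smallint", "int", "bigint"));

    public static boolean isFilterable(String colType) {
        return FILTERABLE.contains(colType);
    }

    public static void checkFilterable(String colType) throws Exception {
        if (!isFilterable(colType)) {
            throw new Exception(
                "Filtering is supported only on partition keys of string or integral types");
        }
    }
}
```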
[jira] [Updated] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource
[ https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4586: - Status: Open (was: Patch Available) Applying the patch results in a significant number of checkstyle failures. [HCatalog] WebHCat should return 404 error for undefined resource - Key: HIVE-4586 URL: https://issues.apache.org/jira/browse/HIVE-4586 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4586-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721192#comment-13721192 ] Gunther Hagleitner commented on HIVE-4470: -- That is an awesome attack. :-) I thought there's already some blacklist for certain vars in HiveConf for this case. I'm hoping security enabled/disabled is in that list. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721198#comment-13721198 ] Ashutosh Chauhan commented on HIVE-2702: Do you want to improve the description of the ticket ... something like "Enhance listPartitionsByFilter to add support for integral types both for equality and non-equality" listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4055) add Date data type
[ https://issues.apache.org/jira/browse/HIVE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4055: - Attachment: HIVE-4055.4.patch Re-upload HIVE-4055.4.patch (without .txt suffix), to get automated tests to run. add Date data type -- Key: HIVE-4055 URL: https://issues.apache.org/jira/browse/HIVE-4055 Project: Hive Issue Type: Sub-task Components: JDBC, Query Processor, Serializers/Deserializers, UDF Reporter: Sun Rui Assignee: Jason Dere Attachments: Date.pdf, HIVE-4055.1.patch.txt, HIVE-4055.2.patch.txt, HIVE-4055.3.patch.txt, HIVE-4055.4.patch, HIVE-4055.4.patch.txt, HIVE-4055.D11547.1.patch Add Date data type, a new primitive data type which supports the standard SQL date type. Basically, the implementation can take HIVE-2272 and HIVE-2957 as references. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721208#comment-13721208 ] Thejas M Nair commented on HIVE-4470: - bq. I thought there's already some black list for certain vars in HiveConf for this case. I'm hoping security enabled/disabled is in that list. Yes, you can configure that using hive.conf.restricted.list . But it is empty by default. [~appodictic] That is something that needs to go in the default restricted list! HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
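In the meantime, an administrator could pre-seed the restricted list in hive-site.xml so that neither the authorization switch nor the local-execution toggle (nor the list itself) can be overridden per-session with SET. The exact value below is illustrative:

```xml
<!-- Illustrative hive-site.xml fragment: keys named in
     hive.conf.restricted.list cannot be changed at runtime with SET.
     Note the list also protects itself from being overridden. -->
<property>
  <name>hive.conf.restricted.list</name>
  <value>hive.security.authorization.enabled,hive.exec.mode.local.auto,hive.conf.restricted.list</value>
</property>
```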
[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721209#comment-13721209 ] Phabricator commented on HIVE-2702: --- ashutoshc has accepted the revision HIVE-2702 [jira] listPartitionsByFilter only supports string partitions for equals. +1 Looks good. A few minor nits. INLINE COMMENTS metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g:160 Do you want to name this IntegralLiteral now? metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java:312 Do you instead want to say in this TODO that this will be dealt with in HIVE-4888 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:2188 Do you want to say for the 2nd TODO that this will be dealt with in HIVE-4888 REVISION DETAIL https://reviews.facebook.net/D11847 BRANCH HIVE-2702-2 ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.D11847.1.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java. // Can only support partitions whose types are string if( ! table.getPartitionKeys().get(partitionColumnIndex). getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) { throw new MetaException ("Filtering is supported only on partition keys of type string"); } -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4942) Fix eclipse template files to use correct datanucleus libs
[ https://issues.apache.org/jira/browse/HIVE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4942: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) It would be really nice if someone picks up HIVE-2739. In the meanwhile thanks Yin for the quick fix. Fix eclipse template files to use correct datanucleus libs -- Key: HIVE-4942 URL: https://issues.apache.org/jira/browse/HIVE-4942 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4942.txt HIVE-3632 did not update the eclipse template files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721215#comment-13721215 ] Hive QA commented on HIVE-4299: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594323/HIVE-4299.1.patch.txt {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/196/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/196/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata for the table is exported so that the dropped table can be imported again. However, the exported metadata is named 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, the user has to rename the metadata file to import the table. h2. 
How to reproduce Set the following setting in hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721222#comment-13721222 ] Ashutosh Chauhan commented on HIVE-4825: One more comment. Sorry missed that one earlier. Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4299) exported metadata by HIVE-3068 cannot be imported because of wrong file name
[ https://issues.apache.org/jira/browse/HIVE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721235#comment-13721235 ] Ashutosh Chauhan commented on HIVE-4299: I guess you have tested it manually on a cluster, so skipping unit tests should be alright. Although, I think it will be better to do: + public static final String METADATA_NAME = ".metadata"; instead of + public static final String METADATA_NAME = "_metadata"; because some folks may have already exported the data, which cannot be imported with your change but can be if we choose the former instead. exported metadata by HIVE-3068 cannot be imported because of wrong file name Key: HIVE-4299 URL: https://issues.apache.org/jira/browse/HIVE-4299 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Sho Shimauchi Assignee: Sho Shimauchi Attachments: HIVE-4299.1.patch.txt, HIVE-4299.patch h2. Symptom When DROP TABLE is run on a table, metadata for the table is exported so that the dropped table can be imported again. However, the exported metadata is named 'table name.metadata'. Since ImportSemanticAnalyzer allows only '_metadata' as the metadata filename, the user has to rename the metadata file to import the table. h2. How to reproduce Set the following setting in hive-site.xml: {code} <property> <name>hive.metastore.pre.event.listeners</name> <value>org.apache.hadoop.hive.ql.parse.MetaDataExportListener</value> </property> {code} Then run the following queries: {code} CREATE TABLE test_table (id INT, name STRING); DROP TABLE test_table; IMPORT TABLE test_table_imported FROM '/path/to/metadata/file'; FAILED: SemanticException [Error 10027]: Invalid path {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
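Ashutosh's suggestion amounts to matching on a suffix rather than an exact filename, so that both pre-existing 'tablename.metadata' exports and a plain '_metadata' file would import cleanly. A hypothetical version of that check (not ImportSemanticAnalyzer's actual code):

```java
// Hypothetical helper illustrating the suggested suffix-based match for
// HIVE-4299; the real ImportSemanticAnalyzer logic may differ.
public class MetadataName {
    public static final String METADATA_SUFFIX = "metadata";

    // Accepts "_metadata" (current importer convention) as well as
    // "tablename.metadata" (what MetaDataExportListener writes).
    public static boolean isMetadataFile(String fileName) {
        return fileName.equals("_" + METADATA_SUFFIX)
            || fileName.endsWith("." + METADATA_SUFFIX);
    }
}
```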
[jira] [Commented] (HIVE-4470) HS2 should disable local query execution
[ https://issues.apache.org/jira/browse/HIVE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721246#comment-13721246 ] Gunther Hagleitner commented on HIVE-4470: -- Hehe. Yeah, btw: hive.conf.restricted.list should probably also be in the restricted list. HS2 should disable local query execution Key: HIVE-4470 URL: https://issues.apache.org/jira/browse/HIVE-4470 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Hive can run queries in local mode (instead of using a cluster) if the size is small. This happens when hive.exec.mode.local.auto is set to true. This would affect the stability of the HiveServer2 node if you have heavy query processing happening on it. Bugs in UDFs triggered by a bad record can potentially add very heavy load, making the server inaccessible. By default, HS2 should set these parameters to disallow local execution, or send an error message if a user tries to set these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721250#comment-13721250 ] Ashutosh Chauhan commented on HIVE-4885: [~appodictic] How did your tests go? Did you play with XStream? If we pick a binary (de)serializer, one option is to detect whether we are running in test mode and, if so, serialize using the existing serialization mechanism instead, thus preserving the existing test infra built for doing plan validations. Alternative object serialization for execution plan in hive testing Key: HIVE-4885 URL: https://issues.apache.org/jira/browse/HIVE-4885 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4885.patch Currently there are a lot of test cases that involve comparing execution plans, such as those in the TestParse suite. XmlEncoder is used to serialize the plan generated by Hive and store it in a file for diff comparison. However, XmlEncoder is tied to the Java compiler, whose implementation may change from version to version. Thus, upgrading the compiler can generate a lot of spurious test failures. 
The following is an example of a diff generated when running Hive with JDK7: {code} Begin query: case_sensitivity.q diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml 3c3 < <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask0"> --- > <object id="MapRedTask0" class="org.apache.hadoop.hive.ql.exec.MapRedTask"> 12c12 < <object class="java.util.ArrayList" id="ArrayList0"> --- > <object id="ArrayList0" class="java.util.ArrayList"> 14c14 < <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask0"> --- > <object id="MoveTask0" class="org.apache.hadoop.hive.ql.exec.MoveTask"> 18c18 < <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask1"> --- > <object id="MoveTask1" class="org.apache.hadoop.hive.ql.exec.MoveTask"> 22c22 < <object class="org.apache.hadoop.hive.ql.exec.StatsTask" id="StatsTask0"> --- > <object id="StatsTask0" class="org.apache.hadoop.hive.ql.exec.StatsTask"> 60c60 < <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask1"> --- > <object id="MapRedTask1" class="org.apache.hadoop.hive.ql.exec.MapRedTask"> {code} As can be seen, the only difference is the order of the attributes in the serialized XML doc, yet it brings 50+ test failures in Hive. We need a better plan comparison, or object serialization, to improve the situation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
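Since the diff differs only in attribute order, one workaround would be to compare the serialized plans structurally rather than textually, e.g. by parsing each tag's attributes into a sorted map before comparing. A minimal sketch of the idea (the helper class is hypothetical, not part of Hive's test infrastructure):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: two open tags with the same element name are considered
// equivalent if their attribute key/value maps match, regardless of the
// order in which the attributes were serialized. (The element-name check
// itself is omitted for brevity.)
public class TagCompare {
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Parse attr="value" pairs into a sorted map, discarding order.
    public static Map<String, String> attributes(String tag) {
        Map<String, String> attrs = new TreeMap<>();
        Matcher m = ATTR.matcher(tag);
        while (m.find()) {
            attrs.put(m.group(1), m.group(2));
        }
        return attrs;
    }

    public static boolean sameTag(String a, String b) {
        return attributes(a).equals(attributes(b));
    }
}
```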
[jira] [Commented] (HIVE-4899) Hive returns non-meaningful error message for ill-formed fs.default.name
[ https://issues.apache.org/jira/browse/HIVE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721260#comment-13721260 ] Ashutosh Chauhan commented on HIVE-4899: +1 Hive returns non-meaningful error message for ill-formed fs.default.name - Key: HIVE-4899 URL: https://issues.apache.org/jira/browse/HIVE-4899 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4899.patch For the query in test case fs_default_name1.q: {code} set fs.default.name='http://www.example.com; show tables; {code} The following error message is returned: {code} FAILED: IllegalArgumentException null {code} The message is not very meaningful, and has null in it. It would be better if we could provide a detailed error message. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4944) Hive Windows Scripts and Compatibility changes
Sushanth Sowmyan created HIVE-4944: -- Summary: Hive Windows Scripts and Compatibility changes Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4944) Hive Windows Scripts and Compatibility changes
[ https://issues.apache.org/jira/browse/HIVE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721262#comment-13721262 ] Sushanth Sowmyan commented on HIVE-4944: Attaching 2 patches - one that is an umbrella patch of compatibility changes to enable hive building, testing and packaging under windows, and another that is a patch of scripts for installation, packaging and running. Hive Windows Scripts and Compatibility changes -- Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: compat.patch, packaging.patch Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4944) Hive Windows Scripts and Compatibility changes
[ https://issues.apache.org/jira/browse/HIVE-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-4944: --- Attachment: packaging.patch compat.patch Hive Windows Scripts and Compatibility changes -- Key: HIVE-4944 URL: https://issues.apache.org/jira/browse/HIVE-4944 Project: Hive Issue Type: Bug Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: compat.patch, packaging.patch Porting patches that enable hive packaging and running under windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4885) Alternative object serialization for execution plan in hive testing
[ https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721269#comment-13721269 ] Edward Capriolo commented on HIVE-4885: --- It turns out xstream is already used in hcatalog somewhere. I did write some code to use it. There is one Java-based unit test that passed, but I did not have time for a performance evaluation or to run the full test suite. I will probably do it over the next two days. Alternative object serialization for execution plan in hive testing Key: HIVE-4885 URL: https://issues.apache.org/jira/browse/HIVE-4885 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.12.0 Attachments: HIVE-4885.patch Currently there are a lot of test cases that involve comparing execution plans, such as those in the TestParse suite. XmlEncoder is used to serialize the plan generated by hive and store it in a file for diff comparison. However, XmlEncoder is tied to the Java compiler, whose implementation may change from version to version. Thus, upgrading the compiler can generate a lot of spurious test failures. 
The following is an example of a diff generated when running hive with JDK7: {code} Begin query: case_sensitivity.q diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml 3c3 object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask0 --- object id=MapRedTask0 class=org.apache.hadoop.hive.ql.exec.MapRedTask 12c12 object class=java.util.ArrayList id=ArrayList0 --- object id=ArrayList0 class=java.util.ArrayList 14c14 object class=org.apache.hadoop.hive.ql.exec.MoveTask id=MoveTask0 --- object id=MoveTask0 class=org.apache.hadoop.hive.ql.exec.MoveTask 18c18 object class=org.apache.hadoop.hive.ql.exec.MoveTask id=MoveTask1 --- object id=MoveTask1 class=org.apache.hadoop.hive.ql.exec.MoveTask 22c22 object class=org.apache.hadoop.hive.ql.exec.StatsTask id=StatsTask0 --- object id=StatsTask0 class=org.apache.hadoop.hive.ql.exec.StatsTask 60c60 object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask1 --- object id=MapRedTask1 class=org.apache.hadoop.hive.ql.exec.MapRedTask {code} As can be seen, the only difference is the order of the attributes in the serialized XML doc, yet it causes 50+ test failures in Hive. We need better plan comparison, or better object serialization, to improve the situation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
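A diff that differs only in attribute order can be detected mechanically before it is allowed to fail a test. The sketch below is a minimal illustration in plain Java; the `attrs` helper and its flattened `key=value` tag format match the diff output quoted above, and are assumptions for illustration, not Hive's actual test harness:

```java
import java.util.Map;
import java.util.TreeMap;

public class AttrOrderDiff {
    // Parse a flattened tag like "object class=Foo id=Bar" into a sorted
    // attribute map, so two serializations of the same object that differ
    // only in attribute order compare equal.
    static Map<String, String> attrs(String tag) {
        Map<String, String> m = new TreeMap<>();
        String[] parts = tag.trim().split("\\s+");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the element name
            String[] kv = parts[i].split("=", 2);
            m.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return m;
    }

    public static void main(String[] args) {
        String a = "object class=org.apache.hadoop.hive.ql.exec.MapRedTask id=MapRedTask0";
        String b = "object id=MapRedTask0 class=org.apache.hadoop.hive.ql.exec.MapRedTask";
        System.out.println(attrs(a).equals(attrs(b))); // prints "true"
    }
}
```

An order-insensitive comparison like this would mask the JDK7 attribute reordering, though switching the serializer (e.g. to xstream, as discussed above) avoids the problem at the source.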
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Attachment: HIVE-4823.1-vectorization.patch implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Affects Version/s: vectorization-branch implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Fix Version/s: vectorization-branch Status: Patch Available (was: In Progress) implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4823: -- Fix Version/s: (was: vectorization-branch) implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4551) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration
[ https://issues.apache.org/jira/browse/HIVE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721332#comment-13721332 ] Sushanth Sowmyan commented on HIVE-4551: Seems to succeed for me: -- checkstyle: [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 421 files BUILD SUCCESSFUL Total time: 1 minute 33 seconds -- Could you post up what error checkstyle brings up on your end? HCatLoader smallint/tinyint promotions to Int have issues with ORC integration -- Key: HIVE-4551 URL: https://issues.apache.org/jira/browse/HIVE-4551 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: 4551.patch This was initially reported from an e2e test run, with the following E2E test: {code} { 'name' = 'Hadoop_ORC_Write', 'tests' = [ { 'num' = 1 ,'hcat_prep'=q\ drop table if exists hadoop_orc; create table hadoop_orc ( t tinyint, si smallint, i int, b bigint, f float, d double, s string) stored as orc;\ ,'hadoop' = q\ jar :FUNCPATH:/testudf.jar org.apache.hcatalog.utils.WriteText -libjars :HCAT_JAR: :THRIFTSERVER: all100k hadoop_orc\, ,'result_table' = 'hadoop_orc' ,'sql' = q\select * from all100k;\ ,'floatpostprocess' = 1 ,'delimiter' = ' ' }, ], }, {code} This fails with the following error: {code} 2013-04-26 00:26:07,437 WARN org.apache.hadoop.mapred.Child: Error running child org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:53) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at 
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1195) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.getPrimitiveJavaObject(WritableIntObjectInspector.java:45) at org.apache.hcatalog.data.HCatRecordSerDe.serializePrimitiveField(HCatRecordSerDe.java:290) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:192) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63) ... 12 more 2013-04-26 00:26:07,440 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
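The ClassCastException above arises because the reader hands back the physical tinyint value while the promoted schema expects an int. A minimal sketch of the widening step that is needed, using plain Java boxed types as stand-ins for the ByteWritable/IntWritable classes in the stack trace (the `asInt` helper is hypothetical, not HCatalog's actual API):

```java
// Hedged sketch: when the table schema says int but ORC hands back a
// tinyint/smallint value, widen it before treating it as an int instead of
// blindly casting. Plain boxed types stand in for the Writable classes.
public class PromoteSketch {
    static int asInt(Object v) {
        if (v instanceof Byte)    return ((Byte) v).intValue();   // tinyint
        if (v instanceof Short)   return ((Short) v).intValue();  // smallint
        if (v instanceof Integer) return (Integer) v;
        throw new ClassCastException(v.getClass() + " cannot be promoted to int");
    }

    public static void main(String[] args) {
        System.out.println(asInt((byte) 7)); // prints "7" instead of throwing
    }
}
```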
[jira] [Commented] (HIVE-4823) implement vectorized TRIM(), LTRIM(), RTRIM()
[ https://issues.apache.org/jira/browse/HIVE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721337#comment-13721337 ] Eric Hanson commented on HIVE-4823: --- Depends on concat patch (HIVE-4512). implement vectorized TRIM(), LTRIM(), RTRIM() - Key: HIVE-4823 URL: https://issues.apache.org/jira/browse/HIVE-4823 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4823.1-vectorization.patch Make it work end-to-end, including the vectorized expression, and tying it together in VectorizationContext so a SQL query will run using vectorization when invoking these functions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
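For context on what a vectorized TRIM does end-to-end: rather than calling String.trim() row by row through an object interface, the vectorized expression adjusts per-row start/length offsets over a whole column in one tight loop. A simplified sketch, with plain arrays standing in for Hive's BytesColumnVector fields (this is an illustration, not the attached patch):

```java
// Hedged sketch of vectorized TRIM: bytes[i] holds row i's data, and
// start[i]/length[i] delimit the string inside it. Trimming just moves the
// offsets; no per-row object allocation or copying is needed.
public class VectorTrimSketch {
    static void trim(byte[][] bytes, int[] start, int[] length, int n) {
        for (int i = 0; i < n; i++) {
            while (length[i] > 0 && bytes[i][start[i]] == ' ') { start[i]++; length[i]--; }
            while (length[i] > 0 && bytes[i][start[i] + length[i] - 1] == ' ') { length[i]--; }
        }
    }

    public static void main(String[] args) {
        byte[][] b = { "  hi  ".getBytes() };
        int[] s = { 0 }, l = { 6 };
        trim(b, s, l, 1);
        System.out.println(new String(b[0], s[0], l[0])); // prints "hi"
    }
}
```

Tying such an expression into VectorizationContext, as the issue describes, is what lets a SQL query invoking TRIM()/LTRIM()/RTRIM() actually run down the vectorized path.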
[jira] [Commented] (HIVE-4929) the type of all numeric constants is changed to double in the plan
[ https://issues.apache.org/jira/browse/HIVE-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721342#comment-13721342 ] Hive QA commented on HIVE-4929: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594427/HIVE-4929.patch {color:green}SUCCESS:{color} +1 2653 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/198/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/198/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. the type of all numeric constants is changed to double in the plan -- Key: HIVE-4929 URL: https://issues.apache.org/jira/browse/HIVE-4929 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-4929.patch There's code which, after the numeric type for a constant in where clause has been chosen as the most restricted one or based on suffix, tries to change the type to match the numeric column which the constant is being compared with. However, due to a hack from HIVE-3059 every column type shows up as string in that code, causing it to always change the constant type to double. This should not be done (regardless of the hack). Spinoff from HIVE-2702, large number of query outputs change so it will be a big patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
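A sketch of the literal-typing rule the report describes: the constant keeps its most restrictive type (or the type its suffix mandates) and should not be rewritten to double merely because it is compared against a column. The logic below is a simplified stand-in for illustration, not Hive's actual type-checking code:

```java
// Hedged sketch: choose a numeric literal's type from its own form.
// Suffix letters and the "most restrictive type" default are assumptions
// modeled on the issue description, not the real implementation.
public class LiteralTypeSketch {
    static String literalType(String tok) {
        if (tok.endsWith("L") || tok.endsWith("l")) return "bigint";   // suffix wins
        if (tok.endsWith("S") || tok.endsWith("s")) return "smallint";
        if (tok.endsWith("Y") || tok.endsWith("y")) return "tinyint";
        if (tok.contains(".") || tok.contains("E") || tok.contains("e")) return "double";
        return "int"; // most restrictive default for an un-suffixed integer
    }

    public static void main(String[] args) {
        System.out.println(literalType("100"));  // prints "int", not "double"
        System.out.println(literalType("100L")); // prints "bigint"
    }
}
```

The bug is that a later step overrode this choice with double for every comparison, because the HIVE-3059 hack made every column type look like string.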
[jira] [Created] (HIVE-4945) Make RLIKE/REGEXP run end-to-end by updating VectorizationContext
Eric Hanson created HIVE-4945: - Summary: Make RLIKE/REGEXP run end-to-end by updating VectorizationContext Key: HIVE-4945 URL: https://issues.apache.org/jira/browse/HIVE-4945 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4551) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration
[ https://issues.apache.org/jira/browse/HIVE-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4551: - Status: Patch Available (was: Open) HCatLoader smallint/tinyint promotions to Int have issues with ORC integration -- Key: HIVE-4551 URL: https://issues.apache.org/jira/browse/HIVE-4551 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: 4551.patch This was initially reported from an e2e test run, with the following E2E test: {code} { 'name' = 'Hadoop_ORC_Write', 'tests' = [ { 'num' = 1 ,'hcat_prep'=q\ drop table if exists hadoop_orc; create table hadoop_orc ( t tinyint, si smallint, i int, b bigint, f float, d double, s string) stored as orc;\ ,'hadoop' = q\ jar :FUNCPATH:/testudf.jar org.apache.hcatalog.utils.WriteText -libjars :HCAT_JAR: :THRIFTSERVER: all100k hadoop_orc\, ,'result_table' = 'hadoop_orc' ,'sql' = q\select * from all100k;\ ,'floatpostprocess' = 1 ,'delimiter' = ' ' }, ], }, {code} This fails with the following error: {code} 2013-04-26 00:26:07,437 WARN org.apache.hadoop.mapred.Child: Error running child org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:53) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1195) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.getPrimitiveJavaObject(WritableIntObjectInspector.java:45) at org.apache.hcatalog.data.HCatRecordSerDe.serializePrimitiveField(HCatRecordSerDe.java:290) at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:192) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53) at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97) at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203) at org.apache.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:63) ... 12 more 2013-04-26 00:26:07,440 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Discuss] project chop up
My mistake on saying hcat was a fork of the metastore. I had a brain fart for a moment. One way we could do this is to create a folder called downstream. In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us break up ql as well. Things like exotic file formats, and things that are pluggable like zk locking, can go here. That might be overkill. For now we can focus on building downstream, and hive thrift 1 might be the first thing to try moving downstream.
On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to the rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as a set of libraries that let pig and java MR use the hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in the same project.
On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in.
On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well.
On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual-core 2 GB RAM laptop for years now. 
With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in Eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive.
1) Hive HBase This should really be its own project; to do this right we really have to have multiple branches, since hbase is not backwards compatible.
2) Hive Web Interface Not really a big project, but not really critical; it can just as easily be built separately.
3) hive thrift 1 We have hive thrift 2 now; it is time for the sun to set on hive thrift 1.
4) odbc Not entirely convinced about this one, but it is really not critical to running hive.
What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think?
I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. I see a couple of alternatives for this:
* Take the module in question to the Apache Incubator
* Move the module in question to the Apache Extras
* Break up the projects within our own source tree
I'd prefer the third option at this point. Brock
[jira] [Commented] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721351#comment-13721351 ] Hive QA commented on HIVE-4928: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12594443/HIVE-4928.1.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/200/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/200/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-200/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1507508. At revision 1507508. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource
[ https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4586: - Status: Patch Available (was: Open) Turns out the checkstyle problem is separate. Returning this to patch available. [HCatalog] WebHCat should return 404 error for undefined resource - Key: HIVE-4586 URL: https://issues.apache.org/jira/browse/HIVE-4586 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4586-1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4928) Date literals do not work properly in partition spec clause
[ https://issues.apache.org/jira/browse/HIVE-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4928: - Description: The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. NO PRECOMMIT TESTS was: The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. Date literals do not work properly in partition spec clause --- Key: HIVE-4928 URL: https://issues.apache.org/jira/browse/HIVE-4928 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-4928.1.patch.txt The partition spec parsing doesn't do any real evaluation of the values in the partition spec, instead just taking the text value of the ASTNode representing the partition value. 
This works fine for string/numeric literals (expression tree below): (TOK_PARTVAL region 99) But not for Date literals, which are of the form DATE 'yyyy-mm-dd' (expression tree below): (TOK_DATELITERAL '1999-12-31') In this case the parser/analyzer uses TOK_DATELITERAL as the partition column value, when it should really get the value of the child of the DATELITERAL token. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
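The fix the description points at can be sketched as: when the partition value node is a DATE literal, descend to its child token and strip the quotes, instead of using the token name itself. The Node class and method names below are simplified stand-ins for Hive's ASTNode API, not the attached patch:

```java
// Hedged sketch of extracting a partition value from a parsed partition spec.
// For (TOK_DATELITERAL '1999-12-31') the value is the child's unquoted text;
// for plain string/numeric literals the node text is already the value.
import java.util.Collections;
import java.util.List;

public class PartValSketch {
    static final String TOK_DATELITERAL = "TOK_DATELITERAL";

    static class Node {
        final String text;
        final List<Node> children;
        Node(String text, List<Node> children) { this.text = text; this.children = children; }
    }

    static String unquote(String s) {
        if (s.length() >= 2 && (s.startsWith("'") || s.startsWith("\""))) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    static String partitionValue(Node n) {
        if (TOK_DATELITERAL.equals(n.text)) {
            return unquote(n.children.get(0).text); // descend past the token name
        }
        return n.text;
    }

    public static void main(String[] args) {
        Node lit = new Node(TOK_DATELITERAL,
                List.of(new Node("'1999-12-31'", Collections.emptyList())));
        System.out.println(partitionValue(lit)); // prints "1999-12-31"
    }
}
```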