[jira] [Updated] (HIVE-4382) Fix offline build mode
[ https://issues.apache.org/jira/browse/HIVE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4382: - Attachment: HIVE-4382.1.patch Work in progress. Doesn't work with hcatalog yet (patch disables it) Fix offline build mode -- Key: HIVE-4382 URL: https://issues.apache.org/jira/browse/HIVE-4382 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Attachments: HIVE-4382.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4200) Consolidate submodule dependencies using ivy inheritance
[ https://issues.apache.org/jira/browse/HIVE-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4200: - Attachment: HIVE-4200.3.patch Consolidate submodule dependencies using ivy inheritance Key: HIVE-4200 URL: https://issues.apache.org/jira/browse/HIVE-4200 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4200.1.patch.txt, HIVE-4200.2.patch, HIVE-4200.3.patch As discussed in HIVE-4187: For easier maintenance of ivy dependencies across submodules, create a parent ivy file with consolidated dependencies and include it in the submodules via inheritance. This way we're not relying on transitive dependencies, and we also have the dependencies in a single place.
[jira] [Updated] (HIVE-4095) Add exchange partition in Hive
[ https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4095: - Status: Open (was: Patch Available) Add exchange partition in Hive -- Key: HIVE-4095 URL: https://issues.apache.org/jira/browse/HIVE-4095 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Dheeraj Kumar Singh Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt
[jira] [Commented] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog
[ https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636231#comment-13636231 ] Carl Steinbach commented on HIVE-4266: -- bq. We cannot make this kind of backwards incompatible change for users. Users will not see this as "here, run this script against your source tree." They'll see it as they have to go modify, re-test, and re-deploy every application. Aren't these same users going to have to re-test and re-deploy every application when they bump the version number of their hcatalog dependency? bq. We should not make this a blocker for 0.11. I'm 90% of the way through the patch, but it will take a fair amount of testing when I'm done to assure that it works with both org.apache.hcatalog and org.apache.hive.hcatalog. I'm convinced that if we don't do this now it's never going to happen, which is why I think one of the exit criteria for 0.11.0 needs to be either a) providing wrappers and a clearly stated EOL timeline for the org.apache.hcatalog namespace, or b) changing the package names only. Refactor HCatalog code to org.apache.hive.hcatalog -- Key: HIVE-4266 URL: https://issues.apache.org/jira/browse/HIVE-4266 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Currently HCatalog code is in packages org.apache.hcatalog. It now needs to move to org.apache.hive.hcatalog. Shell classes/interfaces need to be created for public-facing classes so that users' code does not break.
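The "shell classes" idea from the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names NewRecord and OldRecord are hypothetical and do not come from the HCatalog code base, and nested classes stand in for the org.apache.hive.hcatalog and org.apache.hcatalog packages so that the example fits in a single file.

```java
// Hypothetical sketch: nested classes stand in for the two packages.
// None of these class names exist in HCatalog.
public class ShellClassSketch {

    /** Stands in for a class that has moved to org.apache.hive.hcatalog. */
    static class NewRecord {
        private final String value;

        NewRecord(String value) {
            this.value = value;
        }

        String describe() {
            return "record:" + value;
        }
    }

    /**
     * Stands in for the shell left behind under org.apache.hcatalog.
     * It adds no behavior; it only keeps the old name compiling.
     */
    @Deprecated
    static class OldRecord extends NewRecord {
        OldRecord(String value) {
            super(value);
        }
    }

    /** Existing user code that still refers to the old name. */
    public static String legacyCaller() {
        NewRecord r = new OldRecord("a");
        return r.describe();
    }

    public static void main(String[] args) {
        System.out.println(legacyCaller());
    }
}
```

With a shell like this, callers keep compiling against the old name and get a deprecation warning, which is one way to pair wrappers with a stated EOL timeline as option (a) in the comment suggests.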
[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions
[ https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636294#comment-13636294 ] Phabricator commented on HIVE-3509: --- njain has commented on the revision HIVE-3509 [jira] Exclusive locks are not acquired when using dynamic partitions. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java:144 This is an incompatible change, and may break many existing apps. For example, in FB we log the query along with inputs and outputs, and this will leave the burden on the client to change / to @ appropriately. Although it is not ideal, let us stick with the format db@table@partns, where partns is of the form partitionCol1/partitionCol2. REVISION DETAIL https://reviews.facebook.net/D10065 To: JIRA, MattMartin Cc: njain Exclusive locks are not acquired when using dynamic partitions -- Key: HIVE-3509 URL: https://issues.apache.org/jira/browse/HIVE-3509 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.9.0 Reporter: Matt Martin Assignee: Matt Martin Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch If locking is enabled, the acquireReadWriteLocks() method in org.apache.hadoop.hive.ql.Driver iterates through all of the input and output entities of the query plan and attempts to acquire the appropriate locks. In general, it should acquire SHARED locks for all of the input entities and EXCLUSIVE locks for all of the output entities (see the Hive wiki page on [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more detailed information). When the query involves dynamic partitions, the situation is a little more subtle. As the Hive wiki notes (see previous link): {quote} in some cases, the list of objects may not be known - for eg. in case of dynamic partitions, the list of partitions being modified is not known at compile time - so, the list is generated conservatively. Since the number of partitions may not be known, an exclusive lock is taken on the table, or the prefix that is known. {quote} After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the observed behavior is no longer consistent with the behavior described above. [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have altered the logic so that SHARED locks are acquired instead of EXCLUSIVE locks whenever the query involves dynamic partitions.
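The expected behavior described in the issue (SHARED for inputs, EXCLUSIVE for outputs, and for dynamic partitions an EXCLUSIVE lock on the table or on the known prefix) could be sketched like this. It is a simplified illustration, not the logic in org.apache.hadoop.hive.ql.Driver: the class LockPlanSketch, its methods, and the slash-separated path strings are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the lock modes the issue says should be chosen.
public class LockPlanSketch {
    enum LockMode { SHARED, EXCLUSIVE }

    /**
     * Path to lock for a dynamic-partition write: the table plus whatever
     * static partition prefix is known at compile time (possibly none).
     */
    public static String conservativeOutputPath(String table, String knownPrefix) {
        return knownPrefix.isEmpty() ? table : table + "/" + knownPrefix;
    }

    /** Returns "MODE path" entries: SHARED for each input, EXCLUSIVE for the output. */
    public static List<String> plan(List<String> inputs, String outTable, String knownPrefix) {
        List<String> locks = new ArrayList<>();
        for (String in : inputs) {
            locks.add(LockMode.SHARED + " " + in);
        }
        locks.add(LockMode.EXCLUSIVE + " " + conservativeOutputPath(outTable, knownPrefix));
        return locks;
    }

    public static void main(String[] args) {
        // Assumed scenario: the ds partition column is static, a second column is dynamic.
        System.out.println(plan(Arrays.asList("default/src"), "default/dst", "ds=2013-04-18"));
    }
}
```

The bug report amounts to saying that, after HIVE-1781, the last entry comes out as SHARED rather than EXCLUSIVE when dynamic partitions are involved.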
[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions
[ https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636297#comment-13636297 ] Namit Jain commented on HIVE-3509: -- comments Exclusive locks are not acquired when using dynamic partitions -- Key: HIVE-3509 URL: https://issues.apache.org/jira/browse/HIVE-3509 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.9.0 Reporter: Matt Martin Assignee: Matt Martin Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch
[jira] [Commented] (HIVE-4095) Add exchange partition in Hive
[ https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636376#comment-13636376 ] Dheeraj Kumar Singh commented on HIVE-4095: --- @Namit: Did you patch both the files here? Add exchange partition in Hive -- Key: HIVE-4095 URL: https://issues.apache.org/jira/browse/HIVE-4095 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Dheeraj Kumar Singh Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt
Re: Review Request: HIVE-4356 - remove duplicate impersonation parameters for hiveserver2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10554/#review19455 --- Ship it! +1 - Ashutosh Chauhan On April 16, 2013, 9:46 p.m., Thejas Nair wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10554/ --- (Updated April 16, 2013, 9:46 p.m.) Review request for hive. Description --- remove duplicate impersonation parameters for hiveserver2 This addresses bug HIVE-4356. https://issues.apache.org/jira/browse/HIVE-4356 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 78d9cc9 conf/hive-default.xml.template e266ce7 service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java 18d4aae service/src/java/org/apache/hive/service/cli/CLIService.java b53599b service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 43d79aa service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java PRE-CREATION service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java PRE-CREATION Diff: https://reviews.apache.org/r/10554/diff/ Testing --- Unit tests included. Manually tested on (kerberos) secure and unsecure cluster. Thanks, Thejas Nair
[jira] [Commented] (HIVE-4356) remove duplicate impersonation parameters for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636430#comment-13636430 ] Ashutosh Chauhan commented on HIVE-4356: +1 remove duplicate impersonation parameters for hiveserver2 - Key: HIVE-4356 URL: https://issues.apache.org/jira/browse/HIVE-4356 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.11.0 Attachments: HIVE-4356.1.patch There are two parameters controlling impersonation in hiveserver2: hive.server2.enable.impersonation controls this in kerberos secure mode, while hive.server2.enable.doAs controls this for unsecure mode. We should have just one for both modes.
[jira] [Commented] (HIVE-4106) SMB joins fail in multi-way joins
[ https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636443#comment-13636443 ] Ashutosh Chauhan commented on HIVE-4106: Isn't HIVE-4371 a proper fix for it? Does the test case still fail after applying HIVE-4371? SMB joins fail in multi-way joins - Key: HIVE-4106 URL: https://issues.apache.org/jira/browse/HIVE-4106 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Namit Jain Priority: Blocker Attachments: auto_sortmerge_join_12.q, hive.4106.1.patch, hive.4106.2.patch, HIVE-4106.patch I see an array out of bounds exception in the case of multi-way SMB joins. This is related to changes that went in as part of HIVE-3403. This issue has been discussed in HIVE-3891.
[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636446#comment-13636446 ] Ashutosh Chauhan commented on HIVE-4333: [~rhbutani] Can you create a Phabricator entry for this? Since it's a huge patch, it's hard to read the diff file. most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt Problem is different order of results on hadoop 2
[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636456#comment-13636456 ] Ashutosh Chauhan commented on HIVE-4333: bq. There are diffs because of precision. Some of the avg and sum functions are now wrapped in 'round' I didn't get this part. All this computation is within Hive, so it shouldn't be affected by the hadoop version. Wrapped in 'round'? In Hive or Hadoop? bq. Looks like the shuffle in 2.0 reorders the rows even in this case. Yeah, that's possible. Since in over() the partitioning is by a constant, all rows have the same value for the partitioning column, so they can arrive in any order. We need to come up with a clever way of writing tests which still test over() but give ordered results for both hadoop 1 and hadoop 2. most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt Problem is different order of results on hadoop 2
[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4333: --- Affects Version/s: 0.11.0 most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt Problem is different order of results on hadoop 2
[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4333: --- Component/s: PTF-Windowing most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.11.0 Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt Problem is different order of results on hadoop 2
[jira] [Updated] (HIVE-4304) Remove unused builtins and pdk submodules
[ https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Crawford updated HIVE-4304: -- Attachment: HIVE-4304.patch Remove unused builtins and pdk submodules - Key: HIVE-4304 URL: https://issues.apache.org/jira/browse/HIVE-4304 Project: Hive Issue Type: Improvement Reporter: Travis Crawford Assignee: Travis Crawford Attachments: HIVE-4304.1.patch, HIVE-4304.patch Moving from email. The [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not believed to be in use and should be removed. The main benefits are simplification and maintainability of the Hive code base. Forwarded conversation Subject: builtins submodule - is it still needed? From: Travis Crawford traviscrawf...@gmail.com Date: Thu, Apr 4, 2013 at 2:01 PM To: u...@hive.apache.org, dev@hive.apache.org Hey hive gurus - Is the builtins hive submodule in use? The submodule was added in HIVE-2523 as a location for builtin-UDFs, but it appears to not have taken off. Any objections to removing it? DETAILS For HIVE-4278 I'm making some build changes for the HCatalog integration. The builtins submodule causes issues because it delays building until the packaging phase - so HCatalog can't depend on builtins, which it does transitively. While investigating a path forward I discovered the builtins submodule contains very little code, and likely could either go away entirely or merge into ql, simplifying things both for users and developers. Thoughts? Can anyone with context help me understand builtins, both in general and around its non-standard build? For your trouble I'll either make the submodule go away/merge into another submodule, or update the docs with what we learn. Thanks! Travis -- From: Ashutosh Chauhan ashutosh.chau...@gmail.com Date: Fri, Apr 5, 2013 at 3:10 PM To: dev@hive.apache.org Cc: u...@hive.apache.org u...@hive.apache.org I haven't used it myself anytime till now. Neither have I met anyone who has used it or plans to use it. Ashutosh On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford traviscrawf...@gmail.com wrote: -- From: Gunther Hagleitner ghagleit...@hortonworks.com Date: Fri, Apr 5, 2013 at 3:11 PM To: dev@hive.apache.org Cc: u...@hive.apache.org +1 I would actually go a step further and propose to remove both PDK and builtins. I've gone through the code for both and here is what I found: Builtins: - BuiltInUtils.java: Empty file - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was intended as a building block for PDK PDK: - some helper build.xml/test setup + teardown scripts - Classes/annotations to help run unit tests - rot13 as an example From what I can tell it's a fair assessment that it hasn't taken off; the last commits to it seem to have happened more than 1.5 years ago. Thanks, Gunther. On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford traviscrawf...@gmail.com wrote: -- From: Owen O'Malley omal...@apache.org Date: Fri, Apr 5, 2013 at 4:45 PM To: u...@hive.apache.org +1 to removing them. We have a Rot13 example in ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java anyways. *smile* -- Owen
[jira] [Updated] (HIVE-4304) Remove unused builtins and pdk submodules
[ https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Travis Crawford updated HIVE-4304: -- Status: Patch Available (was: Open) Remove unused builtins and pdk submodules - Key: HIVE-4304 URL: https://issues.apache.org/jira/browse/HIVE-4304 Project: Hive Issue Type: Improvement Reporter: Travis Crawford Assignee: Travis Crawford Attachments: HIVE-4304.1.patch, HIVE-4304.patch
[jira] [Commented] (HIVE-4095) Add exchange partition in Hive
[ https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636497#comment-13636497 ] Dheeraj Kumar Singh commented on HIVE-4095: --- [~namit]: The revision 10035 does not include the thrift generated changes. Phabricator won't allow me to upload the thrift generated changes as they are quite large. I've included these in the patch HIVE-4095.part12.patch.txt uploaded here. Add exchange partition in Hive -- Key: HIVE-4095 URL: https://issues.apache.org/jira/browse/HIVE-4095 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Dheeraj Kumar Singh Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt
[jira] [Updated] (HIVE-4178) ORC fails with files with different numbers of columns
[ https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4178: Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) I just committed this to trunk and branch-11. Thanks, Kevin! ORC fails with files with different numbers of columns -- Key: HIVE-4178 URL: https://issues.apache.org/jira/browse/HIVE-4178 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4178.1.patch.txt When CombineHiveInputFormat is used, it's possible that two files with different numbers of columns can be included in the same split, in which case Hive will fail at one of several points with an ArrayIndexOutOfBoundsException. This can happen when a partition contains empty files or when two partitions are read with different numbers of columns.
[jira] [Commented] (HIVE-4305) Use a single system for dependency resolution
[ https://issues.apache.org/jira/browse/HIVE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636534#comment-13636534 ] Owen O'Malley commented on HIVE-4305: - Carl, rather than debate it theoretically or compare it to Hadoop, which has a *LOT* more complexity in its build, I propose that we have Travis make a Maven build file for the combined Hive and HCat systems. Then we can debate the value and issues in the particular patch and how to move the project forward. The current state is painful, with extremely long builds. We need to move forward and enable the project to evolve quickly so that Hive can compete with its many commercial competitors. Use a single system for dependency resolution - Key: HIVE-4305 URL: https://issues.apache.org/jira/browse/HIVE-4305 Project: Hive Issue Type: Improvement Components: Build Infrastructure, HCatalog Reporter: Travis Crawford Assignee: Carl Steinbach Both Hive and HCatalog use ant as their build tool. However, Hive uses ivy for dependency resolution while HCatalog uses maven-ant-tasks. With the project merge we should converge on a single tool for dependency resolution.
[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4333: -- Attachment: HIVE-4333.D10389.1.patch hbutani requested code review of HIVE-4333 [jira] most windowing tests fail on hadoop 2. Reviewers: JIRA, ashutoshc fix tests for hadoop 2 Problem is different order of results on hadoop 2 TEST PLAN change existing tests REVISION DETAIL https://reviews.facebook.net/D10389 AFFECTED FILES data/files/flights_tiny.txt data/files/part.rc data/files/part.seq ql/src/test/queries/clientpositive/leadlag.q ql/src/test/queries/clientpositive/ptf.q ql/src/test/queries/clientpositive/ptf_general_queries.q ql/src/test/queries/clientpositive/windowing.q ql/src/test/queries/clientpositive/windowing_expressions.q ql/src/test/queries/clientpositive/windowing_multipartitioning.q ql/src/test/queries/clientpositive/windowing_navfn.q ql/src/test/queries/clientpositive/windowing_ntile.q ql/src/test/queries/clientpositive/windowing_rank.q ql/src/test/queries/clientpositive/windowing_udaf.q ql/src/test/queries/clientpositive/windowing_windowspec.q ql/src/test/results/clientpositive/leadlag.q.out ql/src/test/results/clientpositive/ptf.q.out ql/src/test/results/clientpositive/ptf_general_queries.q.out ql/src/test/results/clientpositive/windowing.q.out ql/src/test/results/clientpositive/windowing_expressions.q.out ql/src/test/results/clientpositive/windowing_multipartitioning.q.out ql/src/test/results/clientpositive/windowing_navfn.q.out ql/src/test/results/clientpositive/windowing_ntile.q.out ql/src/test/results/clientpositive/windowing_rank.q.out ql/src/test/results/clientpositive/windowing_udaf.q.out ql/src/test/results/clientpositive/windowing_windowspec.q.out MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? 
https://reviews.facebook.net/herald/transcript/24867/ To: JIRA, ashutoshc, hbutani most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.11.0 Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch Problem is different order of results on hadoop 2
[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636548#comment-13636548 ] Harish Butani commented on HIVE-4333: - I think the diffs due to precision are caused by the same ordering issue. Since the rows in the partitions are not in the same order, there are differences in the overall sum/avg beyond 2 decimal places. most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.11.0 Reporter: Gunther Hagleitner Assignee: Matthew Weaver Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch Problem is different order of results on hadoop 2
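The point about row order changing sum/avg results follows from floating-point addition not being associative. A minimal sketch, independent of Hive (the class SumOrderSketch and the helper round2 are made up for this illustration, and round2 only approximates what wrapping an aggregate in 'round' would do):

```java
// Hypothetical sketch: the same values summed in two different orders.
public class SumOrderSketch {

    /** Sums values left to right, the way a running aggregate would. */
    public static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Rounds to two decimal places. */
    public static double round2(double x) {
        return Math.round(x * 100.0) / 100.0;
    }

    public static void main(String[] args) {
        double[] oneOrder = {0.1, 0.2, 0.3};
        double[] otherOrder = {0.3, 0.2, 0.1};

        // The raw double sums differ in the last bits.
        System.out.println(sum(oneOrder) == sum(otherOrder));
        // After rounding to two decimals the results agree.
        System.out.println(round2(sum(oneOrder)) == round2(sum(otherOrder)));
    }
}
```

This prints false and then true, which is consistent with the explanation above: rounding hides differences that come purely from the order in which rows reach the aggregate.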
[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions
[ https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636551#comment-13636551 ] Phabricator commented on HIVE-3509: --- MattMartin has commented on the revision HIVE-3509 [jira] Exclusive locks are not acquired when using dynamic partitions. For the record, I'm planning to roll back the major change in my last revision which acquires and releases the whole hierarchy of locks on explicit lock ... and unlock INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java:144 This change should only affect locks. In particular, this would make sure the lock paths are consistent for dummy partitions and non-dummy partitions. Without this change, I think a case could arise where a write query with dynamic partitions tries to acquire an exclusive lock on base dir in zookeeper/db@table@partns while a read query simultaneously tries to acquire a shared lock on base locking dir in zookeeper/db/table/partns. In this case the reader and writer would not block each other even though they should. I'll try to add a test case to illustrate this point. REVISION DETAIL https://reviews.facebook.net/D10065 To: JIRA, MattMartin Cc: njain Exclusive locks are not acquired when using dynamic partitions -- Key: HIVE-3509 URL: https://issues.apache.org/jira/browse/HIVE-3509 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.9.0 Reporter: Matt Martin Assignee: Matt Martin Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch If locking is enabled, the acquireReadWriteLocks() method in org.apache.hadoop.hive.ql.Driver iterates through all of the input and output entities of the query plan and attempts to acquire the appropriate locks. 
In general, it should acquire SHARED locks for all of the input entities and exclusive locks for all of the output entities (see the Hive wiki page on [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more detailed information). When the query involves dynamic partitions, the situation is a little more subtle. As the Hive wiki notes (see previous link): {quote} in some cases, the list of objects may not be known - for eg. in case of dynamic partitions, the list of partitions being modified is not known at compile time - so, the list is generated conservatively. Since the number of partitions may not be known, an exclusive lock is taken on the table, or the prefix that is known. {quote} After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the observed behavior is no longer consistent with the behavior described above. [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have altered the logic so that SHARED locks are acquired instead of EXCLUSIVE locks whenever the query involves dynamic partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
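The path-consistency concern raised in the inline comment can be sketched with a toy lock table. This is a hypothetical model, not Hive's ZooKeeper lock manager: `ToyLockManager`, `lock_path`, and the separators are assumptions used only to show why a writer and a reader must derive the same path for the same object.

```python
SHARED, EXCLUSIVE = "SHARED", "EXCLUSIVE"

class ToyLockManager:
    """Hypothetical in-memory lock table keyed by path string."""

    def __init__(self):
        self._held = {}  # path -> list of lock modes currently held

    def try_acquire(self, path, mode):
        held = self._held.get(path, [])
        # An exclusive lock conflicts with anything already held.
        if mode == EXCLUSIVE and held:
            return False
        # A shared lock conflicts only with an exclusive holder.
        if mode == SHARED and EXCLUSIVE in held:
            return False
        self._held.setdefault(path, []).append(mode)
        return True

def lock_path(parts, sep="/"):
    """Build a lock path; the separator is the detail under discussion."""
    return sep.join(parts)

obj = ["db", "table", "partns"]

# Inconsistent paths: the writer and the reader never see each other.
mgr = ToyLockManager()
writer_ok = mgr.try_acquire(lock_path(obj, sep="@"), EXCLUSIVE)
reader_slipped_through = mgr.try_acquire(lock_path(obj), SHARED)

# Consistent paths: the reader is blocked by the writer's exclusive lock.
mgr = ToyLockManager()
mgr.try_acquire(lock_path(obj), EXCLUSIVE)
reader_blocked = not mgr.try_acquire(lock_path(obj), SHARED)

print(writer_ok, reader_slipped_through, reader_blocked)
```

The same model also shows why taking SHARED instead of EXCLUSIVE for a dynamic-partition write is unsafe: two shared holders never conflict, so concurrent writers would both proceed.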
[jira] [Commented] (HIVE-4304) Remove unused builtins and pdk submodules
[ https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636562#comment-13636562 ] Ashutosh Chauhan commented on HIVE-4304: +1 will commit if tests pass Remove unused builtins and pdk submodules - Key: HIVE-4304 URL: https://issues.apache.org/jira/browse/HIVE-4304 Project: Hive Issue Type: Improvement Reporter: Travis Crawford Assignee: Travis Crawford Attachments: HIVE-4304.1.patch, HIVE-4304.patch Moving from email. The [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not believed to be in use and should be removed. The main benefits are simplification and maintainability of the Hive code base. Forwarded conversation Subject: builtins submodule - is it still needed? From: Travis Crawford traviscrawf...@gmail.com Date: Thu, Apr 4, 2013 at 2:01 PM To: u...@hive.apache.org, dev@hive.apache.org Hey hive gurus - Is the builtins hive submodule in use? The submodule was added in HIVE-2523 as a location for builtin-UDFs, but it appears to not have taken off. Any objections to removing it? DETAILS For HIVE-4278 I'm making some build changes for the HCatalog integration. The builtins submodule causes issues because it delays building until the packaging phase - so HCatalog can't depend on builtins, which it does transitively. While investigating a path forward I discovered the builtins submodule contains very little code, and likely could either go away entirely or merge into ql, simplifying things both for users and developers. Thoughts? Can anyone with context help me understand builtins, both in general and around its non-standard build? For your trouble I'll either make the submodule go away/merge into another submodule, or update the docs with what we learn. Thanks! 
Travis -- From: Ashutosh Chauhan ashutosh.chau...@gmail.com Date: Fri, Apr 5, 2013 at 3:10 PM To: dev@hive.apache.org Cc: u...@hive.apache.org I haven't used it myself so far, nor have I met anyone who has used it or plans to use it. Ashutosh On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford traviscrawf...@gmail.com wrote: -- From: Gunther Hagleitner ghagleit...@hortonworks.com Date: Fri, Apr 5, 2013 at 3:11 PM To: dev@hive.apache.org Cc: u...@hive.apache.org +1 I would actually go a step further and propose to remove both PDK and builtins. I've gone through the code for both and here is what I found: Builtins: - BuiltInUtils.java: Empty file - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was intended as a building block for PDK PDK: - some helper build.xml/test setup + teardown scripts - Classes/annotations to help run unit tests - rot13 as an example From what I can tell it's a fair assessment that it hasn't taken off; the last commits to it seem to have happened more than 1.5 years ago. Thanks, Gunther. On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford traviscrawf...@gmail.com wrote: -- From: Owen O'Malley omal...@apache.org Date: Fri, Apr 5, 2013 at 4:45 PM To: u...@hive.apache.org +1 to removing them. We have a Rot13 example in ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java anyways. *smile* -- Owen
[jira] [Updated] (HIVE-4380) Implement Vectorized Scalar-Column expressions
[ https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4380: -- Status: Patch Available (was: Open) To be applied to vectorization branch Implement Vectorized Scalar-Column expressions -- Key: HIVE-4380 URL: https://issues.apache.org/jira/browse/HIVE-4380 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Eric Hanson The expressions with scalar as the first operand.
[jira] [Updated] (HIVE-4380) Implement Vectorized Scalar-Column expressions
[ https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4380: -- Attachment: HIVE-4380.1.patch Implement Vectorized Scalar-Column expressions -- Key: HIVE-4380 URL: https://issues.apache.org/jira/browse/HIVE-4380 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Eric Hanson Attachments: HIVE-4380.1.patch The expressions with scalar as the first operand.
[jira] [Commented] (HIVE-4340) ORC should provide raw data size
[ https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636677#comment-13636677 ] Kevin Wilfong commented on HIVE-4340: - https://reviews.facebook.net/D10179 ORC should provide raw data size Key: HIVE-4340 URL: https://issues.apache.org/jira/browse/HIVE-4340 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one. WriterImpl should compute a raw data size for each row, aggregate the sizes per stripe, record the total in the stripe information (as RC currently does in its key header), and give the FileSinkOperator access to the per-row size. FileSinkOperator should be able to get the raw data size from either the SerDe or the RecordWriter when the RecordWriter can provide it.
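The proposal above (size each row, aggregate per stripe, expose the per-row size to the caller) can be sketched as follows. This is a toy model, not ORC's WriterImpl: `ToyWriter`, `raw_size`, the byte counts per type, and the row-count stripe boundary are all assumptions chosen to keep the example short.

```python
def raw_size(value):
    """Hypothetical sizing rule: uncompressed bytes a value would occupy."""
    if value is None:
        return 0
    if isinstance(value, bool):
        return 1
    if isinstance(value, (int, float)):
        return 8
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    raise TypeError("unsupported type: %r" % type(value))

class ToyWriter:
    """Closes a stripe every `rows_per_stripe` rows (an assumption made
    for brevity; the boundary rule here is not ORC's)."""

    def __init__(self, rows_per_stripe):
        self.rows_per_stripe = rows_per_stripe
        self.stripe_raw_sizes = []   # one aggregated total per stripe
        self._current_size = 0
        self._current_rows = 0

    def add_row(self, row):
        size = sum(raw_size(v) for v in row)
        self._current_size += size
        self._current_rows += 1
        if self._current_rows == self.rows_per_stripe:
            self._flush_stripe()
        return size  # per-row size, available to the caller

    def _flush_stripe(self):
        self.stripe_raw_sizes.append(self._current_size)
        self._current_size = 0
        self._current_rows = 0

    def close(self):
        if self._current_rows:
            self._flush_stripe()

writer = ToyWriter(rows_per_stripe=2)
row_sizes = [writer.add_row(r) for r in [(1, "ab"), (2, None), (3, "xyz")]]
writer.close()
print(row_sizes, writer.stripe_raw_sizes, sum(writer.stripe_raw_sizes))
```

Returning the size from `add_row` stands in for the idea that a FileSinkOperator-like caller could read the figure from the writer when the SerDe does not supply it.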
[jira] [Updated] (HIVE-4340) ORC should provide raw data size
[ https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4340: Attachment: HIVE-4340.1.patch.txt ORC should provide raw data size Key: HIVE-4340 URL: https://issues.apache.org/jira/browse/HIVE-4340 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4340.1.patch.txt ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one. WriterImpl should compute a raw data size for each row, aggregate the sizes per stripe, record the total in the stripe information (as RC currently does in its key header), and give the FileSinkOperator access to the per-row size. FileSinkOperator should be able to get the raw data size from either the SerDe or the RecordWriter when the RecordWriter can provide it.
[jira] [Updated] (HIVE-4340) ORC should provide raw data size
[ https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4340: Status: Patch Available (was: Open) ORC should provide raw data size Key: HIVE-4340 URL: https://issues.apache.org/jira/browse/HIVE-4340 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4340.1.patch.txt ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one. WriterImpl should compute a raw data size for each row, aggregate the sizes per stripe, record the total in the stripe information (as RC currently does in its key header), and give the FileSinkOperator access to the per-row size. FileSinkOperator should be able to get the raw data size from either the SerDe or the RecordWriter when the RecordWriter can provide it.
[jira] [Commented] (HIVE-4380) Implement Vectorized Scalar-Column expressions
[ https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636678#comment-13636678 ] Eric Hanson commented on HIVE-4380: --- This patch depends on the patch for https://issues.apache.org/jira/browse/HIVE-4282. After that patch gets committed, I will create a ReviewBoard entry. Currently I can't do that because when I try, I get this error: The file 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java' (rd6f45d3) could not be found in the repository Implement Vectorized Scalar-Column expressions -- Key: HIVE-4380 URL: https://issues.apache.org/jira/browse/HIVE-4380 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Eric Hanson Attachments: HIVE-4380.1.patch The expressions with scalar as the first operand. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4383) Implement vectorized string column-scalar filters
Eric Hanson created HIVE-4383: - Summary: Implement vectorized string column-scalar filters Key: HIVE-4383 URL: https://issues.apache.org/jira/browse/HIVE-4383 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Create patch for implementing string columns compared with scalars as vectorized filters, and apply it to vectorization branch.
[jira] [Assigned] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-4383: - Assignee: Eric Hanson Implement vectorized string column-scalar filters - Key: HIVE-4383 URL: https://issues.apache.org/jira/browse/HIVE-4383 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Create patch for implementing string columns compared with scalars as vectorized filters, and apply it to vectorization branch.
[jira] [Created] (HIVE-4384) Implement vectorized string functions UPPER() and LOWER()
Eric Hanson created HIVE-4384: - Summary: Implement vectorized string functions UPPER() and LOWER() Key: HIVE-4384 URL: https://issues.apache.org/jira/browse/HIVE-4384 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson
[jira] [Created] (HIVE-4385) Implement vectorized LIKE filter
Eric Hanson created HIVE-4385: - Summary: Implement vectorized LIKE filter Key: HIVE-4385 URL: https://issues.apache.org/jira/browse/HIVE-4385 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson
[jira] [Created] (HIVE-4386) max() and min() return NULL on partition column; distinct() returns nothing
Robin Morris created HIVE-4386: -- Summary: max() and min() return NULL on partition column; distinct() returns nothing Key: HIVE-4386 URL: https://issues.apache.org/jira/browse/HIVE-4386 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.8.1 Reporter: Robin Morris partitioned_table is partitioned on year, month, day. select max(day) from partitioned_table where year=2013 and month=4; spins up zero mappers, one reducer, and returns NULL. Same for select min(day) from ... select distinct(day) from... returns nothing at all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4386) max() and min() return NULL on partition column; distinct() returns nothing
[ https://issues.apache.org/jira/browse/HIVE-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Morris updated HIVE-4386: --- Description: partitioned_table is partitioned on year, month, day. select max(day) from partitioned_table where year=2013 and month=4; spins up zero mappers, one reducer, and returns NULL. Same for select min(day) from ... select distinct(day) from... returns nothing at all. Using an explicit intermediate table does work: create table foo_max as select day from partitioned_table where year=2013 and month=4; select max(day) from foo_max; drop table foo_max; Several map-reduce jobs later, the correct answer is given. was: partitioned_table is partitioned on year, month, day. select max(day) from partitioned_table where year=2013 and month=4; spins up zero mappers, one reducer, and returns NULL. Same for select min(day) from ... select distinct(day) from... returns nothing at all. max() and min() return NULL on partition column; distinct() returns nothing --- Key: HIVE-4386 URL: https://issues.apache.org/jira/browse/HIVE-4386 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.8.1 Reporter: Robin Morris partitioned_table is partitioned on year, month, day. select max(day) from partitioned_table where year=2013 and month=4; spins up zero mappers, one reducer, and returns NULL. Same for select min(day) from ... select distinct(day) from... returns nothing at all. Using an explicit intermediate table does work: create table foo_max as select day from partitioned_table where year=2013 and month=4; select max(day) from foo_max; drop table foo_max; Several map-reduce jobs later, the correct answer is given. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2055) Hive HBase Integration issue
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636834#comment-13636834 ] Sushanth Sowmyan commented on HIVE-2055: This works for me. Non-binding +1. Hive HBase Integration issue Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: sajith v Attachments: HIVE-2055.patch Created an external table in Hive that points to an HBase table. Querying a column by name in the select clause raised the following exception: ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000)
[jira] [Updated] (HIVE-4282) Implement vectorized column-scalar expressions
[ https://issues.apache.org/jira/browse/HIVE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4282: --- Attachment: HIVE-4282.3.patch The binary files were included by accident and are removed in the latest patch, which is also uploaded to the review board. Implement vectorized column-scalar expressions -- Key: HIVE-4282 URL: https://issues.apache.org/jira/browse/HIVE-4282 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4282.1.patch, HIVE-4282.2.patch, HIVE-4282.3.patch Implement arithmetic expressions involving a column and a scalar with column as first argument.
[jira] [Commented] (HIVE-2055) Hive HBase Integration issue
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636889#comment-13636889 ] Sushanth Sowmyan commented on HIVE-2055: As Nick notes in HCATALOG-621, there might be more to this; I only tested DDL operations. That said, setting HIVE_AUX_JARS_PATH should work for this, right? Hive HBase Integration issue Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: sajith v Attachments: HIVE-2055.patch Created an external table in Hive that points to an HBase table. Querying a column by name in the select clause raised the following exception: ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000)
[jira] [Assigned] (HIVE-4333) most windowing tests fail on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Weaver reassigned HIVE-4333: Assignee: Harish Butani (was: Matthew Weaver) most windowing tests fail on hadoop 2 - Key: HIVE-4333 URL: https://issues.apache.org/jira/browse/HIVE-4333 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.11.0 Reporter: Gunther Hagleitner Assignee: Harish Butani Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch Problem is different order of results on hadoop 2
[jira] [Commented] (HIVE-4248) Implement a memory manager for ORC
[ https://issues.apache.org/jira/browse/HIVE-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636917#comment-13636917 ] Phabricator commented on HIVE-4248: --- ashutoshc has accepted the revision HIVE-4248 [jira] Implement a memory manager for ORC. +1 will commit if tests pass. REVISION DETAIL https://reviews.facebook.net/D9993 BRANCH h-4248 ARCANIST PROJECT hive To: JIRA, ashutoshc, omalley Cc: kevinwilfong Implement a memory manager for ORC -- Key: HIVE-4248 URL: https://issues.apache.org/jira/browse/HIVE-4248 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4248.D9993.1.patch, HIVE-4248.D9993.2.patch, HIVE-4248.D9993.4.patch With the large default stripe size (256MB) and dynamic partitions, it is quite easy for users to run out of memory when writing ORC files. We probably need a solution that keeps track of the total number of concurrent ORC writers and divides the available heap space between them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
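The idea in the description above (track the concurrent writers and divide the available heap between them) can be sketched as below. This is a minimal illustration under stated assumptions, not the patch under review: `ToyMemoryManager`, its method names, the proportional scaling rule, and the megabyte figures other than the 256MB stripe size are hypothetical.

```python
class ToyMemoryManager:
    """Hypothetical manager: each writer requests a stripe buffer size and
    receives a proportionally scaled allowance once the pool is oversubscribed."""

    def __init__(self, pool_mb):
        self.pool_mb = pool_mb
        self._requested = {}  # writer name -> requested buffer size in MB

    def add_writer(self, name, stripe_mb):
        self._requested[name] = stripe_mb

    def remove_writer(self, name):
        del self._requested[name]

    def allowance(self, name):
        total = sum(self._requested.values())
        requested = self._requested[name]
        if total <= self.pool_mb:
            return requested  # everything fits, no scaling needed
        # Integer arithmetic keeps the scaled share exact and rounds down.
        return requested * self.pool_mb // total

mgr = ToyMemoryManager(pool_mb=512)   # assumed pool size
mgr.add_writer("part=1", 256)         # 256MB default stripe size
mgr.add_writer("part=2", 256)
fits = mgr.allowance("part=1")

mgr.add_writer("part=3", 256)         # a third dynamic partition opens
scaled = mgr.allowance("part=1")

mgr.remove_writer("part=3")           # the writer closes, shares grow back
restored = mgr.allowance("part=1")
print(fits, scaled, restored)
```

With dynamic partitions the number of open writers is data dependent, which is why the allowance is recomputed from the current writer set rather than fixed at creation time.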
[jira] [Commented] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog
[ https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13636965#comment-13636965 ] Alan Gates commented on HIVE-4266: -- bq. Aren't these same users going to have to re-test and re-deploy every application when they bump the version number of their hcatalog dependency? In my experience when you install a new Hadoop, Hive, or other tool version it is usually placed on a test/dev cluster for a while and users are given a chance to run on it, and once that proves out it is promoted to the production cluster(s). There isn't usually a step to validate every application and obviously no need to re-deploy applications. In this scenario users can retest on their schedule and decide which applications are not crucial enough to warrant the effort. On the other hand if you tell users, This is not backward compatible, you have to rewrite all your programs and scripts you are forcing them to rewrite, retest, and re-deploy everything before they can deploy the new version. This puts a big barrier to uptake of the new version in their way. I am fine with setting a sunset for these shell classes. Two major releases (ie Hive 0.13 or 0.14 depending on which release they go out with). bq. I'm convinced that if we don't do this now it's never going to happen... Since Ashutosh is managing this release I'll defer to him, but I am concerned about stuffing something this large in at the last minute. I understand that in software deferred is a synonym for when hell freezes over, but I honestly have the patch mostly done. The unit tests are passing. I don't have the javadoc doing the right thing yet and I need to run the system tests against both org.apache.hcatalog and org.apache.hive.hcatalog, which I'm estimating will take me a few days assuming I find a few bugs. 
Refactor HCatalog code to org.apache.hive.hcatalog -- Key: HIVE-4266 URL: https://issues.apache.org/jira/browse/HIVE-4266 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Currently HCatalog code is in the org.apache.hcatalog packages. It now needs to move to org.apache.hive.hcatalog. Shell classes/interfaces need to be created for public-facing classes so that users' code does not break.
[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls
[ https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4189: Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) I just committed this to trunk and branch-0.11. Thanks, Kevin! ORC fails with String column that ends in lots of nulls --- Key: HIVE-4189 URL: https://issues.apache.org/jira/browse/HIVE-4189 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4189.1.patch.txt, HIVE-4189.2.patch.txt When ORC attempts to write out a string column that ends in enough nulls to span an index stride, StringTreeWriter's writeStripe method gets an exception from TreeWriter's writeStripe method: "Column has wrong number of index entries found: x expected: y". This is caused by rowIndexValueCount having multiple entries equal to the number of non-null rows in the column, combined with the fact that StringTreeWriter has special logic for constructing its index.
[jira] [Updated] (HIVE-4365) wrong result in left semi join
[ https://issues.apache.org/jira/browse/HIVE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4365: Status: Patch Available (was: Open) wrong result in left semi join -- Key: HIVE-4365 URL: https://issues.apache.org/jira/browse/HIVE-4365 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0, 0.9.0 Reporter: ransom.hezhiqiang Assignee: Navis Attachments: HIVE-4365.D10341.1.patch, HIVE-4365.D10341.2.patch A left semi join returns the wrong result when hive.optimize.ppd=true. For example: 1、create table: create table t1(c1 int,c2 int, c3 int, c4 int, c5 double,c6 int,c7 string) row format DELIMITED FIELDS TERMINATED BY '|'; create table t2(c1 int) ; 2、load data: load data local inpath '/home/test/t1.txt' OVERWRITE into table t1; load data local inpath '/home/test/t2.txt' OVERWRITE into table t2; t1 data: 1|3|10003|52|781.96|555|201203 1|3|10003|39|782.96|555|201203 1|3|10003|87|783.96|555|201203 2|5|10004|24|789.96|555|201203 2|5|10004|58|788.96|555|201203 t2 data: 555 3、execute query: select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2 on t1.c6 = t2.c1 and t1.c1 = '1' and t1.c7 = '201203' ; returns the expected rows. select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2 on t1.c6 = t2.c1 where t1.c1 = '1' and t1.c7 = '201203' ; returns no rows.
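The expected behavior of the two queries in the report can be checked with a small reference model. This is a plain-Python sketch of left semi join semantics, not Hive's operator code; the helper name `left_semi_join` is hypothetical, and c1 is compared as an integer here, whereas the report compares it with the string '1' and relies on Hive's implicit conversion.

```python
# Rows from the report: (c1, c2, c3, c4, c5, c6, c7)
t1 = [
    (1, 3, 10003, 52, 781.96, 555, "201203"),
    (1, 3, 10003, 39, 782.96, 555, "201203"),
    (1, 3, 10003, 87, 783.96, 555, "201203"),
    (2, 5, 10004, 24, 789.96, 555, "201203"),
    (2, 5, 10004, 58, 788.96, 555, "201203"),
]
t2 = [(555,)]

def left_semi_join(left, right, match):
    """Keep each left row that has at least one matching right row."""
    return [l for l in left if any(match(l, r) for r in right)]

# Query 1: the left-side filters sit in the ON clause.
q1 = left_semi_join(
    t1, t2,
    lambda l, r: l[5] == r[0] and l[0] == 1 and l[6] == "201203",
)

# Query 2: the same filters sit in the WHERE clause, applied after the join.
q2 = [
    l for l in left_semi_join(t1, t2, lambda l, r: l[5] == r[0])
    if l[0] == 1 and l[6] == "201203"
]

print(len(q1), len(q2), q1 == q2)
```

Because a semi join only keeps or drops left rows, a filter on left-side columns gives the same rows whether it is evaluated in ON or in WHERE, so both queries should return the three rows with c1 = 1.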
[jira] [Updated] (HIVE-4365) wrong result in left semi join
[ https://issues.apache.org/jira/browse/HIVE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4365: -- Attachment: HIVE-4365.D10341.2.patch navis updated the revision HIVE-4365 [jira] wrong result in left semi join. Fixed test result passed all tests Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D10341 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D10341?vs=32361id=32505#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java ql/src/test/queries/clientpositive/semijoin.q ql/src/test/results/clientpositive/semijoin.q.out ql/src/test/results/compiler/plan/join1.q.xml ql/src/test/results/compiler/plan/join2.q.xml ql/src/test/results/compiler/plan/join3.q.xml ql/src/test/results/compiler/plan/join4.q.xml ql/src/test/results/compiler/plan/join5.q.xml ql/src/test/results/compiler/plan/join6.q.xml ql/src/test/results/compiler/plan/join7.q.xml ql/src/test/results/compiler/plan/join8.q.xml To: JIRA, navis wrong result in left semi join -- Key: HIVE-4365 URL: https://issues.apache.org/jira/browse/HIVE-4365 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0, 0.10.0 Reporter: ransom.hezhiqiang Assignee: Navis Attachments: HIVE-4365.D10341.1.patch, HIVE-4365.D10341.2.patch wrong result in left semi join while hive.optimize.ppd=true for example: 1、create table create table t1(c1 int,c2 int, c3 int, c4 int, c5 double,c6 int,c7 string) row format DELIMITED FIELDS TERMINATED BY '|'; create table t2(c1 int) ; 2、load data load data local inpath '/home/test/t1.txt' OVERWRITE into table t1; load data local inpath '/home/test/t2.txt' OVERWRITE into table t2; t1 data: 1|3|10003|52|781.96|555|201203 1|3|10003|39|782.96|555|201203 1|3|10003|87|783.96|555|201203 2|5|10004|24|789.96|555|201203 2|5|10004|58|788.96|555|201203 t2 data: 555 
3、execute query: select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2 on t1.c6 = t2.c1 and t1.c1 = '1' and t1.c7 = '201203' ; returns the expected rows. select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2 on t1.c6 = t2.c1 where t1.c1 = '1' and t1.c7 = '201203' ; returns no rows.
[jira] [Assigned] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-4342: --- Assignee: Navis NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical Attachments: example.txt UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attached file (example.txt) contains the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4387) ant make-pom fails because hcatalog doesn't have a make-pom target
Alan Gates created HIVE-4387: Summary: ant make-pom fails because hcatalog doesn't have a make-pom target Key: HIVE-4387 URL: https://issues.apache.org/jira/browse/HIVE-4387 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Other *-pom directives probably fail as well.
[jira] [Updated] (HIVE-4387) ant maven-build fails because hcatalog doesn't have a make-pom target
[ https://issues.apache.org/jira/browse/HIVE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4387: - Summary: ant maven-build fails because hcatalog doesn't have a make-pom target (was: ant make-pom fails because hcatalog doesn't have a make-pom target) ant maven-build fails because hcatalog doesn't have a make-pom target - Key: HIVE-4387 URL: https://issues.apache.org/jira/browse/HIVE-4387 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Other *-pom directives probably fail as well.
[jira] [Updated] (HIVE-4387) ant maven-build fails because hcatalog doesn't have a make-pom target
[ https://issues.apache.org/jira/browse/HIVE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4387: - Description: Other maven-* targets may fail as well. (was: Other *-pom directives probably fail as well.) ant maven-build fails because hcatalog doesn't have a make-pom target - Key: HIVE-4387 URL: https://issues.apache.org/jira/browse/HIVE-4387 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.11.0 Other maven-* targets may fail as well.
[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihir Kulkarni updated HIVE-4342: - Description: UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7. was: UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attached file (example.txt) contains the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7. NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7.
[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihir Kulkarni updated HIVE-4342: - Attachment: (was: example.txt) NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7.
[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihir Kulkarni updated HIVE-4342: - Attachment: Query.txt sourceData2.txt sourceData1.txt HiveCommands.txt NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, sourceData2.txt UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7.
[jira] [Commented] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637023#comment-13637023 ] Mihir Kulkarni commented on HIVE-4342: -- [~navis] I have updated the attachments, which contain the command script to generate the schema, the data to be used with the command script, and the exact query. I hope you are able to reproduce the NPE with this information. NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, sourceData2.txt UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7.
[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL
[ https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mihir Kulkarni updated HIVE-4342: - Description: UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. NOTE: you will need to update the paths to sourceData1.txt and sourceData2.txt in HiveCommands.txt to suit your environment. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7. was: UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to setup schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7. NPE for query involving UNION ALL with nested JOIN and UNION ALL Key: HIVE-4342 URL: https://issues.apache.org/jira/browse/HIVE-4342 Project: Hive Issue Type: Bug Components: Logging, Metastore, Query Processor Affects Versions: 0.9.0 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0 Reporter: Mihir Kulkarni Assignee: Navis Priority: Critical Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, sourceData2.txt UNION ALL query with JOIN in first part and another UNION ALL in second part gives NPE. bq. JOIN UNION ALL bq. UNION ALL Attachments: 1. HiveCommands.txt : command script to set up schema for query under consideration. 2. sourceData1.txt and sourceData2.txt : required for above command script. 3. Query.txt : Exact query which produces NPE. NOTE: you will need to update the paths to sourceData1.txt and sourceData2.txt in HiveCommands.txt to suit your environment. Attached files contain the schema and exact query which fails on Hive 0.9. It is worthwhile to note that the same query executes successfully on Hive 0.7.
[jira] [Updated] (HIVE-4384) Implement vectorized string functions UPPER(), LOWER(), LENGTH()
[ https://issues.apache.org/jira/browse/HIVE-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4384: -- Summary: Implement vectorized string functions UPPER(), LOWER(), LENGTH() (was: Implement vectorized string functions UPPER() and LOWER()) Implement vectorized string functions UPPER(), LOWER(), LENGTH() Key: HIVE-4384 URL: https://issues.apache.org/jira/browse/HIVE-4384 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson
[jira] [Updated] (HIVE-3129) Create windows native scripts (CMD files) to run hive on windows without Cygwin
[ https://issues.apache.org/jira/browse/HIVE-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xi Fang updated HIVE-3129: -- Attachment: HIVE-3129.unittest.patch Create windows native scripts (CMD files) to run hive on windows without Cygwin Key: HIVE-3129 URL: https://issues.apache.org/jira/browse/HIVE-3129 Project: Hive Issue Type: Bug Components: CLI, Windows Affects Versions: 0.11.0 Reporter: Kanna Karanam Labels: Windows Attachments: HIVE-3129.1.patch, HIVE-3129.2.patch, HIVE-3129.unittest.patch Create the cmd files equivalent to a)Bin\hive b)Bin\hive-config.sh c)Bin\Init-hive-dfs.sh d)Bin\ext\cli.sh e)Bin\ext\debug.sh f)Bin\ext\help.sh g)Bin\ext\hiveserver.sh h)Bin\ext\jar.sh i)Bin\ext\hwi.sh j)Bin\ext\lineage.sh k)Bin\ext\metastore.sh l)Bin\ext\rcfilecat.sh
[jira] [Updated] (HIVE-4378) Counters hit performance even when not used
[ https://issues.apache.org/jira/browse/HIVE-4378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4378: --- Resolution: Fixed Fix Version/s: (was: 0.11.0) 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Gunther! Counters hit performance even when not used --- Key: HIVE-4378 URL: https://issues.apache.org/jira/browse/HIVE-4378 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4378.1.patch preprocess/postprocess counters perform a number of computations even when there are no counters to update. Performance runs are captured in: https://issues.apache.org/jira/browse/HIVE-4318
[jira] [Commented] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted
[ https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637097#comment-13637097 ] Gang Tim Liu commented on HIVE-4310: +1 optimize count(distinct) with hive.map.groupby.sorted - Key: HIVE-4310 URL: https://issues.apache.org/jira/browse/HIVE-4310 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat, hive.4310.2.patch-nohcat, hive.4310.3.patch-nohcat, hive.4310.4.patch
[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used
[ https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4318: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Gunther! OperatorHooks hit performance even when not used Key: HIVE-4318 URL: https://issues.apache.org/jira/browse/HIVE-4318 Project: Hive Issue Type: Bug Components: Query Processor Environment: Ubuntu LXC (64 bit) Reporter: Gopal V Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, HIVE-4318.3.patch, HIVE-4318.patch.pam.txt
Operator Hooks inserted into Operator.java cause a performance hit even when they are not being used. The results below are for a count(1) query tested with and without the operator hook calls.
{code:title=with}
2013-04-09 07:33:58,920 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 84.07 sec
Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
OK
28800991
Time taken: 40.407 seconds, Fetched: 1 row(s)
{code}
{code:title=without}
2013-04-09 07:33:02,355 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 68.48 sec
...
Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
OK
28800991
Time taken: 35.907 seconds, Fetched: 1 row(s)
{code}
The effect is multiplied by the number of operators in the pipeline that have to forward the row: the more operators there are, the slower the query.
The modification made to test this was
{code:title=Operator.java}
--- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
+++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
@@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
       return;
     }
     OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
-    preProcessCounter();
-    enterOperatorHooks(opHookContext);
+    //preProcessCounter();
+    //enterOperatorHooks(opHookContext);
     processOp(row, tag);
-    exitOperatorHooks(opHookContext);
-    postProcessCounter();
+    //exitOperatorHooks(opHookContext);
+    //postProcessCounter();
   }
{code}
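The per-row cost described in HIVE-4318 can be illustrated with a minimal sketch that skips all hook bookkeeping when no hooks are registered. This is only one plausible way to avoid the overhead and is not taken from the committed patch; the names `Hook`, `SketchOperator`, and `contextsAllocated` are hypothetical stand-ins, not Hive classes or fields.

```java
import java.util.ArrayList;
import java.util.List;

public class HookGuardSketch {
    /** Hypothetical stand-in for an operator hook; not Hive's actual interface. */
    interface Hook {
        void enter(Object row, int tag);
        void exit(Object row, int tag);
    }

    /** Hypothetical operator that only pays for hook bookkeeping when hooks exist. */
    static class SketchOperator {
        private final List<Hook> hooks = new ArrayList<>();
        long rowsProcessed = 0;
        long contextsAllocated = 0;

        void addHook(Hook h) {
            hooks.add(h);
        }

        void process(Object row, int tag) {
            if (hooks.isEmpty()) {
                // Fast path: no per-row context object and no hook dispatch.
                processOp(row, tag);
                return;
            }
            // Counts what would be a per-row context allocation in the slow path.
            contextsAllocated++;
            for (Hook h : hooks) {
                h.enter(row, tag);
            }
            processOp(row, tag);
            for (Hook h : hooks) {
                h.exit(row, tag);
            }
        }

        void processOp(Object row, int tag) {
            rowsProcessed++;
        }
    }

    public static void main(String[] args) {
        SketchOperator op = new SketchOperator();
        for (int i = 0; i < 1000; i++) {
            op.process("row" + i, 0);
        }
        // With no hooks registered, no contexts are allocated.
        System.out.println(op.rowsProcessed + " rows, " + op.contextsAllocated + " contexts");
    }
}
```

Because the check happens once per row per operator, the saving would scale with the number of operators in the pipeline, which is the multiplication effect the report describes.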
[jira] [Commented] (HIVE-4103) Remove System.gc() call from the map-join local-task loop
[ https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637109#comment-13637109 ] Ashutosh Chauhan commented on HIVE-4103: Thanks, Gunther, for running the experiments. A difference of 56 vs. 120 seconds is quite substantial. I agree, we should move ahead with the patch. +1 Remove System.gc() call from the map-join local-task loop - Key: HIVE-4103 URL: https://issues.apache.org/jira/browse/HIVE-4103 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-4103.patch
Hive's HashMapWrapper calls System.gc() twice within HashMapWrapper::isAbort(), which produces a significant slow-down during the loop.
{code}
2013-03-01 04:54:28 The gc calls took 677 ms
2013-03-01 04:54:28 Processing rows:20 Hashtable size: 19 Memory usage: 62955432 rate: 0.033
2013-03-01 04:54:31 The gc calls took 956 ms
2013-03-01 04:54:31 Processing rows:30 Hashtable size: 29 Memory usage: 90826656 rate: 0.048
2013-03-01 04:54:33 The gc calls took 967 ms
2013-03-01 04:54:33 Processing rows:384160 Hashtable size: 384160 Memory usage: 114412712 rate: 0.06
{code}
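As a rough illustration of the HIVE-4103 discussion, the sketch below reads a heap usage rate without forcing a collection first. It assumes the "rate" in the log above is used heap divided by maximum heap; that interpretation, the class name `MemoryRateSketch`, and the `shouldAbort` helper are assumptions for illustration, not the contents of HIVE-4103.patch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class MemoryRateSketch {
    /** Fraction of the maximum heap currently in use, read without calling System.gc(). */
    static double heapUsageRate() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long used = heap.getUsed();
        long max = heap.getMax();
        if (max <= 0) {
            // getMax() returns -1 when the maximum is undefined; fall back to the runtime limit.
            max = Runtime.getRuntime().maxMemory();
        }
        return (double) used / (double) max;
    }

    /** Hypothetical abort check: true once the usage rate exceeds the given threshold. */
    static boolean shouldAbort(double threshold) {
        return heapUsageRate() > threshold;
    }

    public static void main(String[] args) {
        System.out.printf("rate: %.3f%n", heapUsageRate());
        System.out.println("abort at 0.90? " + shouldAbort(0.90));
    }
}
```

The trade-off is that a reading taken without a prior collection includes unreclaimed garbage, so it tends to overestimate live memory; in exchange, the check avoids the several hundred milliseconds per call that the log attributes to the gc calls.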
Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #352
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/352/
[jira] [Commented] (HIVE-4178) ORC fails with files with different numbers of columns
[ https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637116#comment-13637116 ] Hudson commented on HIVE-4178: -- Integrated in Hive-trunk-hadoop2 #166 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/166/]) HIVE-4178 : ORC fails with files with different numbers of columns (Revision 1469908) Result = FAILURE omalley : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1469908 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java * /hive/trunk/ql/src/test/queries/clientpositive/orc_diff_part_cols.q * /hive/trunk/ql/src/test/queries/clientpositive/orc_empty_files.q * /hive/trunk/ql/src/test/results/clientpositive/orc_diff_part_cols.q.out * /hive/trunk/ql/src/test/results/clientpositive/orc_empty_files.q.out ORC fails with files with different numbers of columns -- Key: HIVE-4178 URL: https://issues.apache.org/jira/browse/HIVE-4178 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4178.1.patch.txt When CombineHiveInputFormat is used, it's possible that two files with different numbers of columns can be included in the same split, in which case Hive will fail at one of several points with an ArrayIndexOutOfBoundsException. This can happen when a partition contains empty files or two partitions are read with different numbers of columns.