[jira] [Updated] (HIVE-4382) Fix offline build mode

2013-04-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4382:
-

Attachment: HIVE-4382.1.patch

Work in progress. Doesn't work with hcatalog yet (patch disables it)

 Fix offline build mode
 --

 Key: HIVE-4382
 URL: https://issues.apache.org/jira/browse/HIVE-4382
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
 Attachments: HIVE-4382.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4200) Consolidate submodule dependencies using ivy inheritance

2013-04-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4200:
-

Attachment: HIVE-4200.3.patch

 Consolidate submodule dependencies using ivy inheritance
 

 Key: HIVE-4200
 URL: https://issues.apache.org/jira/browse/HIVE-4200
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4200.1.patch.txt, HIVE-4200.2.patch, 
 HIVE-4200.3.patch


 As discussed in 4187:
 For easier maintenance of ivy dependencies across submodules, create a parent 
 ivy file with consolidated dependencies and include it in submodules via 
 inheritance. This way we're not relying on transitive dependencies, and we 
 have the dependencies in a single place.



[jira] [Updated] (HIVE-4095) Add exchange partition in Hive

2013-04-19 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4095:
-

Status: Open  (was: Patch Available)

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Dheeraj Kumar Singh
 Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, 
 HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, 
 HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt






[jira] [Commented] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog

2013-04-19 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636231#comment-13636231
 ] 

Carl Steinbach commented on HIVE-4266:
--

bq. We cannot make this kind of backwards incompatible change for users. Users 
will not see this as here, run this script against your source tree. They'll 
see it as they have to go modify, re-test, and re-deploy every application.

Aren't these same users going to have to re-test and re-deploy every 
application when they bump the version number of their hcatalog dependency?

bq. We should not make this a blocker for 0.11. I'm 90% of the way through the 
patch, but it will take a fair amount of testing when I'm done to assure that it 
works with both org.apache.hcatalog and org.apache.hive.hcatalog.

I'm convinced that if we don't do this now it's never going to happen, which is 
why I think one of the exit criteria for 0.11.0 needs to be either a) providing 
wrappers and a clearly stated EOL timeline for the org.apache.hcatalog 
namespace, or b) changing the package names only. 


 Refactor HCatalog code to org.apache.hive.hcatalog
 --

 Key: HIVE-4266
 URL: https://issues.apache.org/jira/browse/HIVE-4266
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.11.0


 Currently HCatalog code is in packages under org.apache.hcatalog.  It now needs 
 to move to org.apache.hive.hcatalog.  Shell classes/interfaces need to be created 
 for public-facing classes so that users' code does not break.



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-19 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636294#comment-13636294
 ] 

Phabricator commented on HIVE-3509:
---

njain has commented on the revision HIVE-3509 [jira] Exclusive locks are not 
acquired when using dynamic partitions.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java:144 This is 
an incompatible change, and may break many existing apps.

  For example, at FB we log the query along with its inputs and outputs, and this 
will leave the burden on the client to change / to @ appropriately.

  Although it is not ideal, let us stick with the format:

  db@table@partns, where partns is of the form partitionCol1/partitionCol2
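
  The naming format proposed in this comment could be sketched roughly as 
follows. This is only an illustration: the class LockNameSketch, the method 
lockName, and the sample partition specs (ds=..., hr=...) are hypothetical and 
are not taken from HiveLockObject.java.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive: builds a lock name in the
// db@table@partitions format, with partition columns joined by '/'.
class LockNameSketch {
    static String lockName(String db, String table, List<String> partitionSpecs) {
        String name = db + "@" + table;
        if (!partitionSpecs.isEmpty()) {
            // Partition columns are joined with '/', the rest with '@'.
            name += "@" + String.join("/", partitionSpecs);
        }
        return name;
    }

    public static void main(String[] args) {
        // Sample values are assumptions chosen for illustration.
        System.out.println(lockName("default", "sales",
                Arrays.asList("ds=2013-04-19", "hr=12")));
        System.out.println(lockName("default", "sales",
                Collections.<String>emptyList()));
    }
}
```

  With this shape, a client that logs inputs and outputs keeps seeing '@' 
between database, table and partitions, and '/' only inside the partition part.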

REVISION DETAIL
  https://reviews.facebook.net/D10065

To: JIRA, MattMartin
Cc: njain


 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, 
 HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.
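
 The expected behavior described above can be sketched as follows. This is a 
 simplified illustration under stated assumptions, not the code of 
 acquireReadWriteLocks(): the class LockPlanSketch, the method planLocks and 
 the entity names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Hive's Driver or lock manager code.
class LockPlanSketch {
    enum Mode { SHARED, EXCLUSIVE }

    // Inputs get SHARED locks and outputs get EXCLUSIVE locks. For a
    // dynamic-partition write the caller passes the table (or the known
    // partition prefix) as the output, because the concrete partitions
    // are not known at compile time.
    static Map<String, Mode> planLocks(List<String> inputs, List<String> outputs) {
        Map<String, Mode> plan = new LinkedHashMap<>();
        for (String in : inputs) {
            plan.put(in, Mode.SHARED);
        }
        for (String out : outputs) {
            // Assumption: an entity that is both read and written ends up EXCLUSIVE.
            plan.put(out, Mode.EXCLUSIVE);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Entity names are made up for the example.
        Map<String, Mode> plan = planLocks(
                Arrays.asList("default@src"),
                Arrays.asList("default@dest"));
        System.out.println(plan);
    }
}
```

 The behavior reported in this issue corresponds to the output entity of a 
 dynamic-partition query receiving SHARED where this sketch assigns EXCLUSIVE.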



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-19 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636297#comment-13636297
 ] 

Namit Jain commented on HIVE-3509:
--

comments

 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, 
 HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch




[jira] [Commented] (HIVE-4095) Add exchange partition in Hive

2013-04-19 Thread Dheeraj Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636376#comment-13636376
 ] 

Dheeraj Kumar Singh commented on HIVE-4095:
---

@Namit: Did you patch both the files here?

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Dheeraj Kumar Singh
 Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, 
 HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, 
 HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt






Re: Review Request: HIVE-4356 - remove duplicate impersonation parameters for hiveserver2

2013-04-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10554/#review19455
---

Ship it!


+1

- Ashutosh Chauhan


On April 16, 2013, 9:46 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/10554/
 ---
 
 (Updated April 16, 2013, 9:46 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 remove duplicate impersonation parameters for hiveserver2
 
 
 This addresses bug HIVE-4356.
 https://issues.apache.org/jira/browse/HIVE-4356
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 78d9cc9 
   conf/hive-default.xml.template e266ce7 
   service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java 18d4aae 
   service/src/java/org/apache/hive/service/cli/CLIService.java b53599b 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 43d79aa 
   service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java 
 PRE-CREATION 
   
 service/src/test/org/apache/hive/service/cli/thrift/TestThriftCLIService.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/10554/diff/
 
 
 Testing
 ---
 
 Unit tests included.
 Manually tested on (kerberos) secure and unsecure cluster.
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Commented] (HIVE-4356) remove duplicate impersonation parameters for hiveserver2

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636430#comment-13636430
 ] 

Ashutosh Chauhan commented on HIVE-4356:


+1

 remove duplicate impersonation parameters for hiveserver2
 -

 Key: HIVE-4356
 URL: https://issues.apache.org/jira/browse/HIVE-4356
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.11.0

 Attachments: HIVE-4356.1.patch


 There are two parameters controlling impersonation in hiveserver2: 
 hive.server2.enable.doAs controls this in kerberos secure mode, while 
 hive.server2.enable.impersonation controls this for unsecure mode.
 We should have just one for both modes.



[jira] [Commented] (HIVE-4106) SMB joins fail in multi-way joins

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636443#comment-13636443
 ] 

Ashutosh Chauhan commented on HIVE-4106:


Isn't HIVE-4371 a proper fix for it?
Does the test case still fail after applying HIVE-4371?

 SMB joins fail in multi-way joins
 -

 Key: HIVE-4106
 URL: https://issues.apache.org/jira/browse/HIVE-4106
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: auto_sortmerge_join_12.q, hive.4106.1.patch, 
 hive.4106.2.patch, HIVE-4106.patch


 I see an array-out-of-bounds exception in multi-way SMB joins. This is 
 related to changes that went in as part of HIVE-3403. This issue has been 
 discussed in HIVE-3891.



[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636446#comment-13636446
 ] 

Ashutosh Chauhan commented on HIVE-4333:


[~rhbutani] Can you create a Phabricator entry for this? Since it's a huge patch, 
it's hard to read the diff file.

 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt


 Problem is different order of results on hadoop 2



[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636456#comment-13636456
 ] 

Ashutosh Chauhan commented on HIVE-4333:


bq. There are diffs because of precision. Some of the avg and sum functions are 
now wrapped in 'round'
I didn't get this part. All this computation happens within Hive, so it shouldn't 
be affected by the Hadoop version. Wrapped in 'round'? In Hive or in Hadoop?

bq. Looks like the shuffle in 2.0 reorders the rows even in this case.
Yeah, that's possible. Since in over() the partitioning is by a constant, all rows 
have the same value for the partitioning column, so they can arrive in any order. 
We need to come up with a clever way of writing tests that still test over() but 
give ordered results on both Hadoop 1 and Hadoop 2.


 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt


 Problem is different order of results on hadoop 2



[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4333:
---

Affects Version/s: 0.11.0

 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt


 Problem is different order of results on hadoop 2



[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4333:
---

Component/s: PTF-Windowing

 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt


 Problem is different order of results on hadoop 2



[jira] [Updated] (HIVE-4304) Remove unused builtins and pdk submodules

2013-04-19 Thread Travis Crawford (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Travis Crawford updated HIVE-4304:
--

Attachment: HIVE-4304.patch

 Remove unused builtins and pdk submodules
 -

 Key: HIVE-4304
 URL: https://issues.apache.org/jira/browse/HIVE-4304
 Project: Hive
  Issue Type: Improvement
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-4304.1.patch, HIVE-4304.patch


 Moving from email. The 
 [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and 
 [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not 
 believed to be in use and should be removed. The main benefits are 
 simplification and maintainability of the Hive code base.
 Forwarded conversation
 Subject: builtins submodule - is it still needed?
 
 From: Travis Crawford traviscrawf...@gmail.com
 Date: Thu, Apr 4, 2013 at 2:01 PM
 To: u...@hive.apache.org, dev@hive.apache.org
 Hey hive gurus -
 Is the builtins hive submodule in use? The submodule was added in
 HIVE-2523 as a location for builtin-UDFs, but it appears to not have
 taken off. Any objections to removing it?
 DETAILS
 For HIVE-4278 I'm making some build changes for the HCatalog
 integration. The builtins submodule causes issues because it delays
 building until the packaging phase - so HCatalog can't depend on
 builtins, which it does transitively.
 While investigating a path forward I discovered the builtins
 submodule contains very little code, and likely could either go away
 entirely or merge into ql, simplifying things both for users and
 developers.
 Thoughts? Can anyone with context help me understand builtins, both
 in general and around its non-standard build? For your trouble I'll
 either make the submodule go away/merge into another submodule, or
 update the docs with what we learn.
 Thanks!
 Travis
 --
 From: Ashutosh Chauhan ashutosh.chau...@gmail.com
 Date: Fri, Apr 5, 2013 at 3:10 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org u...@hive.apache.org
 I haven't used it myself so far. Neither have I met anyone who has used
 it or plans to use it.
 Ashutosh
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Gunther Hagleitner ghagleit...@hortonworks.com
 Date: Fri, Apr 5, 2013 at 3:11 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org
 +1
 I would actually go a step further and propose to remove both PDK and
 builtins. I've gone through the code for both and here is what I found:
 Builtins:
 - BuiltInUtils.java: Empty file
 - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was
 intended as a building block for PDK
 PDK:
 - some helper build.xml/test setup + teardown scripts
 - Classes/annotations to help run unit tests
 - rot13 as an example
 From what I can tell it's a fair assessment that it hasn't taken off; the last
 commits to it seem to have happened more than 1.5 years ago.
 Thanks,
 Gunther.
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Owen O'Malley omal...@apache.org
 Date: Fri, Apr 5, 2013 at 4:45 PM
 To: u...@hive.apache.org
 +1 to removing them. 
 We have a Rot13 example in 
 ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java 
 anyways. *smile*
 -- Owen



[jira] [Updated] (HIVE-4304) Remove unused builtins and pdk submodules

2013-04-19 Thread Travis Crawford (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Travis Crawford updated HIVE-4304:
--

Status: Patch Available  (was: Open)

 Remove unused builtins and pdk submodules
 -

 Key: HIVE-4304
 URL: https://issues.apache.org/jira/browse/HIVE-4304
 Project: Hive
  Issue Type: Improvement
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-4304.1.patch, HIVE-4304.patch




[jira] [Commented] (HIVE-4095) Add exchange partition in Hive

2013-04-19 Thread Dheeraj Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636497#comment-13636497
 ] 

Dheeraj Kumar Singh commented on HIVE-4095:
---

[~namit]: The revision 10035 does not include the thrift generated changes.

Phabricator won't allow me to upload the thrift generated changes as they are 
quite large. I've included these in the patch HIVE-4095.part12.patch.txt 
uploaded here.

 Add exchange partition in Hive
 --

 Key: HIVE-4095
 URL: https://issues.apache.org/jira/browse/HIVE-4095
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Dheeraj Kumar Singh
 Attachments: hive.4095.1.patch, HIVE-4095.D10155.1.patch, 
 HIVE-4095.D10155.2.patch, HIVE-4095.D10347.1.patch, 
 HIVE-4095.part11.patch.txt, HIVE-4095.part12.patch.txt






[jira] [Updated] (HIVE-4178) ORC fails with files with different numbers of columns

2013-04-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4178:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk and branch-11. Thanks, Kevin!

 ORC fails with files with different numbers of columns
 --

 Key: HIVE-4178
 URL: https://issues.apache.org/jira/browse/HIVE-4178
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-4178.1.patch.txt


 When CombineHiveInputFormat is used, it's possible that two files with 
 different numbers of columns can be included in the same split, in which case 
 Hive will fail at one of several points with an 
 ArrayIndexOutOfBoundsException.
 This can happen when a partition contains empty files or two partitions are 
 read with different numbers of columns.



[jira] [Commented] (HIVE-4305) Use a single system for dependency resolution

2013-04-19 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636534#comment-13636534
 ] 

Owen O'Malley commented on HIVE-4305:
-

Carl,
  Rather than debate it theoretically or compare it to Hadoop, which has a 
*LOT* more complexity in its build, I propose that we have Travis make a 
Maven build file for the combined Hive and HCat systems. Then we can debate the 
value and issues in the particular patch and how to move the project forward. 
The current state is painful, with extremely long builds. We need to move 
forward and enable the project to evolve quickly so that Hive can compete with 
its many commercial competitors.

 Use a single system for dependency resolution
 -

 Key: HIVE-4305
 URL: https://issues.apache.org/jira/browse/HIVE-4305
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure, HCatalog
Reporter: Travis Crawford
Assignee: Carl Steinbach

 Both Hive and HCatalog use ant as their build tool. However, Hive uses ivy 
 for dependency resolution while HCatalog uses maven-ant-tasks. With the 
 project merge we should converge on a single tool for dependency resolution.



[jira] [Updated] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4333:
--

Attachment: HIVE-4333.D10389.1.patch

hbutani requested code review of HIVE-4333 [jira] most windowing tests fail on 
hadoop 2.

Reviewers: JIRA, ashutoshc

fix tests for hadoop 2

Problem is different order of results on hadoop 2

TEST PLAN
  change existing tests

REVISION DETAIL
  https://reviews.facebook.net/D10389

AFFECTED FILES
  data/files/flights_tiny.txt
  data/files/part.rc
  data/files/part.seq
  ql/src/test/queries/clientpositive/leadlag.q
  ql/src/test/queries/clientpositive/ptf.q
  ql/src/test/queries/clientpositive/ptf_general_queries.q
  ql/src/test/queries/clientpositive/windowing.q
  ql/src/test/queries/clientpositive/windowing_expressions.q
  ql/src/test/queries/clientpositive/windowing_multipartitioning.q
  ql/src/test/queries/clientpositive/windowing_navfn.q
  ql/src/test/queries/clientpositive/windowing_ntile.q
  ql/src/test/queries/clientpositive/windowing_rank.q
  ql/src/test/queries/clientpositive/windowing_udaf.q
  ql/src/test/queries/clientpositive/windowing_windowspec.q
  ql/src/test/results/clientpositive/leadlag.q.out
  ql/src/test/results/clientpositive/ptf.q.out
  ql/src/test/results/clientpositive/ptf_general_queries.q.out
  ql/src/test/results/clientpositive/windowing.q.out
  ql/src/test/results/clientpositive/windowing_expressions.q.out
  ql/src/test/results/clientpositive/windowing_multipartitioning.q.out
  ql/src/test/results/clientpositive/windowing_navfn.q.out
  ql/src/test/results/clientpositive/windowing_ntile.q.out
  ql/src/test/results/clientpositive/windowing_rank.q.out
  ql/src/test/results/clientpositive/windowing_udaf.q.out
  ql/src/test/results/clientpositive/windowing_windowspec.q.out


To: JIRA, ashutoshc, hbutani


 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch


 Problem is different order of results on hadoop 2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636548#comment-13636548
 ] 

Harish Butani commented on HIVE-4333:
-

I think the precision diffs come from the same ordering issue. Because the 
rows in the partitions are not in the same order, the overall sum/avg differs 
beyond 2 decimal places.
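
To illustrate the comment above: floating-point addition is not associative, so summing the same rows in a different order can change the low-order digits. This is a minimal sketch with made-up values, not data from the failing tests; `running_sum` is a hypothetical helper.

```python
# Summing identical values in a different order can give a different
# floating-point result, even though the sums are mathematically equal.
values = [0.1, 0.2, 0.3]


def running_sum(rows):
    total = 0.0
    for v in rows:
        total += v
    return total


forward = running_sum(values)             # (0.1 + 0.2) + 0.3
backward = running_sum(reversed(values))  # (0.3 + 0.2) + 0.1

print(forward == backward)                      # the raw sums differ
print(round(forward, 2) == round(backward, 2))  # they agree after rounding
```

Comparing rounded values, or forcing a deterministic row order, are two possible ways to make such test output stable.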

 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Gunther Hagleitner
Assignee: Matthew Weaver
 Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch


 Problem is different order of results on hadoop 2



[jira] [Commented] (HIVE-3509) Exclusive locks are not acquired when using dynamic partitions

2013-04-19 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636551#comment-13636551
 ] 

Phabricator commented on HIVE-3509:
---

MattMartin has commented on the revision HIVE-3509 [jira] Exclusive locks are 
not acquired when using dynamic partitions.

  For the record, I'm planning to roll back the major change in my last 
revision which acquires and releases the whole hierarchy of locks on explicit 
lock ... and unlock 

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java:144 This 
change should only affect locks.  In particular, this would make sure the lock 
paths are consistent for dummy partitions and non-dummy partitions.

  Without this change, I think a case could arise where a write query with 
dynamic partitions tries to acquire an exclusive lock on base dir in 
zookeeper/db@table@partns while a read query simultaneously tries to acquire 
a shared lock on base locking dir in zookeeper/db/table/partns. In this 
case the reader and writer would not block each other even though they should. 
I'll try to add a test case to illustrate this point.
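
The scenario described above can be sketched with a toy lock table keyed by path string. This is only an illustration under assumptions: `ToyLockManager` and `try_acquire` are hypothetical names, not Hive's lock manager API; only the two path spellings are taken from the comment.

```python
# Toy lock table keyed by the literal path string. Hypothetical names;
# this is not Hive's lock manager.
class ToyLockManager:
    def __init__(self):
        self.held = {}  # path -> list of lock modes currently held

    def try_acquire(self, path, mode):
        holders = self.held.get(path, [])
        # An exclusive request conflicts with any holder; a shared request
        # conflicts only with an exclusive holder.
        if holders and (mode == "EXCLUSIVE" or "EXCLUSIVE" in holders):
            return False
        self.held.setdefault(path, []).append(mode)
        return True


# Inconsistent spellings: writer and reader never meet on the same key,
# so both acquisitions succeed.
mixed = ToyLockManager()
writer_ok = mixed.try_acquire("zookeeper/db@table@partns", "EXCLUSIVE")
reader_ok = mixed.try_acquire("zookeeper/db/table/partns", "SHARED")

# Consistent spelling: the reader is refused while the writer holds the lock.
consistent = ToyLockManager()
writer_ok_2 = consistent.try_acquire("zookeeper/db/table/partns", "EXCLUSIVE")
reader_ok_2 = consistent.try_acquire("zookeeper/db/table/partns", "SHARED")
```

In the first case the reader and writer do not block each other, which is the inconsistency the change is meant to prevent.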

REVISION DETAIL
  https://reviews.facebook.net/D10065

To: JIRA, MattMartin
Cc: njain


 Exclusive locks are not acquired when using dynamic partitions
 --

 Key: HIVE-3509
 URL: https://issues.apache.org/jira/browse/HIVE-3509
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
 Attachments: HIVE-3509.1.patch.txt, HIVE-3509.D10065.1.patch, 
 HIVE-3509.D10065.2.patch, HIVE-3509.D10065.3.patch, HIVE-3509.D10065.4.patch


 If locking is enabled, the acquireReadWriteLocks() method in 
 org.apache.hadoop.hive.ql.Driver iterates through all of the input and output 
 entities of the query plan and attempts to acquire the appropriate locks.  In 
 general, it should acquire SHARED locks for all of the input entities and 
 exclusive locks for all of the output entities (see the Hive wiki page on 
 [locking|https://cwiki.apache.org/confluence/display/Hive/Locking] for more 
 detailed information).
 When the query involves dynamic partitions, the situation is a little more 
 subtle.  As the Hive wiki notes (see previous link):
 {quote}
 in some cases, the list of objects may not be known - for eg. in case of 
 dynamic partitions, the list of partitions being modified is not known at 
 compile time - so, the list is generated conservatively. Since the number of 
 partitions may not be known, an exclusive lock is taken on the table, or the 
 prefix that is known.
 {quote}
 After [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781], the 
 observed behavior is no longer consistent with the behavior described above.  
 [HIVE-1781|https://issues.apache.org/jira/browse/HIVE-1781] appears to have 
 altered the logic so that SHARED locks are acquired instead of EXCLUSIVE 
 locks whenever the query involves dynamic partitions.
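
The expected behavior described above can be summarized in a short sketch. `plan_locks` and the entity names are hypothetical and stand in for the logic in acquireReadWriteLocks(); this is not the actual Driver code.

```python
# Hypothetical sketch of the documented locking rule: SHARED for inputs,
# EXCLUSIVE for outputs.
def plan_locks(inputs, outputs):
    locks = {entity: "SHARED" for entity in inputs}
    for entity in outputs:
        # An entity that is written must be locked exclusively, even if it
        # is also read.
        locks[entity] = "EXCLUSIVE"
    return locks


# With dynamic partitions the partitions are not known at compile time, so
# the output entity is the table (or the known partition prefix) itself.
locks = plan_locks(inputs=["db/src_table"], outputs=["db/dest_table"])
```

The reported bug corresponds to the output entity ending up with SHARED instead of EXCLUSIVE when dynamic partitions are involved.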



[jira] [Commented] (HIVE-4304) Remove unused builtins and pdk submodules

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636562#comment-13636562
 ] 

Ashutosh Chauhan commented on HIVE-4304:


+1 will commit if tests pass

 Remove unused builtins and pdk submodules
 -

 Key: HIVE-4304
 URL: https://issues.apache.org/jira/browse/HIVE-4304
 Project: Hive
  Issue Type: Improvement
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-4304.1.patch, HIVE-4304.patch


 Moving from email. The 
 [builtins|http://svn.apache.org/repos/asf/hive/trunk/builtins/] and 
 [pdk|http://svn.apache.org/repos/asf/hive/trunk/pdk/] submodules are not 
 believed to be in use and should be removed. The main benefits are 
 simplification and maintainability of the Hive code base.
 Forwarded conversation
 Subject: builtins submodule - is it still needed?
 
 From: Travis Crawford traviscrawf...@gmail.com
 Date: Thu, Apr 4, 2013 at 2:01 PM
 To: u...@hive.apache.org, dev@hive.apache.org
 Hey hive gurus -
 Is the builtins hive submodule in use? The submodule was added in
 HIVE-2523 as a location for builtin-UDFs, but it appears to not have
 taken off. Any objections to removing it?
 DETAILS
 For HIVE-4278 I'm making some build changes for the HCatalog
 integration. The builtins submodule causes issues because it delays
 building until the packaging phase - so HCatalog can't depend on
 builtins, which it does transitively.
 While investigating a path forward I discovered the builtins
 submodule contains very little code, and likely could either go away
 entirely or merge into ql, simplifying things both for users and
 developers.
 Thoughts? Can anyone with context help me understand builtins, both
 in general and around its non-standard build? For your trouble I'll
 either make the submodule go away/merge into another submodule, or
 update the docs with what we learn.
 Thanks!
 Travis
 --
 From: Ashutosh Chauhan ashutosh.chau...@gmail.com
 Date: Fri, Apr 5, 2013 at 3:10 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org u...@hive.apache.org
 I haven't used it myself so far, nor have I met anyone who uses it or plans
 to use it.
 Ashutosh
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Gunther Hagleitner ghagleit...@hortonworks.com
 Date: Fri, Apr 5, 2013 at 3:11 PM
 To: dev@hive.apache.org
 Cc: u...@hive.apache.org
 +1
 I would actually go a step further and propose to remove both PDK and
 builtins. I've gone through the code for both and here is what I found:
 Builtins:
 - BuiltInUtils.java: Empty file
 - UDAFUnionMap: Merges maps. Doesn't seem to be useful by itself, but was
 intended as a building block for PDK
 PDK:
 - some helper build.xml/test setup + teardown scripts
 - Classes/annotations to help run unit tests
 - rot13 as an example
 From what I can tell it's a fair assessment that it hasn't taken off; the last
 commits to it seem to have happened more than 1.5 years ago.
 Thanks,
 Gunther.
 On Thu, Apr 4, 2013 at 2:01 PM, Travis Crawford 
 traviscrawf...@gmail.com wrote:
 --
 From: Owen O'Malley omal...@apache.org
 Date: Fri, Apr 5, 2013 at 4:45 PM
 To: u...@hive.apache.org
 +1 to removing them. 
 We have a Rot13 example in 
 ql/src/test/org/apache/hadoop/hive/ql/io/udf/Rot13{In,Out}putFormat.java 
 anyways. *smile*
 -- Owen



[jira] [Updated] (HIVE-4380) Implement Vectorized Scalar-Column expressions

2013-04-19 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4380:
--

Status: Patch Available  (was: Open)

To be applied to vectorization branch

 Implement Vectorized Scalar-Column expressions
 --

 Key: HIVE-4380
 URL: https://issues.apache.org/jira/browse/HIVE-4380
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Eric Hanson

 The expressions with scalar as the first operand.



[jira] [Updated] (HIVE-4380) Implement Vectorized Scalar-Column expressions

2013-04-19 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4380:
--

Attachment: HIVE-4380.1.patch

 Implement Vectorized Scalar-Column expressions
 --

 Key: HIVE-4380
 URL: https://issues.apache.org/jira/browse/HIVE-4380
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Eric Hanson
 Attachments: HIVE-4380.1.patch


 The expressions with scalar as the first operand.



[jira] [Commented] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636677#comment-13636677
 ] 

Kevin Wilfong commented on HIVE-4340:
-

https://reviews.facebook.net/D10179

 ORC should provide raw data size
 

 Key: HIVE-4340
 URL: https://issues.apache.org/jira/browse/HIVE-4340
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong

 ORC's SerDe currently does nothing, and hence does not calculate a raw data 
 size.  WriterImpl, however, has enough information to provide one.
 WriterImpl should compute a raw data size for each row, aggregate them per 
 stripe and record it in the stripe information, as RC currently does in its 
 key header, and allow the FileSinkOperator access to the size per row.
 FileSinkOperator should be able to get the raw data size from either the 
 SerDe or the RecordWriter when the RecordWriter can provide it.
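
The proposal above (compute a size per row, aggregate per stripe, expose the per-row size to the caller) can be sketched as follows. Everything here is an assumption for illustration: `raw_size`, `StripeSizeTracker`, the byte sizes, and the fixed rows-per-stripe cutoff are hypothetical and do not reflect WriterImpl.

```python
def raw_size(value):
    """Hypothetical raw (uncompressed) size of one field, in bytes."""
    if value is None:
        return 0
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    return 8  # assumption: every non-string value is a fixed-width number


class StripeSizeTracker:
    """Accumulates per-row raw sizes and records one total per stripe."""

    def __init__(self, rows_per_stripe):
        # Simplification: a stripe is closed after a fixed number of rows.
        self.rows_per_stripe = rows_per_stripe
        self.rows_in_stripe = 0
        self.current = 0
        self.stripe_sizes = []

    def add_row(self, row):
        size = sum(raw_size(v) for v in row)
        self.current += size
        self.rows_in_stripe += 1
        if self.rows_in_stripe == self.rows_per_stripe:
            self.flush()
        return size  # per-row size that a caller such as a file sink could read

    def flush(self):
        if self.rows_in_stripe:
            self.stripe_sizes.append(self.current)
            self.current = 0
            self.rows_in_stripe = 0


tracker = StripeSizeTracker(rows_per_stripe=2)
row_sizes = [tracker.add_row(r) for r in [("a", 1), ("bcd", None), ("", 2)]]
tracker.flush()  # close the final, partially filled stripe
```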



[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Attachment: HIVE-4340.1.patch.txt

 ORC should provide raw data size
 

 Key: HIVE-4340
 URL: https://issues.apache.org/jira/browse/HIVE-4340
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4340.1.patch.txt


 ORC's SerDe currently does nothing, and hence does not calculate a raw data 
 size.  WriterImpl, however, has enough information to provide one.
 WriterImpl should compute a raw data size for each row, aggregate them per 
 stripe and record it in the stripe information, as RC currently does in its 
 key header, and allow the FileSinkOperator access to the size per row.
 FileSinkOperator should be able to get the raw data size from either the 
 SerDe or the RecordWriter when the RecordWriter can provide it.



[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Status: Patch Available  (was: Open)

 ORC should provide raw data size
 

 Key: HIVE-4340
 URL: https://issues.apache.org/jira/browse/HIVE-4340
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4340.1.patch.txt


 ORC's SerDe currently does nothing, and hence does not calculate a raw data 
 size.  WriterImpl, however, has enough information to provide one.
 WriterImpl should compute a raw data size for each row, aggregate them per 
 stripe and record it in the stripe information, as RC currently does in its 
 key header, and allow the FileSinkOperator access to the size per row.
 FileSinkOperator should be able to get the raw data size from either the 
 SerDe or the RecordWriter when the RecordWriter can provide it.



[jira] [Commented] (HIVE-4380) Implement Vectorized Scalar-Column expressions

2013-04-19 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636678#comment-13636678
 ] 

Eric Hanson commented on HIVE-4380:
---

This patch depends on the patch for 
https://issues.apache.org/jira/browse/HIVE-4282. After that patch gets 
committed, I will create a ReviewBoard entry. Currently I can't do that because 
when I try, I get this error: 

The file 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java'
 (rd6f45d3) could not be found in the repository


 Implement Vectorized Scalar-Column expressions
 --

 Key: HIVE-4380
 URL: https://issues.apache.org/jira/browse/HIVE-4380
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Eric Hanson
 Attachments: HIVE-4380.1.patch


 The expressions with scalar as the first operand.



[jira] [Created] (HIVE-4383) Implement vectorized string column-scalar filters

2013-04-19 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-4383:
-

 Summary: Implement vectorized string column-scalar filters
 Key: HIVE-4383
 URL: https://issues.apache.org/jira/browse/HIVE-4383
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson


Create patch for implementing string columns compared with scalars as 
vectorized filters, and apply it to vectorization branch.



[jira] [Assigned] (HIVE-4383) Implement vectorized string column-scalar filters

2013-04-19 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-4383:
-

Assignee: Eric Hanson

 Implement vectorized string column-scalar filters
 -

 Key: HIVE-4383
 URL: https://issues.apache.org/jira/browse/HIVE-4383
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson

 Create patch for implementing string columns compared with scalars as 
 vectorized filters, and apply it to vectorization branch.



[jira] [Created] (HIVE-4384) Implement vectorized string functions UPPER() and LOWER()

2013-04-19 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-4384:
-

 Summary: Implement vectorized string functions UPPER() and LOWER()
 Key: HIVE-4384
 URL: https://issues.apache.org/jira/browse/HIVE-4384
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson






[jira] [Created] (HIVE-4385) Implement vectorized LIKE filter

2013-04-19 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-4385:
-

 Summary: Implement vectorized LIKE filter
 Key: HIVE-4385
 URL: https://issues.apache.org/jira/browse/HIVE-4385
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson






[jira] [Created] (HIVE-4386) max() and min() return NULL on partition column; distinct() returns nothing

2013-04-19 Thread Robin Morris (JIRA)
Robin Morris created HIVE-4386:
--

 Summary: max() and min() return NULL on partition column; 
distinct() returns nothing
 Key: HIVE-4386
 URL: https://issues.apache.org/jira/browse/HIVE-4386
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.8.1
Reporter: Robin Morris


partitioned_table is partitioned on year, month, day.

 select max(day) from partitioned_table where year=2013 and month=4;
spins up zero mappers, one reducer, and returns NULL.  Same for
 select min(day) from ...

 select distinct(day) from... returns nothing at all.




[jira] [Updated] (HIVE-4386) max() and min() return NULL on partition column; distinct() returns nothing

2013-04-19 Thread Robin Morris (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Morris updated HIVE-4386:
---

Description: 
partitioned_table is partitioned on year, month, day.

 select max(day) from partitioned_table where year=2013 and month=4;
spins up zero mappers, one reducer, and returns NULL.  Same for
 select min(day) from ...

 select distinct(day) from... returns nothing at all.

Using an explicit intermediate table does work:
 create table foo_max as select day from partitioned_table where year=2013 and 
 month=4;  
 select max(day) from foo_max; drop table foo_max;
Several map-reduce jobs later, the correct answer is given.

  was:
partitioned_table is partitioned on year, month, day.

 select max(day) from partitioned_table where year=2013 and month=4;
spins up zero mappers, one reducer, and returns NULL.  Same for
 select min(day) from ...

 select distinct(day) from... returns nothing at all.



 max() and min() return NULL on partition column; distinct() returns nothing
 ---

 Key: HIVE-4386
 URL: https://issues.apache.org/jira/browse/HIVE-4386
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.8.1
Reporter: Robin Morris

 partitioned_table is partitioned on year, month, day.
  select max(day) from partitioned_table where year=2013 and month=4;
 spins up zero mappers, one reducer, and returns NULL.  Same for
  select min(day) from ...
  select distinct(day) from... returns nothing at all.
 Using an explicit intermediate table does work:
  create table foo_max as select day from partitioned_table where year=2013 
  and month=4;  
  select max(day) from foo_max; drop table foo_max;
 Several map-reduce jobs later, the correct answer is given.



[jira] [Commented] (HIVE-2055) Hive HBase Integration issue

2013-04-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636834#comment-13636834
 ] 

Sushanth Sowmyan commented on HIVE-2055:


This works for me. Non-binding +1.

 Hive HBase Integration issue
 

 Key: HIVE-2055
 URL: https://issues.apache.org/jira/browse/HIVE-2055
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: sajith v
 Attachments: HIVE-2055.patch


 Created an external table in Hive that points to an HBase table. Querying 
 a column by name in the select clause raised the following exception: 
 ( java.lang.ClassNotFoundException: 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, 
 SQLState:42000)



[jira] [Updated] (HIVE-4282) Implement vectorized column-scalar expressions

2013-04-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4282:
---

Attachment: HIVE-4282.3.patch

The binary files were included by accident and are removed in the latest 
patch. Also uploaded to the review board.

 Implement vectorized column-scalar expressions
 --

 Key: HIVE-4282
 URL: https://issues.apache.org/jira/browse/HIVE-4282
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4282.1.patch, HIVE-4282.2.patch, HIVE-4282.3.patch


 Implement arithmetic expressions involving a column and a scalar with column 
 as first argument.



[jira] [Commented] (HIVE-2055) Hive HBase Integration issue

2013-04-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636889#comment-13636889
 ] 

Sushanth Sowmyan commented on HIVE-2055:


As Nick notes in HCATALOG-621, there might be more to this - I only tested for 
ddl operations. That said, setting HIVE_AUX_JARS_PATH should work for this, 
right?

 Hive HBase Integration issue
 

 Key: HIVE-2055
 URL: https://issues.apache.org/jira/browse/HIVE-2055
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: sajith v
 Attachments: HIVE-2055.patch


 Created an external table in Hive that points to an HBase table. Querying 
 a column by name in the select clause raised the following exception: 
 ( java.lang.ClassNotFoundException: 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, 
 SQLState:42000)



[jira] [Assigned] (HIVE-4333) most windowing tests fail on hadoop 2

2013-04-19 Thread Matthew Weaver (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Weaver reassigned HIVE-4333:


Assignee: Harish Butani  (was: Matthew Weaver)

 most windowing tests fail on hadoop 2
 -

 Key: HIVE-4333
 URL: https://issues.apache.org/jira/browse/HIVE-4333
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Gunther Hagleitner
Assignee: Harish Butani
 Attachments: HIVE-4333.1.patch.txt, HIVE-4333.D10389.1.patch


 Problem is different order of results on hadoop 2



[jira] [Commented] (HIVE-4248) Implement a memory manager for ORC

2013-04-19 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636917#comment-13636917
 ] 

Phabricator commented on HIVE-4248:
---

ashutoshc has accepted the revision HIVE-4248 [jira] Implement a memory 
manager for ORC.

  +1 will commit if tests pass.

REVISION DETAIL
  https://reviews.facebook.net/D9993

BRANCH
  h-4248

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, omalley
Cc: kevinwilfong


 Implement a memory manager for ORC
 --

 Key: HIVE-4248
 URL: https://issues.apache.org/jira/browse/HIVE-4248
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-4248.D9993.1.patch, HIVE-4248.D9993.2.patch, 
 HIVE-4248.D9993.4.patch


 With the large default stripe size (256MB) and dynamic partitions, it is 
 quite easy for users to run out of memory when writing ORC files. We probably 
 need a solution that keeps track of the total number of concurrent ORC 
 writers and divides the available heap space between them. 
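
One way to read the idea above is a manager that tracks every open writer and scales each writer's buffer down when their combined requests exceed the available pool. The sketch below assumes that interpretation; `ToyMemoryManager`, its methods, and the 512 MB pool are hypothetical and are not the attached patch.

```python
MB = 1024 * 1024


class ToyMemoryManager:
    """Tracks concurrent writers and divides a fixed pool between them."""

    def __init__(self, pool_bytes):
        self.pool_bytes = pool_bytes
        self.writers = {}  # writer name -> requested stripe size in bytes

    def add_writer(self, name, requested_bytes):
        self.writers[name] = requested_bytes

    def remove_writer(self, name):
        self.writers.pop(name, None)

    def allowance(self, name):
        # Scale every request by the same factor once the total exceeds
        # the pool; otherwise each writer gets what it asked for.
        total = sum(self.writers.values())
        scale = min(1.0, self.pool_bytes / total)
        return int(self.writers[name] * scale)


manager = ToyMemoryManager(pool_bytes=512 * MB)
manager.add_writer("partition-0", 256 * MB)
alone = manager.allowance("partition-0")

# Dynamic partitions open more writers, so each one gets a smaller share.
for i in range(1, 4):
    manager.add_writer(f"partition-{i}", 256 * MB)
shared = manager.allowance("partition-0")
```

With one writer the full 256MB stripe fits; with four writers the same request is scaled to 128MB each.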



[jira] [Commented] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog

2013-04-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636965#comment-13636965
 ] 

Alan Gates commented on HIVE-4266:
--

bq. Aren't these same users going to have to re-test and re-deploy every 
application when they bump the version number of their hcatalog dependency?

In my experience when you install a new Hadoop, Hive, or other tool version it 
is usually placed on a test/dev cluster for a while and users are given a 
chance to run on it, and once that proves out it is promoted to the production 
cluster(s).  There isn't usually a step to validate every application and 
obviously no need to re-deploy applications.  In this scenario users can retest 
on their schedule and decide which applications are not crucial enough to 
warrant the effort. 

On the other hand, if you tell users, "This is not backward compatible, you have 
to rewrite all your programs and scripts," you are forcing them to rewrite, 
retest, and re-deploy everything before they can deploy the new version.  This 
puts a big barrier to uptake of the new version in their way.

I am fine with setting a sunset for these shell classes.  Two major releases 
(i.e. Hive 0.13 or 0.14, depending on which release they go out with).

bq. I'm convinced that if we don't do this now it's never going to happen...
Since Ashutosh is managing this release I'll defer to him, but I am concerned 
about stuffing something this large in at the last minute.  I understand that 
in software "deferred" is a synonym for "when hell freezes over", but I 
honestly have the patch mostly done.  The unit tests are passing.  I don't have 
the javadoc doing the right thing yet and I need to run the system tests 
against both org.apache.hcatalog and org.apache.hive.hcatalog, which I'm 
estimating will take me a few days assuming I find a few bugs.

 Refactor HCatalog code to org.apache.hive.hcatalog
 --

 Key: HIVE-4266
 URL: https://issues.apache.org/jira/browse/HIVE-4266
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.11.0


 Currently HCatalog code is in packages org.apache.hcatalog.  It needs to now 
 move to org.apache.hive.hcatalog.  Shell classes/interface need to be created 
 for public facing classes so that user's code does not break.
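
The shell-class idea above can be illustrated with a thin subclass that keeps the old name alive and forwards to the relocated implementation. This is a language-neutral sketch written in Python; `NewCatalogClient` and `OldCatalogClient` are hypothetical names and not HCatalog classes.

```python
import warnings


class NewCatalogClient:
    """Stands in for a class that now lives in the new package."""

    def table_names(self):
        return ["t1", "t2"]


class OldCatalogClient(NewCatalogClient):
    """Shell kept under the old name so existing user code keeps working."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "OldCatalogClient is deprecated; use NewCatalogClient instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


# Existing code that still uses the old name runs unchanged, with a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = OldCatalogClient()
    names = legacy.table_names()
```

Because the shell adds no behavior of its own, it can be deleted once the agreed sunset release is reached.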



[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-04-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4189:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

I just committed this to trunk and branch-0.11. Thanks, Kevin!

 ORC fails with String column that ends in lots of nulls
 ---

 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-4189.1.patch.txt, HIVE-4189.2.patch.txt


 When ORC attempts to write out a string column that ends in enough nulls to 
 span an index stride, StringTreeWriter's writeStripe method will get an 
 exception from TreeWriter's writeStripe method
 Column has wrong number of index entries found: x expected: y
 This is caused by rowIndexValueCount having multiple entries equal to the 
 number of non-null rows in the column, combined with the fact that 
 StringTreeWriter has special logic for constructing its index.



[jira] [Updated] (HIVE-4365) wrong result in left semi join

2013-04-19 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4365:


Status: Patch Available  (was: Open)

 wrong result in left semi join
 --

 Key: HIVE-4365
 URL: https://issues.apache.org/jira/browse/HIVE-4365
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0, 0.9.0
Reporter: ransom.hezhiqiang
Assignee: Navis
 Attachments: HIVE-4365.D10341.1.patch, HIVE-4365.D10341.2.patch


 Left semi join returns a wrong result when hive.optimize.ppd=true.
 For example:
 1. create table
create table t1(c1 int,c2 int, c3 int, c4 int, c5 double,c6 int,c7 string) 
   row format DELIMITED FIELDS TERMINATED BY '|';
create table t2(c1 int) ;
 2. load data
 load data local inpath '/home/test/t1.txt' OVERWRITE into table t1;
 load data local inpath '/home/test/t2.txt' OVERWRITE into table t2;
 t1 data:
 1|3|10003|52|781.96|555|201203
 1|3|10003|39|782.96|555|201203
 1|3|10003|87|783.96|555|201203
 2|5|10004|24|789.96|555|201203
 2|5|10004|58|788.96|555|201203
 t2 data:
 555
 3、excute Query
 select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7  from t1 left semi join t2 
 on t1.c6 = t2.c1 and  t1.c1 =  '1' and t1.c7 = '201203' ;   
 can got result.
 select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7  from t1 left semi join t2 
 on t1.c6 = t2.c1 where t1.c1 =  '1' and t1.c7 = '201203' ;   
 can't got result.
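
 The expected semantics can be sketched outside Hive. The following is a
 minimal stand-alone Java model, not Hive code; SemiJoinModel, T1Row and the
 method names are made up for illustration, and only the columns used by the
 join and the filters are modelled. For a left semi join, a filter on t1
 columns should select the same rows whether it sits in the ON clause or in
 the WHERE clause, so both methods are expected to return the three sample
 rows with c1 = 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-alone model of the two queries; none of these names come from Hive.
final class SemiJoinModel {

    // A t1 row, reduced to the columns that the join and the filters touch.
    static final class T1Row {
        final int c1;
        final int c4;
        final int c6;
        final String c7;

        T1Row(int c1, int c4, int c6, String c7) {
            this.c1 = c1;
            this.c4 = c4;
            this.c6 = c6;
            this.c7 = c7;
        }
    }

    // The five t1 rows from the report (columns c1, c4, c6, c7 only).
    static List<T1Row> sampleT1() {
        return Arrays.asList(
            new T1Row(1, 52, 555, "201203"),
            new T1Row(1, 39, 555, "201203"),
            new T1Row(1, 87, 555, "201203"),
            new T1Row(2, 24, 555, "201203"),
            new T1Row(2, 58, 555, "201203"));
    }

    // The single t2 row from the report.
    static Set<Integer> sampleT2() {
        return new HashSet<>(Arrays.asList(555));
    }

    // Filters placed in the ON clause: a t1 row is kept if some t2 row
    // satisfies the whole ON condition.
    static List<T1Row> filterInOn(List<T1Row> t1, Set<Integer> t2) {
        List<T1Row> out = new ArrayList<>();
        for (T1Row r : t1) {
            boolean matched = false;
            for (int key : t2) {
                if (r.c6 == key && r.c1 == 1 && "201203".equals(r.c7)) {
                    matched = true;
                    break;
                }
            }
            if (matched) {
                out.add(r);
            }
        }
        return out;
    }

    // Filters placed in the WHERE clause: semi join on c6 first, then filter.
    static List<T1Row> filterInWhere(List<T1Row> t1, Set<Integer> t2) {
        List<T1Row> out = new ArrayList<>();
        for (T1Row r : t1) {
            if (t2.contains(r.c6) && r.c1 == 1 && "201203".equals(r.c7)) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(filterInOn(sampleT1(), sampleT2()).size());
        System.out.println(filterInWhere(sampleT1(), sampleT2()).size());
    }
}
```

 A regression test for this issue could compare the row sets of the two
 query forms in the same way.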



[jira] [Updated] (HIVE-4365) wrong result in left semi join

2013-04-19 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4365:
--

Attachment: HIVE-4365.D10341.2.patch

navis updated the revision HIVE-4365 [jira] wrong result in left semi join.

  Fixed test result & passed all tests

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D10341

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D10341?vs=32361&id=32505#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
  ql/src/test/queries/clientpositive/semijoin.q
  ql/src/test/results/clientpositive/semijoin.q.out
  ql/src/test/results/compiler/plan/join1.q.xml
  ql/src/test/results/compiler/plan/join2.q.xml
  ql/src/test/results/compiler/plan/join3.q.xml
  ql/src/test/results/compiler/plan/join4.q.xml
  ql/src/test/results/compiler/plan/join5.q.xml
  ql/src/test/results/compiler/plan/join6.q.xml
  ql/src/test/results/compiler/plan/join7.q.xml
  ql/src/test/results/compiler/plan/join8.q.xml

To: JIRA, navis


 wrong result in left semi join
 --

 Key: HIVE-4365
 URL: https://issues.apache.org/jira/browse/HIVE-4365
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
Reporter: ransom.hezhiqiang
Assignee: Navis
 Attachments: HIVE-4365.D10341.1.patch, HIVE-4365.D10341.2.patch


 Wrong result in left semi join when hive.optimize.ppd=true.
 For example:
 1. Create the tables:
 create table t1(c1 int, c2 int, c3 int, c4 int, c5 double, c6 int, c7 string)
   row format DELIMITED FIELDS TERMINATED BY '|';
 create table t2(c1 int);
 2. Load the data:
 load data local inpath '/home/test/t1.txt' OVERWRITE into table t1;
 load data local inpath '/home/test/t2.txt' OVERWRITE into table t2;
 t1 data:
 1|3|10003|52|781.96|555|201203
 1|3|10003|39|782.96|555|201203
 1|3|10003|87|783.96|555|201203
 2|5|10004|24|789.96|555|201203
 2|5|10004|58|788.96|555|201203
 t2 data:
 555
 3. Execute the queries:
 select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2
 on t1.c6 = t2.c1 and t1.c1 = '1' and t1.c7 = '201203';
 This query returns a result.
 select t1.c1,t1.c2,t1.c3,t1.c4,t1.c5,t1.c6,t1.c7 from t1 left semi join t2
 on t1.c6 = t2.c1 where t1.c1 = '1' and t1.c7 = '201203';
 This query returns no result.



[jira] [Assigned] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis reassigned HIVE-4342:
---

Assignee: Navis

 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical
 Attachments: example.txt


 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attached file (example.txt) contains the schema and exact query which fails 
 on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Created] (HIVE-4387) ant make-pom fails because hcatalog doesn't have a make-pom target

2013-04-19 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4387:


 Summary: ant make-pom fails because hcatalog doesn't have a 
make-pom target
 Key: HIVE-4387
 URL: https://issues.apache.org/jira/browse/HIVE-4387
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.11.0


Other *-pom directives probably fail as well.



[jira] [Updated] (HIVE-4387) ant maven-build fails because hcatalog doesn't have a make-pom target

2013-04-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4387:
-

Summary: ant maven-build fails because hcatalog doesn't have a make-pom 
target  (was: ant make-pom fails because hcatalog doesn't have a make-pom 
target)

 ant maven-build fails because hcatalog doesn't have a make-pom target
 -

 Key: HIVE-4387
 URL: https://issues.apache.org/jira/browse/HIVE-4387
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.11.0


 Other *-pom directives probably fail as well.



[jira] [Updated] (HIVE-4387) ant maven-build fails because hcatalog doesn't have a make-pom target

2013-04-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-4387:
-

Description: Other maven-* targets may fail as well.  (was: Other *-pom 
directives probably fail as well.)

 ant maven-build fails because hcatalog doesn't have a make-pom target
 -

 Key: HIVE-4387
 URL: https://issues.apache.org/jira/browse/HIVE-4387
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.11.0


 Other maven-* targets may fail as well.



[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Mihir Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Kulkarni updated HIVE-4342:
-

Description: 
UNION ALL query with JOIN in first part and another UNION ALL in second part 
gives NPE.

bq. JOIN
UNION ALL
bq. UNION ALL

Attachments:
1. HiveCommands.txt : command script to setup schema for query under 
consideration.
2. sourceData1.txt and sourceData2.txt : required for above command script.
3. Query.txt : Exact query which produces NPE.

Attached files contain the schema and exact query which fails on Hive 0.9.
It is worthwhile to note that the same query executes successfully on Hive 0.7.

  was:
UNION ALL query with JOIN in first part and another UNION ALL in second part 
gives NPE.

bq. JOIN
UNION ALL
bq. UNION ALL

Attached file (example.txt) contains the schema and exact query which fails on 
Hive 0.9.
It is worthwhile to note that the same query executes successfully on Hive 0.7.



 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical

 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attachments:
 1. HiveCommands.txt : command script to setup schema for query under 
 consideration.
 2. sourceData1.txt and sourceData2.txt : required for above command script.
 3. Query.txt : Exact query which produces NPE.
 Attached files contain the schema and exact query which fails on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Mihir Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Kulkarni updated HIVE-4342:
-

Attachment: (was: example.txt)

 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical

 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attachments:
 1. HiveCommands.txt : command script to setup schema for query under 
 consideration.
 2. sourceData1.txt and sourceData2.txt : required for above command script.
 3. Query.txt : Exact query which produces NPE.
 Attached files contain the schema and exact query which fails on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Mihir Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Kulkarni updated HIVE-4342:
-

Attachment: Query.txt
sourceData2.txt
sourceData1.txt
HiveCommands.txt

 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical
 Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, 
 sourceData2.txt


 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attachments:
 1. HiveCommands.txt : command script to setup schema for query under 
 consideration.
 2. sourceData1.txt and sourceData2.txt : required for above command script.
 3. Query.txt : Exact query which produces NPE.
 Attached files contain the schema and exact query which fails on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Commented] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Mihir Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637023#comment-13637023
 ] 

Mihir Kulkarni commented on HIVE-4342:
--

[~navis]
I have updated the attachments which contain the command script to generate 
schema, the data to be used with the command script and the exact query! I hope 
you are able to reproduce the NPE with this information.

 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical
 Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, 
 sourceData2.txt


 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attachments:
 1. HiveCommands.txt : command script to setup schema for query under 
 consideration.
 2. sourceData1.txt and sourceData2.txt : required for above command script.
 3. Query.txt : Exact query which produces NPE.
 Attached files contain the schema and exact query which fails on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Updated] (HIVE-4342) NPE for query involving UNION ALL with nested JOIN and UNION ALL

2013-04-19 Thread Mihir Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihir Kulkarni updated HIVE-4342:
-

Description: 
UNION ALL query with JOIN in first part and another UNION ALL in second part 
gives NPE.

bq. JOIN
UNION ALL
bq. UNION ALL

Attachments:
1. HiveCommands.txt : command script to setup schema for query under 
consideration.
2. sourceData1.txt and sourceData2.txt : required for above command script.
3. Query.txt : Exact query which produces NPE.

NOTE: you will need to update the paths to sourceData1.txt and sourceData2.txt 
in HiveCommands.txt to suit your environment.

Attached files contain the schema and exact query which fails on Hive 0.9.
It is worthwhile to note that the same query executes successfully on Hive 0.7.

  was:
UNION ALL query with JOIN in first part and another UNION ALL in second part 
gives NPE.

bq. JOIN
UNION ALL
bq. UNION ALL

Attachments:
1. HiveCommands.txt : command script to setup schema for query under 
consideration.
2. sourceData1.txt and sourceData2.txt : required for above command script.
3. Query.txt : Exact query which produces NPE.

Attached files contain the schema and exact query which fails on Hive 0.9.
It is worthwhile to note that the same query executes successfully on Hive 0.7.


 NPE for query involving UNION ALL with nested JOIN and UNION ALL
 

 Key: HIVE-4342
 URL: https://issues.apache.org/jira/browse/HIVE-4342
 Project: Hive
  Issue Type: Bug
  Components: Logging, Metastore, Query Processor
Affects Versions: 0.9.0
 Environment: Red Hat Linux VM with Hive 0.9 and Hadoop 2.0
Reporter: Mihir Kulkarni
Assignee: Navis
Priority: Critical
 Attachments: HiveCommands.txt, Query.txt, sourceData1.txt, 
 sourceData2.txt


 UNION ALL query with JOIN in first part and another UNION ALL in second part 
 gives NPE.
 bq. JOIN
 UNION ALL
 bq. UNION ALL
 Attachments:
 1. HiveCommands.txt : command script to setup schema for query under 
 consideration.
 2. sourceData1.txt and sourceData2.txt : required for above command script.
 3. Query.txt : Exact query which produces NPE.
 NOTE: you will need to update the paths to sourceData1.txt and sourceData2.txt 
 in HiveCommands.txt to suit your environment.
 Attached files contain the schema and exact query which fails on Hive 0.9.
 It is worthwhile to note that the same query executes successfully on Hive 
 0.7.



[jira] [Updated] (HIVE-4384) Implement vectorized string functions UPPER(), LOWER(), LENGTH()

2013-04-19 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-4384:
--

Summary: Implement vectorized string functions UPPER(), LOWER(), LENGTH()  
(was: Implement vectorized string functions UPPER() and LOWER())

 Implement vectorized string functions UPPER(), LOWER(), LENGTH()
 

 Key: HIVE-4384
 URL: https://issues.apache.org/jira/browse/HIVE-4384
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson





[jira] [Updated] (HIVE-3129) Create windows native scripts (CMD files) to run hive on windows without Cygwin

2013-04-19 Thread Xi Fang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xi Fang updated HIVE-3129:
--

Attachment: HIVE-3129.unittest.patch

 Create windows native scripts (CMD files)  to run hive on windows without 
 Cygwin
 

 Key: HIVE-3129
 URL: https://issues.apache.org/jira/browse/HIVE-3129
 Project: Hive
  Issue Type: Bug
  Components: CLI, Windows
Affects Versions: 0.11.0
Reporter: Kanna Karanam
  Labels: Windows
 Attachments: HIVE-3129.1.patch, HIVE-3129.2.patch, 
 HIVE-3129.unittest.patch


 Create the cmd files equivalent to 
 a)Bin\hive
 b)Bin\hive-config.sh
 c)Bin\Init-hive-dfs.sh
 d)Bin\ext\cli.sh
 e)Bin\ext\debug.sh
 f)Bin\ext\help.sh
 g)Bin\ext\hiveserver.sh
 h)Bin\ext\jar.sh
 i)Bin\ext\hwi.sh
 j)Bin\ext\lineage.sh
 k)Bin\ext\metastore.sh
 l)Bin\ext\rcfilecat.sh



[jira] [Updated] (HIVE-4378) Counters hit performance even when not used

2013-04-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4378:
---

   Resolution: Fixed
Fix Version/s: (was: 0.11.0)
   0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gunther!

 Counters hit performance even when not used
 ---

 Key: HIVE-4378
 URL: https://issues.apache.org/jira/browse/HIVE-4378
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.12.0

 Attachments: HIVE-4378.1.patch


 preprocess/postprocess counters perform a number of computations even when 
 there are no counters to update. Performance runs are captured in: 
 https://issues.apache.org/jira/browse/HIVE-4318



[jira] [Commented] (HIVE-4310) optimize count(distinct) with hive.map.groupby.sorted

2013-04-19 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637097#comment-13637097
 ] 

Gang Tim Liu commented on HIVE-4310:


+1

 optimize count(distinct) with hive.map.groupby.sorted
 -

 Key: HIVE-4310
 URL: https://issues.apache.org/jira/browse/HIVE-4310
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4310.1.patch, hive.4310.1.patch-nohcat, 
 hive.4310.2.patch-nohcat, hive.4310.3.patch-nohcat, hive.4310.4.patch






[jira] [Updated] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4318:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gunther!

 OperatorHooks hit performance even when not used
 

 Key: HIVE-4318
 URL: https://issues.apache.org/jira/browse/HIVE-4318
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
 Environment: Ubuntu LXC (64 bit)
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Fix For: 0.12.0

 Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, HIVE-4318.3.patch, 
 HIVE-4318.patch.pam.txt


 Operator hooks inserted into Operator.java cause a performance hit even when 
 they are not being used.
 A count(1) query was tested with and without the operator hook calls.
 {code:title=with}
 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 84.07 sec
 Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
 OK
 28800991
 Time taken: 40.407 seconds, Fetched: 1 row(s)
 {code}
 {code:title=without}
 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 68.48 sec
 ...
 Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
 OK
 28800991
 Time taken: 35.907 seconds, Fetched: 1 row(s)
 {code}
 The effect is multiplied by the number of operators in the pipeline that have 
 to forward the row: the more operators there are, the slower the query.
 The modification made to test this was 
 {code:title=Operator.java}
 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
 @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws HiveException {
        return;
      }
      OperatorHookContext opHookContext = new OperatorHookContext(this, row, tag);
 -    preProcessCounter();
 -    enterOperatorHooks(opHookContext);
 +    //preProcessCounter();
 +    //enterOperatorHooks(opHookContext);
      processOp(row, tag);
 -    exitOperatorHooks(opHookContext);
 -    postProcessCounter();
 +    //exitOperatorHooks(opHookContext);
 +    //postProcessCounter();
    }
 {code}
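
 One way to avoid the per-row cost when no hooks are configured is to skip
 the hook calls (and any context allocation) on that path. The sketch below
 is a stand-alone Java model under that assumption, not the patch attached to
 this issue; GuardedOperator, Hook and rowsProcessed are simplified,
 hypothetical stand-ins for the real Operator and operator hook classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for an operator with optional per-row hooks.
final class GuardedOperator {

    // Hypothetical hook interface, called before and after each row.
    interface Hook {
        void enter(Object row, int tag);
        void exit(Object row, int tag);
    }

    private final List<Hook> hooks = new ArrayList<>();
    long rowsProcessed = 0;

    void addHook(Hook hook) {
        hooks.add(hook);
    }

    void process(Object row, int tag) {
        // Fast path: with no hooks registered, do only the real work.
        if (hooks.isEmpty()) {
            processOp(row, tag);
            return;
        }
        // Slow path: pay for the hook calls only when hooks exist.
        for (Hook h : hooks) {
            h.enter(row, tag);
        }
        processOp(row, tag);
        for (Hook h : hooks) {
            h.exit(row, tag);
        }
    }

    // Placeholder for the operator's actual row processing.
    private void processOp(Object row, int tag) {
        rowsProcessed++;
    }

    public static void main(String[] args) {
        GuardedOperator op = new GuardedOperator();
        for (int i = 0; i < 1000; i++) {
            op.process("row-" + i, 0);
        }
        System.out.println(op.rowsProcessed);
    }
}
```

 With this shape the cost of the hooks scales with the number of registered
 hooks rather than with the number of operators in the pipeline.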



[jira] [Commented] (HIVE-4103) Remove System.gc() call from the map-join local-task loop

2013-04-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637109#comment-13637109
 ] 

Ashutosh Chauhan commented on HIVE-4103:


Thanks, Gunther, for running the experiments. A difference of 56 vs. 120 
seconds is quite substantial. I agree, we should move ahead with the patch. 
+1

 Remove System.gc() call from the map-join local-task loop
 -

 Key: HIVE-4103
 URL: https://issues.apache.org/jira/browse/HIVE-4103
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-4103.patch


 Hive's HashMapWrapper calls System.gc() twice within 
 HashMapWrapper::isAbort(), which produces a significant slowdown in the 
 loop.
 {code}
 2013-03-01 04:54:28 The gc calls took 677 ms
 2013-03-01 04:54:28 Processing rows: 20  Hashtable size: 19  Memory usage: 62955432  rate: 0.033
 2013-03-01 04:54:31 The gc calls took 956 ms
 2013-03-01 04:54:31 Processing rows: 30  Hashtable size: 29  Memory usage: 90826656  rate: 0.048
 2013-03-01 04:54:33 The gc calls took 967 ms
 2013-03-01 04:54:33 Processing rows: 384160  Hashtable size: 384160  Memory usage: 114412712  rate: 0.06



Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #352

2013-04-19 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/352/



[jira] [Commented] (HIVE-4178) ORC fails with files with different numbers of columns

2013-04-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637116#comment-13637116
 ] 

Hudson commented on HIVE-4178:
--

Integrated in Hive-trunk-hadoop2 #166 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/166/])
HIVE-4178 : ORC fails with files with different numbers of columns 
(Revision 1469908)

 Result = FAILURE
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1469908
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
* /hive/trunk/ql/src/test/queries/clientpositive/orc_diff_part_cols.q
* /hive/trunk/ql/src/test/queries/clientpositive/orc_empty_files.q
* /hive/trunk/ql/src/test/results/clientpositive/orc_diff_part_cols.q.out
* /hive/trunk/ql/src/test/results/clientpositive/orc_empty_files.q.out


 ORC fails with files with different numbers of columns
 --

 Key: HIVE-4178
 URL: https://issues.apache.org/jira/browse/HIVE-4178
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-4178.1.patch.txt


 When CombineHiveInputFormat is used, it's possible that two files with 
 different numbers of columns can be included in the same split, in which 
 case Hive will fail at one of several points with an 
 ArrayIndexOutOfBoundsException.
 This can happen when a partition contains empty files or two partitions are 
 read with different numbers of columns.
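
 The failure mode can be illustrated without ORC. The sketch below uses
 hypothetical names (RowBuffer, copyUnchecked, copyResizing) and is not the
 code in HIVE-4178.1.patch.txt. It models a reader that reuses one row
 object across files: copying a wider row into a buffer sized for an
 earlier, narrower file indexes past the end of the array, while resizing
 the buffer to the current row's column count avoids that.

```java
// Models a row object that is reused while reading several files.
final class RowBuffer {

    private Object[] fields;

    RowBuffer(int numFields) {
        fields = new Object[numFields];
    }

    // Unsafe: assumes every file has the same column count as the buffer.
    // Throws ArrayIndexOutOfBoundsException when src is wider than the buffer.
    void copyUnchecked(Object[] src) {
        for (int i = 0; i < src.length; i++) {
            fields[i] = src[i];
        }
    }

    // Safe: resize the buffer whenever the column count changes.
    void copyResizing(Object[] src) {
        if (fields.length != src.length) {
            fields = new Object[src.length];
        }
        System.arraycopy(src, 0, fields, 0, src.length);
    }

    int size() {
        return fields.length;
    }

    Object get(int index) {
        return fields[index];
    }

    public static void main(String[] args) {
        RowBuffer buffer = new RowBuffer(2);              // sized for a 2-column file
        buffer.copyResizing(new Object[] {1, "a", 2.0});  // row from a 3-column file
        System.out.println(buffer.size());
    }
}
```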
