[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403290#comment-13403290
 ] 

Jitendra Nath Pandey commented on HIVE-3098:


bq. Problem stems from the fact that there is no expiration policy either in fs 
or ugi cache. We need to design for UGI cache eviction policy. There, when we 
are expiring stale ugi's from ugi-cache we can do closeAllForUGI for evicting 
ugi to evict cached FS objects from fs-cache.

+1. It may be more tractable to have a cache expiration policy in ugi-cache 
based on the semantics of this particular use case. In FS-cache it gets 
trickier because of the general purpose nature of the file system.
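
As a rough illustration of the eviction idea discussed above, here is a minimal sketch of a time-based cache that runs a hook for each entry it expires. The class `ExpiringCacheSketch`, its methods, and the TTL parameter are all hypothetical and are not part of Hive or Hadoop. In the real setting the hook would presumably wrap a call such as `FileSystem.closeAllForUGI(ugi)`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical idle-time cache; names are illustrative, not Hive or Hadoop APIs. */
final class ExpiringCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        long lastAccessMillis;
        Entry(V value, long now) { this.value = value; this.lastAccessMillis = now; }
    }

    private final Map<K, Entry<V>> entries = new LinkedHashMap<>();
    private final long ttlMillis;
    private final Consumer<V> onEvict; // assumed hook, e.g. closing FS objects for a UGI

    ExpiringCacheSketch(long ttlMillis, Consumer<V> onEvict) {
        this.ttlMillis = ttlMillis;
        this.onEvict = onEvict;
    }

    /** Returns the cached value and refreshes its last-access time, or null if absent. */
    synchronized V get(K key, long now) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        e.lastAccessMillis = now;
        return e.value;
    }

    synchronized void put(K key, V value, long now) {
        entries.put(key, new Entry<>(value, now));
    }

    /** Removes entries idle for longer than the TTL and runs the hook on each one. */
    synchronized int evictStale(long now) {
        int evicted = 0;
        Iterator<Map.Entry<K, Entry<V>>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Entry<V> e = it.next().getValue();
            if (now - e.lastAccessMillis > ttlMillis) {
                it.remove();
                onEvict.accept(e.value);
                evicted++;
            }
        }
        return evicted;
    }

    synchronized int size() { return entries.size(); }
}
```

The caller passes the clock value explicitly so that the expiry behaviour is easy to exercise; a production version would need to decide who calls `evictStale` and how often.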

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch


 The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
 the Oracle backend).
 The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
 in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
 100 instances of FileSystem, whose combined retained-mem consumed the 
 entire heap.
 It boiled down to hadoop::UserGroupInformation::equals() being implemented 
 such that the Subject member is compared for equality (==), and not 
 equivalence (.equals()). This causes equivalent UGI instances to compare as 
 unequal, and causes a new FileSystem instance to be created and cached.
 The UGI.equals() is so implemented, incidentally, as a fix for yet another 
 problem (HADOOP-6670); so it is unlikely that that implementation can be 
 modified.
 The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
 the Hive metastore), using a cache for UGI instances in the shims.
 I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
 test to confirm that the memory-leak has been arrested.
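
 The leak mechanism described above can be sketched with stand-in classes. 
 Everything below is hypothetical: `Ugi` only imitates a key whose equals() 
 compares its subject by identity, the `fsCache` map stands in for 
 FileSystem.CACHE, and keying the UGI cache by user name is a simplifying 
 assumption rather than what the attached patch necessarily does.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the leak; not the Hadoop or Hive classes. */
final class UgiCacheSketch {
    /** Stand-in for a UGI whose equals() compares the subject with ==. */
    static final class Ugi {
        final String userName;
        final Object subject = new Object();
        Ugi(String userName) { this.userName = userName; }
        @Override public boolean equals(Object o) {
            return o instanceof Ugi && ((Ugi) o).subject == subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    private final Map<String, Ugi> ugiByUser = new HashMap<>();
    private final Map<Ugi, String> fsCache = new HashMap<>(); // stand-in for FileSystem.CACHE

    /** Leaky path: a fresh UGI per request never matches an existing cache key. */
    String fileSystemWithoutUgiCache(String user) {
        return fsCache.computeIfAbsent(new Ugi(user), u -> "fs-for-" + user);
    }

    /** Fixed path: reuse one UGI per equivalent user so the downstream key matches. */
    String fileSystemWithUgiCache(String user) {
        Ugi ugi = ugiByUser.computeIfAbsent(user, Ugi::new);
        return fsCache.computeIfAbsent(ugi, u -> "fs-for-" + user);
    }

    int cachedFileSystems() { return fsCache.size(); }
}
```

 With the leaky path the number of cached entries grows with every request, 
 while the cached-UGI path keeps one entry per user.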

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-4575) In place filtering in Not Filter doesn't handle nulls correctly.

2013-05-17 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4575:
--

 Summary: In place filtering in Not Filter doesn't handle nulls 
correctly.
 Key: HIVE-4575
 URL: https://issues.apache.org/jira/browse/HIVE-4575
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


 The FilterNotExpr evaluates the child expression and takes the complement of 
the selected vector. Since the child expression filters out null values, the 
complement includes the nulls in the output. This is incorrect because 
not(null) = null.




[jira] [Commented] (HIVE-4575) In place filtering in Not Filter doesn't handle nulls correctly.

2013-05-17 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661102#comment-13661102
 ] 

Jitendra Nath Pandey commented on HIVE-4575:


bq. I think the repro here in our code is that you'd get 1  NULL and 3  NULL 
returned.

Yes.

Another point is that the output of

  select * from t where NOT (a = 2);

should be the same as

  select * from t where (a <> 2);

In our current implementation the first query will return rows 1 and 4, while 
the second will return only row 1.
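
The difference between the two queries can be sketched as follows. This is a simplified model, not the FilterNotExpr code: the class and method names are hypothetical, the table contents are made up for illustration, row positions are 0-based, and the null-aware version checks the single input column directly, which only works because the child expression here has one input.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a NOT filter over a nullable column. */
final class NotFilterSketch {
    /** Child filter (a = value): null never satisfies a comparison, so null rows drop out. */
    static List<Integer> selectEquals(Long[] a, long value) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            if (a[i] != null && a[i] == value) selected.add(i);
        }
        return selected;
    }

    /** Buggy NOT: plain complement of the child's selection, which lets null rows through. */
    static List<Integer> naiveNot(Long[] a, List<Integer> childSelected) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            if (!childSelected.contains(i)) out.add(i);
        }
        return out;
    }

    /** NOT under three-valued logic: NOT(NULL) is NULL, so null rows stay filtered out. */
    static List<Integer> nullAwareNot(Long[] a, List<Integer> childSelected) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            if (a[i] != null && !childSelected.contains(i)) out.add(i);
        }
        return out;
    }
}
```

For a hypothetical column holding 1, 2, 2, NULL, the naive complement keeps positions 0 and 3, while the null-aware version keeps only position 0, which matches what (a <> 2) selects.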



 In place filtering in Not Filter doesn't handle nulls correctly.
 

 Key: HIVE-4575
 URL: https://issues.apache.org/jira/browse/HIVE-4575
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey

  The FilterNotExpr evaluates the child expression and takes the complement of 
 the selected vector. Since the child expression filters out null values, the 
 complement includes the nulls in the output. This is incorrect because 
 not(null) = null.



[jira] [Updated] (HIVE-4534) IsNotNull and NotCol incorrectly handle nulls.

2013-05-20 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4534:
---

Attachment: HIVE-4534.2.patch

The attached patch has additional unit tests. I could not create a review board 
entry because this patch is on top of the HIVE-4472 patch.

 IsNotNull and NotCol incorrectly handle nulls.
 --

 Key: HIVE-4534
 URL: https://issues.apache.org/jira/browse/HIVE-4534
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4534.1.patch, HIVE-4534.2.patch


 See file IsNotNull.java in package 
 org.apache.hadoop.hive.ql.exec.vector.expressions
 It never looks at the noNulls flag on the input vector, but accesses the 
 isNull[] array anyway. This can yield incorrect results.
 isRepeating and noNulls are not set in the output, which can also cause wrong 
 results.
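
 A minimal sketch of the flag handling described above, under stated 
 assumptions: `Col` is a hypothetical stand-in for a column vector that only 
 borrows the field names from the issue text, the output encodes true as 1 and 
 false as 0, and the selection vector is ignored for brevity. It is not the 
 code from IsNotNull.java or from the attached patches.

```java
/** Hypothetical model of IS NOT NULL over a column vector. */
final class IsNotNullSketch {
    /** Minimal stand-in for a column vector. */
    static final class Col {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        boolean isRepeating = false;
        Col(int size) { vector = new long[size]; isNull = new boolean[size]; }
    }

    static void isNotNull(Col in, Col out, int n) {
        // The result of IS NOT NULL is itself never null, so always set the flag.
        out.noNulls = true;
        if (in.noNulls) {
            // isNull[] may hold stale data from an earlier batch, so it is not read.
            out.isRepeating = true;
            out.vector[0] = 1;
        } else if (in.isRepeating) {
            // Only entry 0 is meaningful for a repeating input.
            out.isRepeating = true;
            out.vector[0] = in.isNull[0] ? 0 : 1;
        } else {
            out.isRepeating = false;
            for (int i = 0; i < n; i++) {
                out.vector[i] = in.isNull[i] ? 0 : 1;
            }
        }
    }
}
```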



[jira] [Updated] (HIVE-4534) IsNotNull and NotCol incorrectly handle nulls.

2013-05-20 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4534:
---

Affects Version/s: vectorization-branch
   Status: Patch Available  (was: Open)

 IsNotNull and NotCol incorrectly handle nulls.
 --

 Key: HIVE-4534
 URL: https://issues.apache.org/jira/browse/HIVE-4534
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4534.1.patch, HIVE-4534.2.patch


 See file IsNotNull.java in package 
 org.apache.hadoop.hive.ql.exec.vector.expressions
 It never looks at the noNulls flag on the input vector, but accesses the 
 isNull[] array anyway. This can yield incorrect results.
 isRepeating and noNulls are not set in the output, which can also cause wrong 
 results.



[jira] [Updated] (HIVE-4534) IsNotNull and NotCol incorrectly handle nulls.

2013-05-20 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4534:
---

Affects Version/s: (was: vectorization-branch)

 IsNotNull and NotCol incorrectly handle nulls.
 --

 Key: HIVE-4534
 URL: https://issues.apache.org/jira/browse/HIVE-4534
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4534.1.patch, HIVE-4534.2.patch


 See file IsNotNull.java in package 
 org.apache.hadoop.hive.ql.exec.vector.expressions
 It never looks at the noNulls flag on the input vector, but accesses the 
 isNull[] array anyway. This can yield incorrect results.
 isRepeating and noNulls are not set in the output, which can also cause wrong 
 results.



[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)

2013-05-20 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4472:
---

Status: Patch Available  (was: Open)

 OR, NOT Filter logic can lose an array, and always takes time 
 O(VectorizedRowBatch.DEFAULT_SIZE)
 

 Key: HIVE-4472
 URL: https://issues.apache.org/jira/browse/HIVE-4472
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, 
 HIVE-4472.4.patch


 The issue is in file FilterExprOrExpr.java and FilterNotExpr.java.
 I posted a review for you at 
 https://reviews.apache.org/r/10752/
 I think there is a bug related to sharing of an array of integers. Also, one 
 algorithm step always takes O(DEFAULT_BATCH_SIZE) time. If n is smaller than 
 DEFAULT_BATCH_SIZE then this is a performance issue. 



[jira] [Updated] (HIVE-4537) select * fails on orc table when vectorization is enabled

2013-05-21 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4537:
---

Status: Patch Available  (was: Open)

 select * fails on orc table when vectorization is enabled 
 --

 Key: HIVE-4537
 URL: https://issues.apache.org/jira/browse/HIVE-4537
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Tony Murphy
Assignee: Sarvesh Sakalanaga
 Attachments: Hive-4537.0.patch


 hive> select * from intdataorc;
 OK
 Failed with exception 
 java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error 
 evaluating cint0
 Time taken: 0.213 seconds



[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)

2013-05-22 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4472:
---

Attachment: HIVE-4472.5.patch

Same patch as the previous one, except that the fix to 
TestConstantVectorExpression is removed because that is taken care of by 
HIVE-4553.

 OR, NOT Filter logic can lose an array, and always takes time 
 O(VectorizedRowBatch.DEFAULT_SIZE)
 

 Key: HIVE-4472
 URL: https://issues.apache.org/jira/browse/HIVE-4472
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, 
 HIVE-4472.4.patch, HIVE-4472.5.patch


 The issue is in file FilterExprOrExpr.java and FilterNotExpr.java.
 I posted a review for you at 
 https://reviews.apache.org/r/10752/
 I think there is a bug related to sharing of an array of integers. Also, one 
 algorithm step always takes O(DEFAULT_BATCH_SIZE) time. If n is smaller than 
 DEFAULT_BATCH_SIZE then this is a performance issue. 



[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results

2013-05-22 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664817#comment-13664817
 ] 

Jitendra Nath Pandey commented on HIVE-4592:


Same issue exists in many other templates. I think we should fix them too in 
the same jira.
Also, most of these templates assume that noNulls=false and isRepeating=true 
means all values are null.

 ColumnArithmeticColumn.txt template never sets output isNull to true; can 
 give wrong results
 

 Key: HIVE-4592
 URL: https://issues.apache.org/jira/browse/HIVE-4592
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: vectorization-branch


 ColumnArithmeticColumn.txt should set the output column's noNulls flag to 
 true if neither input column has nulls, but it does not do that. This can 
 lead to wrong results if the noNulls flag was set to false in a previous use 
 of the batch.
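
 The flag handling described above can be sketched for a column-plus-column 
 addition. This is an illustration only: `Col` is a hypothetical stand-in for 
 a column vector, and the isRepeating case and the selection vector are left 
 out. It is not the text of the ColumnArithmeticColumn.txt template.

```java
/** Hypothetical model of null propagation in column-column arithmetic. */
final class ColumnAddSketch {
    /** Minimal stand-in for a column vector. */
    static final class Col {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        Col(int size) { vector = new long[size]; isNull = new boolean[size]; }
    }

    static void add(Col a, Col b, Col out, int n) {
        // Always reset the flag: the output vector may be reused from a previous batch.
        out.noNulls = a.noNulls && b.noNulls;
        for (int i = 0; i < n; i++) {
            // Read isNull[] only when the input's noNulls flag says it is meaningful.
            boolean aNull = !a.noNulls && a.isNull[i];
            boolean bNull = !b.noNulls && b.isNull[i];
            out.isNull[i] = aNull || bNull;
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```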



[jira] [Commented] (HIVE-4599) VectorGroupByOperator steals the non-vectorized children and crashes query if vectorization fails

2013-05-23 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665549#comment-13665549
 ] 

Jitendra Nath Pandey commented on HIVE-4599:


I would recommend putting in VectorReduceSinkOperator, because that will make 
the vectorized map side and the non-vectorized reduce side work together for 
non-GBy queries too.

 VectorGroupByOperator steals the non-vectorized children and crashes query if 
 vectorization fails
 -

 Key: HIVE-4599
 URL: https://issues.apache.org/jira/browse/HIVE-4599
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu

 Have the VGBy clone its own row mode children or implement vector mode 
 output (including VectorReduceSinkOperator)



[jira] [Created] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-23 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4603:
--

 Summary: VectorSelectOperator projections change the index of 
columns for subsequent operators.
 Key: HIVE-4603
 URL: https://issues.apache.org/jira/browse/HIVE-4603
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey






[jira] [Updated] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-23 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4603:
---

Attachment: HIVE-4603.1.patch

Initial patch; the unit test still needs to be fixed. I will upload another patch.

 VectorSelectOperator projections change the index of columns for subsequent 
 operators.
 --

 Key: HIVE-4603
 URL: https://issues.apache.org/jira/browse/HIVE-4603
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4603.1.patch






[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results

2013-05-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666031#comment-13666031
 ] 

Jitendra Nath Pandey commented on HIVE-4592:


Long-long division is handled specially, as it is cast to double division. 
These expressions are no longer generated using templates. Please add the fix 
to those too. 
  They are located in: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/

 ColumnArithmeticColumn.txt template never sets output isNull to true; can 
 give wrong results
 

 Key: HIVE-4592
 URL: https://issues.apache.org/jira/browse/HIVE-4592
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: vectorization-branch

 Attachments: HIVE-4592.1.patch, HIVE-4592.3.patch


 ColumnArithmeticColumn.txt should set the output column's noNulls flag to 
 true if neither input column has nulls, but it does not do that. This can 
 lead to wrong results if the noNulls flag was set to false in a previous use 
 of the batch.



[jira] [Updated] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4603:
---

Attachment: HIVE-4603.2.patch

 VectorSelectOperator projections change the index of columns for subsequent 
 operators.
 --

 Key: HIVE-4603
 URL: https://issues.apache.org/jira/browse/HIVE-4603
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4603.1.patch, HIVE-4603.2.patch






[jira] [Commented] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666832#comment-13666832
 ] 

Jitendra Nath Pandey commented on HIVE-4603:


New patch with unit test fixed.

 VectorSelectOperator projections change the index of columns for subsequent 
 operators.
 --

 Key: HIVE-4603
 URL: https://issues.apache.org/jira/browse/HIVE-4603
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4603.1.patch, HIVE-4603.2.patch






[jira] [Updated] (HIVE-4603) VectorSelectOperator projections change the index of columns for subsequent operators.

2013-05-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4603:
---

Description: VectorSelectOperator projections change the index of columns 
for subsequent operators.

 VectorSelectOperator projections change the index of columns for subsequent 
 operators.
 --

 Key: HIVE-4603
 URL: https://issues.apache.org/jira/browse/HIVE-4603
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4603.1.patch, HIVE-4603.2.patch


 VectorSelectOperator projections change the index of columns for subsequent 
 operators.



[jira] [Updated] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc files.

2013-05-31 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4640:
---

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-4160

 CommonOrcInputFormat should be the default input format for Orc files.
 --

 Key: HIVE-4640
 URL: https://issues.apache.org/jira/browse/HIVE-4640
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey

 CommonOrcInputFormat should be the default input format for Orc files, so 
 that default orc format tables work with both the vectorized and 
 non-vectorized paths.



[jira] [Created] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc files.

2013-05-31 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4640:
--

 Summary: CommonOrcInputFormat should be the default input format 
for Orc files.
 Key: HIVE-4640
 URL: https://issues.apache.org/jira/browse/HIVE-4640
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


CommonOrcInputFormat should be the default input format for Orc files, so that 
default orc format tables work with both the vectorized and non-vectorized 
paths.



[jira] [Updated] (HIVE-4640) CommonOrcInputFormat should be the default input format for Orc tables.

2013-05-31 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4640:
---

Summary: CommonOrcInputFormat should be the default input format for Orc 
tables.  (was: CommonOrcInputFormat should be the default input format for Orc 
files.)

 CommonOrcInputFormat should be the default input format for Orc tables.
 ---

 Key: HIVE-4640
 URL: https://issues.apache.org/jira/browse/HIVE-4640
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey

 CommonOrcInputFormat should be the default input format for Orc files, so 
 that default orc format tables work with both the vectorized and 
 non-vectorized paths.



[jira] [Created] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

2013-06-03 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4649:
--

 Summary: Unit test failure in 
TestColumnScalarOperationVectorExpressionEvaluation 
 Key: HIVE-4649
 URL: https://issues.apache.org/jira/browse/HIVE-4649
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


The test fails due to a bug in ColumnCompareScalar.txt



[jira] [Updated] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

2013-06-03 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4649:
---

Attachment: HIVE-4649.1.patch

Attached patch fixes the issue.

 Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation 
 -

 Key: HIVE-4649
 URL: https://issues.apache.org/jira/browse/HIVE-4649
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4649.1.patch


 The test fails due to a bug in ColumnCompareScalar.txt



[jira] [Created] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4655:
--

 Summary: Vectorization not working with negative constants, hive 
doesn't fold constants.
 Key: HIVE-4655
 URL: https://issues.apache.org/jira/browse/HIVE-4655
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


  The Hive optimizer doesn't fold constants; however, the vectorized code path 
assumes that constants have been folded. This should be fixed in the Hive 
optimizer. 
  In this jira we just fix the vectorization path to handle folding for 
negative constants. This is needed because the Hive plan treats negative 
constants as unary-minus expressions on constants, so these expressions also 
need constant folding.
This fix will become redundant once constant folding is appropriately 
implemented in the Hive optimizer. (HIVE-746)
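
The folding step described above can be sketched on a toy expression tree. All of the types below (`Expr`, `Const`, `Column`, `Negate`) are hypothetical and are not Hive's plan classes; the sketch only shows how a unary minus applied to a constant might be collapsed into a single negative constant before vectorized expressions are chosen.

```java
/** Hypothetical expression tree with folding of unary minus over constants. */
final class NegativeConstantFoldSketch {
    interface Expr {}

    static final class Const implements Expr {
        final long value;
        Const(long value) { this.value = value; }
    }

    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
    }

    static final class Negate implements Expr {
        final Expr child;
        Negate(Expr child) { this.child = child; }
    }

    /** Replaces Negate(Const(v)) with Const(-v); anything else is left as it is. */
    static Expr fold(Expr e) {
        if (e instanceof Negate) {
            Expr child = fold(((Negate) e).child);
            if (child instanceof Const) {
                return new Const(-((Const) child).value);
            }
            return new Negate(child);
        }
        return e;
    }
}
```

Folding the child first means nested negations of a constant also collapse, while a negated column reference stays an ordinary unary-minus expression.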



[jira] [Updated] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4655:
---

Attachment: HIVE-4655.1.patch

 Vectorization not working with negative constants, hive doesn't fold 
 constants.
 ---

 Key: HIVE-4655
 URL: https://issues.apache.org/jira/browse/HIVE-4655
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4655.1.patch


   The Hive optimizer doesn't fold constants; however, the vectorized code path 
 assumes that constants have been folded. This should be fixed in the Hive 
 optimizer. 
   In this jira we just fix the vectorization path to handle folding for 
 negative constants. This is needed because the Hive plan treats negative 
 constants as unary-minus expressions on constants, so these expressions also 
 need constant folding.
 This fix will become redundant once constant folding is appropriately 
 implemented in the Hive optimizer. (HIVE-746)



[jira] [Commented] (HIVE-4655) Vectorization not working with negative constants, hive doesn't fold constants.

2013-06-04 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675103#comment-13675103
 ] 

Jitendra Nath Pandey commented on HIVE-4655:


Review board entry.
https://reviews.apache.org/r/11634/

 Vectorization not working with negative constants, hive doesn't fold 
 constants.
 ---

 Key: HIVE-4655
 URL: https://issues.apache.org/jira/browse/HIVE-4655
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4655.1.patch


   The Hive optimizer doesn't fold constants; however, the vectorized code path 
 assumes that constants have been folded. This should be fixed in the Hive 
 optimizer. 
   In this jira we just fix the vectorization path to handle folding for 
 negative constants. This is needed because the Hive plan treats negative 
 constants as unary-minus expressions on constants, so these expressions also 
 need constant folding.
 This fix will become redundant once constant folding is appropriately 
 implemented in the Hive optimizer. (HIVE-746)



[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676826#comment-13676826
 ] 

Jitendra Nath Pandey commented on HIVE-4665:


We should use Writables from org.apache.hadoop.hive.serde2.io.* as much as 
possible. 
Writables from hadoop.io should be used only when an implementation in hive is 
not available.

Also, the strings should use Text instead of BytesWritable.

 error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
 --

 Key: HIVE-4665
 URL: https://issues.apache.org/jira/browse/HIVE-4665
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, 
 dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, 
 dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, 
 dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, 
 dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, 
 dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 
 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, 
 dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 
 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, 
 mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW 
 FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; 
 create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 
 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from 
 FactSqlEngineAM4712;
 hive> select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc 
 group by ddate;
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks not specified. Estimated from input data size: 3
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Validating if vectorized execution is applicable
 Going down the vectorization path
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...
 Starting Job = job_201306041757_0016, Tracking URL = 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill 
 job_201306041757_0016
 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 
 3
 2013-06-05 10:03:06,022 Stage-1 map = 0%,  reduce = 0%
 2013-06-05 10:03:51,142 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306041757_0016 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
 Examining task ID: task_201306041757_0016_m_09 (and more) from job 
 job_201306041757_0016
 Task with the most failures(4):
 -
 Task ID:
   task_201306041757_0016_m_00
 URL:
   
 http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable 
 cannot be cast to org.apache.hadoop.io.Text
 at 
 

[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4665:
---

Attachment: HIVE-4665.1.patch

 error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
 --

 Key: HIVE-4665
 URL: https://issues.apache.org/jira/browse/HIVE-4665
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4665.1.patch


 CREATE EXTERNAL TABLE FactSqlEngineAM4712( dAppVersionBuild int, 
 dAppVersionBuildUNMAPPED32449 int, dAppVersionMajor int, 
 dAppVersionMinor32447 int, dAverageCols23083 int, dDatabaseSize23090 int, 
 dDate string, dIsInternalMSFT16431 int, dLockEscalationDisabled23323 int, 
 dLockEscalationEnabled23324 int, dMachineID int, dNumberTables23008 int, 
 dNumCompressionPagePartitions23088 int, dNumCompressionRowPartitions23089 
 int, dNumIndexFragmentation23084 int, dNumPartitionedTables23098 int, 
 dNumPartitions23099 int, dNumTablesClusterIndex23010 int, dNumTablesHeap23100 
 int, dSessionType5618 int, dSqlEdition8213 int, dTempDbSize23103 int, 
 mNumColumnStoreIndexesVar48171 bigint, mOccurrences int, mRowFlag int) ROW 
 FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION '/user/ehans/SQM'; 
 create table FactSqlEngineAM_vec_ORC ROW FORMAT SERDE 
 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' stored as INPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat' OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' AS select * from 
 FactSqlEngineAM4712;
 hive> select ddate, max(dnumbertables23008) from factsqlengineam_vec_orc 
 group by ddate;
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks not specified. Estimated from input data size: 3
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Validating if vectorized execution is applicable
 Going down the vectorization path
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...
 Starting Job = job_201306041757_0016, Tracking URL = 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill 
 job_201306041757_0016
 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 
 3
 2013-06-05 10:03:06,022 Stage-1 map = 0%,  reduce = 0%
 2013-06-05 10:03:51,142 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306041757_0016 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306041757_0016
 Examining task ID: task_201306041757_0016_m_09 (and more) from job 
 job_201306041757_0016
 Task with the most failures(4):
 -
 Task ID:
   task_201306041757_0016_m_00
 URL:
   
 http://localhost:50030/taskdetails.jsp?jobid=job_201306041757_0016&tipid=task_201306041757_0016_m_00
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable 
 cannot be cast to org.apache.hadoop.io.Text
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:40)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hashCode(ObjectInspectorUtils.java:481)
 at 
 

[jira] [Resolved] (HIVE-4653) Favor serde2.io Writable classes over hadoop.io ones

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HIVE-4653.


Resolution: Duplicate

HIVE-4665 will fix this.

 Favor serde2.io Writable classes over hadoop.io ones
 

 Key: HIVE-4653
 URL: https://issues.apache.org/jira/browse/HIVE-4653
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor

 The Writables are originally from org.apache.hadoop.io. I tend to assume 
 that they have been re-defined in hive if the original implementation was not 
 considered good enough.
 However, I don't understand why some are defined twice in hive itself. I 
 noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. 
 The ByteWritable in serde2.io is referred to in a bunch of places. 
 Therefore, I would suggest just using the one in serde2.io.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4654) Remove unused org.apache.hadoop.hive.ql.exec Writables

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676842#comment-13676842
 ] 

Jitendra Nath Pandey commented on HIVE-4654:


I think this is more general than the vectorization effort. We should generally 
remove unused classes.
I would suggest removing it from the subtasks of HIVE-4160 and making it a 
top-level bug.

 Remove unused org.apache.hadoop.hive.ql.exec Writables
 --

 Key: HIVE-4654
 URL: https://issues.apache.org/jira/browse/HIVE-4654
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Priority: Minor

 The Writables are originally from org.apache.hadoop.io. I tend to assume that 
 they have been re-defined in hive if the original implementation was not 
 considered good enough.
 However, I don't understand why some are defined twice in hive itself. I 
 noticed that ByteWritable in o.a.h.hive.ql.exec is not being used anywhere. 
 The ByteWritable in serde2.io is referred to in a bunch of places. 
 Therefore, I would suggest just using the one in serde2.io. 



[jira] [Commented] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13676851#comment-13676851
 ] 

Jitendra Nath Pandey commented on HIVE-4665:


Patch uploaded.
Review board: https://reviews.apache.org/r/11666/

 error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
 --

 Key: HIVE-4665
 URL: https://issues.apache.org/jira/browse/HIVE-4665
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4665.1.patch



[jira] [Created] (HIVE-4673) Use VectorExpessionWriter to write column vectors into Writables.

2013-06-06 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4673:
--

 Summary: Use VectorExpessionWriter to write column vectors into 
Writables.
 Key: HIVE-4673
 URL: https://issues.apache.org/jira/browse/HIVE-4673
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


The VectorExpressionWriter interface should be used to write column vectors 
into Writables. VectorExpressionWriter supports all primitive datatypes, and 
using it will make the vector select and vector group-by operators consistent.
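
As a rough illustration of the idea (one writer per column type, shared by 
every operator that needs to turn vector rows into objects), here is a minimal 
self-contained sketch. All class and method names below (Column, LongColumn, 
BytesColumn, ValueWriter, writerFor, writeRow) are hypothetical stand-ins 
chosen for this example; they are not the actual Hive classes or the 
VectorExpressionWriter API, and plain Java objects stand in for Writables.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative stand-ins only; these are NOT the real Hive classes.
final class VectorWriterSketch {

    /** Stand-in for a column vector: one value slot per row, plus null flags. */
    abstract static class Column {
        final boolean[] isNull;
        Column(boolean[] isNull) { this.isNull = isNull; }
    }

    static final class LongColumn extends Column {
        final long[] vector;
        LongColumn(long[] vector, boolean[] isNull) { super(isNull); this.vector = vector; }
    }

    static final class BytesColumn extends Column {
        final byte[][] vector;
        BytesColumn(byte[][] vector, boolean[] isNull) { super(isNull); this.vector = vector; }
    }

    /** Hypothetical writer contract: turn one row of one column into an object. */
    interface ValueWriter {
        Object writeValue(Column column, int row);
    }

    /** Picks a writer by column type, so every caller converts values the same way. */
    static ValueWriter writerFor(Column column) {
        if (column instanceof LongColumn) {
            return (c, r) -> c.isNull[r] ? null : Long.valueOf(((LongColumn) c).vector[r]);
        }
        if (column instanceof BytesColumn) {
            return (c, r) -> c.isNull[r]
                    ? null
                    : new String(((BytesColumn) c).vector[r], StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("unsupported column type: " + column.getClass());
    }

    /** Materializes a single row of a batch using one writer per column. */
    static Object[] writeRow(Column[] columns, int row) {
        Object[] out = new Object[columns.length];
        for (int i = 0; i < columns.length; i++) {
            out[i] = writerFor(columns[i]).writeValue(columns[i], row);
        }
        return out;
    }

    public static void main(String[] args) {
        Column[] batch = {
            new LongColumn(new long[] {7L, 8L}, new boolean[] {false, true}),
            new BytesColumn(new byte[][] {
                "2013-06-05".getBytes(StandardCharsets.UTF_8),
                "2013-06-06".getBytes(StandardCharsets.UTF_8)
            }, new boolean[] {false, false})
        };
        System.out.println(Arrays.toString(writeRow(batch, 0)));
        System.out.println(Arrays.toString(writeRow(batch, 1)));
    }
}
```

The point of the sketch is only that the select path and the group-by path 
could both call the same writeRow-style helper instead of each carrying its 
own per-type conversion code.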



[jira] [Updated] (HIVE-4673) Use VectorExpessionWriter to write column vectors into Writables.

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4673:
---

Attachment: HIVE-4673.1.patch

 Use VectorExpessionWriter to write column vectors into Writables.
 -

 Key: HIVE-4673
 URL: https://issues.apache.org/jira/browse/HIVE-4673
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4673.1.patch


 The VectorExpressionWriter interface should be used to write column vectors 
 into Writables. VectorExpressionWriter supports all primitive datatypes, and 
 using it will make the vector select and vector group-by operators 
 consistent.



[jira] [Updated] (HIVE-4673) Use VectorExpessionWriter to write column vectors into Writables.

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4673:
---

Description: VectorExpressionWriter interface should be used to write 
column vectors into Writables. VectorExpressionWriter supports all primitive 
datatypes and this will make vector select operator and vector group by 
operators consistent.  (was: VectorExpressionWriter interface should be used to 
write column vectors into Writables. VectorExpressionWriter supports all 
primitive datatypes and 
this will make vector select operator and vector group by operators consistent.)

 Use VectorExpessionWriter to write column vectors into Writables.
 -

 Key: HIVE-4673
 URL: https://issues.apache.org/jira/browse/HIVE-4673
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4673.1.patch


 The VectorExpressionWriter interface should be used to write column vectors 
 into Writables. VectorExpressionWriter supports all primitive datatypes, and 
 using it will make the vector select and vector group-by operators consistent.



[jira] [Updated] (HIVE-4665) error at VectorExecMapper.close in group-by-agg query over ORC, vectorized

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4665:
---

Attachment: HIVE-4665.2.patch

Uploaded another patch, which fixes the object inspector for 
VectorUDAFMinMaxString. For text we should use WritableStringObjectInspector.

I have verified that with this change 
select min(stringCol) from table also works. 
Without this fix, it fails.
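
To make the failure mode concrete: an object inspector is only valid for the 
concrete value type it was written for, so pairing a value with the wrong 
inspector surfaces as a ClassCastException at read time, as in the trace 
quoted below. The following is a minimal sketch under stated assumptions: 
TextValue, BytesValue, StringInspector and readOrReport are hypothetical 
stand-ins invented for this example, not the Hadoop Text/BytesWritable 
classes or the Hive ObjectInspector API.

```java
import java.nio.charset.StandardCharsets;

// Stand-in types for illustration; NOT the Hadoop or Hive classes.
final class InspectorMismatchSketch {

    static final class TextValue {
        final String value;
        TextValue(String value) { this.value = value; }
    }

    static final class BytesValue {
        final byte[] bytes;
        BytesValue(byte[] bytes) { this.bytes = bytes; }
    }

    /** Hypothetical inspector: knows how to read exactly one concrete value type. */
    interface StringInspector {
        String read(Object value);
    }

    static final StringInspector TEXT_INSPECTOR = v -> ((TextValue) v).value;
    static final StringInspector BYTES_INSPECTOR =
            v -> new String(((BytesValue) v).bytes, StandardCharsets.UTF_8);

    /** Reads a value, reporting a type mismatch instead of propagating it. */
    static String readOrReport(StringInspector inspector, Object value) {
        try {
            return inspector.read(value);
        } catch (ClassCastException e) {
            return "mismatch: inspector cannot read " + value.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // In this sketch the aggregate emits a text-typed result.
        Object minResult = new TextValue("2013-06-05");
        System.out.println(readOrReport(TEXT_INSPECTOR, minResult));  // matching inspector
        System.out.println(readOrReport(BYTES_INSPECTOR, minResult)); // mismatched inspector
    }
}
```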

 error at VectorExecMapper.close in group-by-agg query over ORC, vectorized
 --

 Key: HIVE-4665
 URL: https://issues.apache.org/jira/browse/HIVE-4665
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4665.1.patch, HIVE-4665.2.patch



[jira] [Assigned] (HIVE-4599) VectorGroupByOperator steals the non-vectorized children and crashes query if vectorization fails

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4599:
--

Assignee: Jitendra Nath Pandey  (was: Remus Rusanu)

 VectorGroupByOperator steals the non-vectorized children and crashes query if 
 vectorization fails
 -

 Key: HIVE-4599
 URL: https://issues.apache.org/jira/browse/HIVE-4599
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Jitendra Nath Pandey

 Have the VGBy clone its own row-mode children or implement vector-mode 
 output (including VectorReduceSinkOperator)



[jira] [Updated] (HIVE-4599) VectorGroupByOperator steals the non-vectorized children and crashes query if vectorization fails

2013-06-06 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4599:
---

Attachment: HIVE-4599.1.patch

Patch uploaded. The non-vectorized children are cloned.
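
For readers unfamiliar with the problem, the sketch below shows in general 
terms why re-parenting shared children breaks the original plan and why 
cloning avoids it. It is an assumption-laden illustration, not the contents 
of the attached patch: the Op class and the vectorizeByStealing, 
vectorizeByCloning and planIntact helpers are hypothetical names made up for 
this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator-tree model; NOT the Hive Operator classes.
final class CloneChildrenSketch {

    static final class Op {
        final String name;
        Op parent;
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        /** Attaches a child and points its parent link at this operator. */
        void addChild(Op child) {
            child.parent = this;
            children.add(child);
        }

        /** Copies this operator and its whole subtree. */
        Op deepCopy() {
            Op copy = new Op(name);
            for (Op child : children) {
                copy.addChild(child.deepCopy());
            }
            return copy;
        }
    }

    /** Re-parents the row-mode children: the original operator loses them. */
    static Op vectorizeByStealing(Op rowOp) {
        Op vectorOp = new Op("Vector" + rowOp.name);
        for (Op child : rowOp.children) {
            vectorOp.addChild(child);
        }
        return vectorOp;
    }

    /** Gives the vector operator its own copies: the original plan is untouched. */
    static Op vectorizeByCloning(Op rowOp) {
        Op vectorOp = new Op("Vector" + rowOp.name);
        for (Op child : rowOp.children) {
            vectorOp.addChild(child.deepCopy());
        }
        return vectorOp;
    }

    /** True if every child in the subtree still points back at its parent. */
    static boolean planIntact(Op op) {
        for (Op child : op.children) {
            if (child.parent != op || !planIntact(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Op groupBy = new Op("GroupBy");
        groupBy.addChild(new Op("ReduceSink"));

        vectorizeByCloning(groupBy);
        System.out.println("after cloning, row-mode plan intact: " + planIntact(groupBy));

        vectorizeByStealing(groupBy);
        System.out.println("after stealing, row-mode plan intact: " + planIntact(groupBy));
    }
}
```

With cloning, a failed vectorization attempt can simply be discarded and the 
row-mode plan run as it was.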

 VectorGroupByOperator steals the non-vectorized children and crashes query if 
 vectorization fails
 -

 Key: HIVE-4599
 URL: https://issues.apache.org/jira/browse/HIVE-4599
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4599.1.patch


 Have the VGBy clone its own row-mode children or implement vector-mode 
 output (including VectorReduceSinkOperator)



[jira] [Commented] (HIVE-4685) query using LIKE does not vectorize, then crashes

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678206#comment-13678206
 ] 

Jitendra Nath Pandey commented on HIVE-4685:


I suspect this is the same as HIVE-4599. I have a patch on HIVE-4599 that 
should hopefully fix the query crash.


 query using LIKE does not vectorize, then crashes
 -

 Key: HIVE-4685
 URL: https://issues.apache.org/jira/browse/HIVE-4685
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson

 The query 
   select count(ddate) from factsqlengineam_vec_orc where ddate like "2013%";
 Starts up but does not run in vectorization mode. Then during non-vectorized 
 execution it crashes.
 Expected result:
 Query runs vectorized and runs successfully.
 Actual result:
 hive> select count(ddate) from factsqlengineam_vec_orc where ddate like 
 "2013%";
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Validating if vectorized execution is applicable
 Cannot vectorize the plan: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Udf: GenericUDFBridge, is not supported
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...
 Starting Job = job_201306061504_0041, Tracking URL = 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0041
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill 
 job_201306061504_0041
 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 
 1
 2013-06-07 10:41:31,544 Stage-1 map = 0%,  reduce = 0%
 2013-06-07 10:42:01,677 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306061504_0041 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_09 (and more) from job 
 job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_02 (and more) from job 
 job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_00 (and more) from job 
 job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_04 (and more) from job 
 job_201306061504_0041
 Task with the most failures(4):
 -
 Task ID:
   task_201306061504_0041_m_06
 URL:
   
 http://localhost:50030/taskdetails.jsp?jobid=job_201306061504_0041&tipid=task_201306061504_0041_m_06
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
 at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
 ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
 at 

[jira] [Commented] (HIVE-4685) query using LIKE does not vectorize

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678410#comment-13678410
 ] 

Jitendra Nath Pandey commented on HIVE-4685:


Those messages are now logged in debug mode.

 query using LIKE does not vectorize
 ---

 Key: HIVE-4685
 URL: https://issues.apache.org/jira/browse/HIVE-4685
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson

 The query 
   select count(ddate) from factsqlengineam_vec_orc where ddate like "2013%";
 Starts up but does not run in vectorization mode. Then during non-vectorized 
 execution it crashes.
 Expected result:
 Query runs vectorized and runs successfully.
 Actual result:
 hive> select count(ddate) from factsqlengineam_vec_orc where ddate like 
 "2013%";
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Validating if vectorized execution is applicable
 Cannot vectorize the plan: org.apache.hadoop.hive.ql.metadata.HiveException: Udf: GenericUDFBridge, is not supported
 java.lang.InstantiationException: org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement ArrayList.add(VectorGroupByOperator);
 Continuing ...
 Starting Job = job_201306061504_0041, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0041
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill job_201306061504_0041
 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 1
 2013-06-07 10:41:31,544 Stage-1 map = 0%,  reduce = 0%
 2013-06-07 10:42:01,677 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306061504_0041 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_09 (and more) from job job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_02 (and more) from job job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_00 (and more) from job job_201306061504_0041
 Examining task ID: task_201306061504_0041_m_04 (and more) from job job_201306061504_0041
 Task with the most failures(4):
 -
 Task ID:
   task_201306061504_0041_m_06
 URL:
   http://localhost:50030/taskdetails.jsp?jobid=job_201306061504_0041&tipid=task_201306061504_0041_m_06
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Error in configuring object
   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
   ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 

[jira] [Commented] (HIVE-4599) VectorGroupByOperator steals the non-vectorized children and crashes query if vectorization fails

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678411#comment-13678411
 ] 

Jitendra Nath Pandey commented on HIVE-4599:


Those messages are logged at debug level.

 VectorGroupByOperator steals the non-vectorized children and crashes query if 
 vectorization fails
 -

 Key: HIVE-4599
 URL: https://issues.apache.org/jira/browse/HIVE-4599
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4599.1.patch


 Have the VGBy clone its own row-mode children or implement vector-mode 
 output (including VectorReduceSinkOperator).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4678) second clause of AND filter not applied for vectorized execution

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678489#comment-13678489
 ] 

Jitendra Nath Pandey commented on HIVE-4678:


This issue is the same as HIVE-4680; I will fix it in the same JIRA.

 second clause of AND filter not applied for vectorized execution
 

 Key: HIVE-4678
 URL: https://issues.apache.org/jira/browse/HIVE-4678
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 The query
 select ddate, dnumbertables23008 from factsqlengineam_vec_orc where ddate = 
 '2013-01-08 00:00:00' and dnumbertables23008 = 1052436;
 returns rows where dnumbertables23008 != 1052436.
 Actual results:
 636087 rows 
 Sample:
 ...
 2013-01-08 00:00:00 0
 2013-01-08 00:00:00 0
 2013-01-08 00:00:00 108
 2013-01-08 00:00:00 0
 2013-01-08 00:00:00 0
 2013-01-08 00:00:00 1625
 2013-01-08 00:00:00 210
 2013-01-08 00:00:00 0
 2013-01-08 00:00:00 209
 2013-01-08 00:00:00 0
 ...
 Expected results:
 Either no rows are returned, or all rows have 1052436 in the second column.
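
The expected AND semantics can be sketched over a selection vector, where each clause narrows the set of row indices that survived the previous one. This is a simplified model and not Hive's actual vectorized filter code; the class AndFilterSketch, its methods, and the sample values are hypothetical.

```java
import java.util.Arrays;

public class AndFilterSketch {
    /** Returns the indices 0..size-1, i.e. a selection that covers every row. */
    static int[] allRows(int size) {
        int[] selected = new int[size];
        for (int i = 0; i < size; i++) {
            selected[i] = i;
        }
        return selected;
    }

    /** Keeps only the already-selected rows whose column value equals target. */
    static int[] filterEquals(long[] column, long target, int[] selected) {
        int[] kept = new int[selected.length];
        int n = 0;
        for (int row : selected) {
            if (column[row] == target) {
                kept[n++] = row;
            }
        }
        return Arrays.copyOf(kept, n);
    }

    /** AND: the second clause must run on the output of the first clause. */
    static int[] filterAnd(long[] col1, long target1, long[] col2, long target2) {
        int[] selected = filterEquals(col1, target1, allRows(col1.length));
        return filterEquals(col2, target2, selected);
    }

    public static void main(String[] args) {
        // Hypothetical data: a date encoded as a long, and a counter column.
        long[] day = {20130108L, 20130108L, 20130107L, 20130108L};
        long[] tables = {0L, 1052436L, 1052436L, 108L};
        // Only row 1 satisfies both clauses, so this prints [1].
        System.out.println(Arrays.toString(filterAnd(day, 20130108L, tables, 1052436L)));
    }
}
```

If the second call to filterEquals were skipped, rows 0, 1 and 3 would all be returned, which matches the symptom reported here.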



[jira] [Assigned] (HIVE-4680) second clause of OR filter not applied in vectorized query execution

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4680:
--

Assignee: Jitendra Nath Pandey

 second clause of OR filter not applied in vectorized query execution
 

 Key: HIVE-4680
 URL: https://issues.apache.org/jira/browse/HIVE-4680
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 Query:
 select ddate, count(*) from factsqlengineam_vec_orc where ddate = 
 '2012-05-19 00:00:00' OR ddate = '2012-05-20 00:00:00' group by ddate;
 Actual result:
 OK
 2012-05-19 00:00:00 528741
 Expected result:
 There should be two rows, one for each day in the OR clause of the query.
 The following query does return a row, so there is data for 2012-05-20:
 select ddate, count(*) from factsqlengineam_vec_orc where ddate = 
 '2012-05-20 00:00:00' group by ddate;
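
For comparison, the expected OR semantics can be sketched as two passes whose results are merged: the second clause only needs to examine rows the first clause rejected, but its matches must end up in the final selection. This is an illustrative model under assumed names (OrFilterSketch, filterOr, and the encoded dates are hypothetical), not the code in the vectorization branch.

```java
import java.util.Arrays;

public class OrFilterSketch {
    /** Returns, in row order, the rows whose value equals first OR second. */
    static int[] filterOr(long[] column, long first, long second) {
        boolean[] matched = new boolean[column.length];
        // Pass 1: rows accepted by the first clause.
        for (int row = 0; row < column.length; row++) {
            matched[row] = column[row] == first;
        }
        // Pass 2: the second clause looks only at rows that pass 1 rejected.
        for (int row = 0; row < column.length; row++) {
            if (!matched[row] && column[row] == second) {
                matched[row] = true;
            }
        }
        // Merge both passes into one selection, preserving row order.
        int[] kept = new int[column.length];
        int n = 0;
        for (int row = 0; row < column.length; row++) {
            if (matched[row]) {
                kept[n++] = row;
            }
        }
        return Arrays.copyOf(kept, n);
    }

    public static void main(String[] args) {
        long[] day = {20120519L, 20120520L, 20120521L, 20120520L};
        // Rows 0, 1 and 3 match one of the two days, so this prints [0, 1, 3].
        System.out.println(Arrays.toString(filterOr(day, 20120519L, 20120520L)));
    }
}
```

Dropping pass 2 reproduces the reported behaviour: only the rows for the first day survive, so the group-by yields a single row.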



[jira] [Assigned] (HIVE-4678) second clause of AND filter not applied for vectorized execution

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4678:
--

Assignee: Jitendra Nath Pandey

 second clause of AND filter not applied for vectorized execution
 

 Key: HIVE-4678
 URL: https://issues.apache.org/jira/browse/HIVE-4678
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey



[jira] [Resolved] (HIVE-4680) second clause of OR filter not applied in vectorized query execution

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HIVE-4680.


Resolution: Duplicate

It is the same issue as HIVE-4678.

 second clause of OR filter not applied in vectorized query execution
 

 Key: HIVE-4680
 URL: https://issues.apache.org/jira/browse/HIVE-4680
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey



[jira] [Updated] (HIVE-4678) second clause of AND, OR filter not applied for vectorized execution

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4678:
---

Summary: second clause of AND, OR filter not applied for vectorized 
execution  (was: second clause of AND filter not applied for vectorized 
execution)

 second clause of AND, OR filter not applied for vectorized execution
 

 Key: HIVE-4678
 URL: https://issues.apache.org/jira/browse/HIVE-4678
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey



[jira] [Assigned] (HIVE-4667) tpch query 1 fails with java.lang.ClassCastException

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4667:
--

Assignee: Jitendra Nath Pandey

 tpch query 1 fails with java.lang.ClassCastException
 

 Key: HIVE-4667
 URL: https://issues.apache.org/jira/browse/HIVE-4667
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch


 {noformat}
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector
   at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColSubtractLongScalar.evaluate(DoubleColSubtractLongScalar.java:46)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:69)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DoubleColMultiplyDoubleColumn.evaluate(DoubleColMultiplyDoubleColumn.java:41)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFSumDouble.aggregateInputSelection(VectorUDAFSumDouble.java:98)
   at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processAggregators(VectorGroupByOperator.java:174)
   at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:151)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:104)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:717)
   ... 9 more
 {noformat}
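
The kind of failure shown in the trace can be reproduced in miniature: an expression written for a double column downcasts its input, and the cast fails at run time when it is handed a long column instead. The sketch below uses small stand-in classes that only borrow the names from the trace; they are not Hive's classes, and the method doubleColSubtractScalar is hypothetical.

```java
public class ColumnCastSketch {
    /** Stand-in base type for a column of values. */
    static class ColumnVector {}

    /** Stand-in for a column holding longs. */
    static class LongColumnVector extends ColumnVector {
        final long[] vector;
        LongColumnVector(long... values) { vector = values; }
    }

    /** Stand-in for a column holding doubles. */
    static class DoubleColumnVector extends ColumnVector {
        final double[] vector;
        DoubleColumnVector(double... values) { vector = values; }
    }

    /** Subtracts a scalar from a column that is assumed to hold doubles. */
    static double[] doubleColSubtractScalar(ColumnVector input, long scalar) {
        // Unchecked assumption about the input type: throws
        // ClassCastException if input is actually a LongColumnVector.
        DoubleColumnVector in = (DoubleColumnVector) input;
        double[] out = new double[in.vector.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = in.vector[i] - scalar;
        }
        return out;
    }

    public static void main(String[] args) {
        // Works: the input really is a double column.
        System.out.println(doubleColSubtractScalar(new DoubleColumnVector(1.5, 2.5), 1)[0]);
        try {
            // Fails: the expression was chosen for the wrong column type.
            doubleColSubtractScalar(new LongColumnVector(1, 2), 1);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for the long column");
        }
    }
}
```

In this model the fix belongs at the point where the expression is chosen for a column, not inside the expression itself.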



[jira] [Updated] (HIVE-4667) tpch query 1 fails with java.lang.ClassCastException

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4667:
---

Attachment: HIVE-4667.1.patch

 tpch query 1 fails with java.lang.ClassCastException
 

 Key: HIVE-4667
 URL: https://issues.apache.org/jira/browse/HIVE-4667
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4667.1.patch




[jira] [Commented] (HIVE-4667) tpch query 1 fails with java.lang.ClassCastException

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678558#comment-13678558
 ] 

Jitendra Nath Pandey commented on HIVE-4667:


Patch uploaded. The patch includes the fix for HIVE-4678 as well.

 tpch query 1 fails with java.lang.ClassCastException
 

 Key: HIVE-4667
 URL: https://issues.apache.org/jira/browse/HIVE-4667
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4667.1.patch




[jira] [Created] (HIVE-4688) NPE in writing null values.

2013-06-07 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4688:
--

 Summary: NPE in writing null values.
 Key: HIVE-4688
 URL: https://issues.apache.org/jira/browse/HIVE-4688
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


VectorExpressionWriter throws NPE when writing null values.



[jira] [Resolved] (HIVE-4668) wrong results for query with modulo (%) in WHERE clause filter

2013-06-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HIVE-4668.


Resolution: Fixed

 wrong results for query with modulo (%) in WHERE clause filter
 --

 Key: HIVE-4668
 URL: https://issues.apache.org/jira/browse/HIVE-4668
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Sarvesh Sakalanaga

 select disinternalmsft16431, count(disinternalmsft16431) from 
 factsqlengineam_vec_orc where ddate = 2012-12 and ddate  2013-02 and 
 disinternalmsft16431 % 5 = 0 group by disinternalmsft16431
 Expected result:
 0   3160232
 5   33039254
 Actual result:
 0   8697033
 6   2706407 
 5   94709959
 There should be no result row for 6 because 6 % 5 != 0.
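
The expected behaviour of the modulo filter followed by the group-by count can be checked with a small reference computation. This is a plain row-at-a-time sketch with made-up input values; ModuloFilterSketch and countMultiples are hypothetical names, and nothing here is taken from the Hive code.

```java
import java.util.Map;
import java.util.TreeMap;

public class ModuloFilterSketch {
    /** Counts occurrences of each value v that satisfies v % divisor == 0. */
    static Map<Long, Long> countMultiples(long[] values, long divisor) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long v : values) {
            if (v % divisor == 0) {
                counts.merge(v, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] values = {0L, 5L, 6L, 5L, 0L, 11L};
        // 6 and 11 are filtered out, so this prints {0=2, 5=2}.
        System.out.println(countMultiples(values, 5L));
    }
}
```

A value such as 6 can only appear in the output if the filter is skipped or is applied to the wrong rows.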



[jira] [Updated] (HIVE-4688) NPE in writing null values.

2013-06-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4688:
---

Attachment: HIVE-4688.1.patch

Patch uploaded. For null values, we should return NullWritable instead of null.
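
The idea of returning a marker object instead of a bare null can be sketched as follows. Hadoop's NullWritable is a singleton obtained through NullWritable.get(); to keep the example free of Hadoop dependencies, a hypothetical NullMarker class stands in for it here, and writeValue is an illustrative method, not the one in VectorExpressionWriter.

```java
public class NullValueSketch {
    /** Stand-in for a singleton "null" marker such as Hadoop's NullWritable. */
    static final class NullMarker {
        static final NullMarker INSTANCE = new NullMarker();

        private NullMarker() {}

        @Override
        public String toString() {
            return "(null)";
        }
    }

    /** Returns a non-null object for every row, using the marker for null rows. */
    static Object writeValue(long[] values, boolean[] isNull, int row) {
        return isNull[row] ? NullMarker.INSTANCE : Long.valueOf(values[row]);
    }

    public static void main(String[] args) {
        long[] values = {7L, 0L};
        boolean[] isNull = {false, true};
        for (int row = 0; row < values.length; row++) {
            // Callers can invoke methods on the result without a null check.
            System.out.println(writeValue(values, isNull, row).toString());
        }
    }
}
```

Because every caller receives a real object, downstream code that serializes or prints the value no longer dereferences null.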

 NPE in writing null values.
 ---

 Key: HIVE-4688
 URL: https://issues.apache.org/jira/browse/HIVE-4688
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4688.1.patch


 VectorExpressionWriter throws NPE when writing null values.



[jira] [Resolved] (HIVE-4678) second clause of AND, OR filter not applied for vectorized execution

2013-06-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HIVE-4678.


Resolution: Fixed

The fix for this was included in the patch for HIVE-4667.

 second clause of AND, OR filter not applied for vectorized execution
 

 Key: HIVE-4678
 URL: https://issues.apache.org/jira/browse/HIVE-4678
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey



[jira] [Updated] (HIVE-4688) NPE in writing null values.

2013-06-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4688:
---

Attachment: HIVE-4688.2.patch

 NPE in writing null values.
 ---

 Key: HIVE-4688
 URL: https://issues.apache.org/jira/browse/HIVE-4688
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4688.1.patch, HIVE-4688.2.patch


 VectorExpressionWriter throws NPE when writing null values.



[jira] [Updated] (HIVE-4688) NPE in writing null values.

2013-06-08 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4688:
---

Attachment: HIVE-4688.3.patch

 NPE in writing null values.
 ---

 Key: HIVE-4688
 URL: https://issues.apache.org/jira/browse/HIVE-4688
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4688.1.patch, HIVE-4688.2.patch, HIVE-4688.3.patch


 VectorExpressionWriter throws NPE when writing null values.



[jira] [Assigned] (HIVE-4695) Unit test failure in TestColumnColumnOperationVectorExpressionEvaluation

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4695:
--

Assignee: Eric Hanson  (was: Jitendra Nath Pandey)

 Unit test failure in TestColumnColumnOperationVectorExpressionEvaluation
 

 Key: HIVE-4695
 URL: https://issues.apache.org/jira/browse/HIVE-4695
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Eric Hanson

 failure message="Output column vector repeating state does not match operand columns expected:<true> but was:<false>" type="junit.framework.AssertionFailedError"
 junit.framework.AssertionFailedError: Output column vector repeating state does not match operand columns expected:<true> but was:<false>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.TestColumnColumnOperationVectorExpressionEvaluation.testDoubleColModuloDoubleColumnOutNullsRepeatsC1NullsRepeats(TestColumnColumnOperationVectorExpressionEvaluation.java:5396)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)



[jira] [Created] (HIVE-4695) Unit test failure in TestColumnColumnOperationVectorExpressionEvaluation

2013-06-10 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4695:
--

 Summary: Unit test failure in 
TestColumnColumnOperationVectorExpressionEvaluation
 Key: HIVE-4695
 URL: https://issues.apache.org/jira/browse/HIVE-4695
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


failure message="Output column vector repeating state does not match operand columns expected:<true> but was:<false>" type="junit.framework.AssertionFailedError"
junit.framework.AssertionFailedError: Output column vector repeating state does not match operand columns expected:<true> but was:<false>
  at org.junit.Assert.fail(Assert.java:93)
  at org.junit.Assert.failNotEquals(Assert.java:647)
  at org.junit.Assert.assertEquals(Assert.java:128)
  at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.TestColumnColumnOperationVectorExpressionEvaluation.testDoubleColModuloDoubleColumnOutNullsRepeatsC1NullsRepeats(TestColumnColumnOperationVectorExpressionEvaluation.java:5396)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
  at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
  at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
  at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
  at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
  at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)




[jira] [Updated] (HIVE-4694) Fix ORC TestVectorizedORCReader testcase for Timestamps

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4694:
---

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-4160

 Fix ORC TestVectorizedORCReader testcase for Timestamps
 ---

 Key: HIVE-4694
 URL: https://issues.apache.org/jira/browse/HIVE-4694
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Affects Versions: vectorization-branch
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: vectorization-branch

 Attachments: HIVE-4694.patch


 ORC vectorized tests were not testing timestamps correctly.
 java.sql.Timestamp is a confusing API because of the mix of getTime() and 
 getNanos() usage. Though they might look like they return independent values, 
 getTime() already includes the millisecond part of the value held in getNanos().
 See the implementation for the source of the confusion:
 http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/sql/Timestamp.java#Timestamp.getTime%28%29
 The fix in HIVE-4681 caused test failures, so the test itself needs to be fixed.
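
The overlap between getTime() and getNanos() can be shown with a minimal sketch. The class and method names (TimestampParts, toEpochNanos) are hypothetical and are not taken from the HIVE-4694 patch; only the java.sql.Timestamp calls are JDK API. It assumes the goal is a single nanosecond count since the epoch.

```java
import java.sql.Timestamp;

// Hypothetical helper, not part of Hive: illustrates the getTime()/getNanos() overlap.
final class TimestampParts {

    /** Nanoseconds since the epoch, counting the fractional second exactly once. */
    static long toEpochNanos(Timestamp ts) {
        // getTime() is in milliseconds and already contains the millisecond
        // part of the fraction, so reduce it to whole seconds first.
        long wholeSeconds = Math.floorDiv(ts.getTime(), 1000L);
        // getNanos() carries the complete fractional second (0..999,999,999).
        return wholeSeconds * 1_000_000_000L + ts.getNanos();
    }

    /** The tempting but wrong combination: the milliseconds are counted twice. */
    static long naiveEpochNanos(Timestamp ts) {
        return ts.getTime() * 1_000_000L + ts.getNanos();
    }

    public static void main(String[] args) {
        Timestamp ts = new Timestamp(1000L); // exactly one second after the epoch
        ts.setNanos(123_456_789);            // fractional second: 0.123456789

        System.out.println(ts.getTime());        // 1123 (the 123 ms come from the nanos field)
        System.out.println(ts.getNanos());       // 123456789
        System.out.println(toEpochNanos(ts));    // 1123456789
        System.out.println(naiveEpochNanos(ts)); // 1246456789 (123 ms double-counted)
    }
}
```

A test that compares timestamps through both accessors would need to strip the milliseconds from one side, as toEpochNanos does, before combining them.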



[jira] [Assigned] (HIVE-4702) Unit test failure TestVectorSelectOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4702:
--

Assignee: Jitendra Nath Pandey

 Unit test failure TestVectorSelectOperator
 --

 Key: HIVE-4702
 URL: https://issues.apache.org/jira/browse/HIVE-4702
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey

 TestCase TestVectorSelectOperator
 Name  Status  Type  Time(s)
 testSelectOperator Error N/A 
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator$ValidatorVectorSelectOperator.forward(TestVectorSelectOperator.java:52)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:124)
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator.testSelectOperator(TestVectorSelectOperator.java:87)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
  



[jira] [Updated] (HIVE-4702) Unit test failure TestVectorSelectOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4702:
---

Attachment: HIVE-4702.1.patch

 Unit test failure TestVectorSelectOperator
 --

 Key: HIVE-4702
 URL: https://issues.apache.org/jira/browse/HIVE-4702
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4702.1.patch


 TestCase TestVectorSelectOperator
 Name  Status  Type  Time(s)
 testSelectOperator Error N/A 
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator$ValidatorVectorSelectOperator.forward(TestVectorSelectOperator.java:52)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:124)
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator.testSelectOperator(TestVectorSelectOperator.java:87)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
  



[jira] [Updated] (HIVE-4702) Unit test failure TestVectorSelectOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4702:
---

Status: Patch Available  (was: Open)

 Unit test failure TestVectorSelectOperator
 --

 Key: HIVE-4702
 URL: https://issues.apache.org/jira/browse/HIVE-4702
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4702.1.patch


 TestCase TestVectorSelectOperator
 Name  Status  Type  Time(s)
 testSelectOperator Error N/A 
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator$ValidatorVectorSelectOperator.forward(TestVectorSelectOperator.java:52)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:124)
   at org.apache.hadoop.hive.ql.exec.vector.TestVectorSelectOperator.testSelectOperator(TestVectorSelectOperator.java:87)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)
  



[jira] [Resolved] (HIVE-4596) Fix serialization exceptions in VectorGroupByOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HIVE-4596.


  Resolution: Fixed
Release Note: This was fixed with HIVE-4599. The exception was happening 
because the non-vector operators were not being cloned appropriately, which led 
to corruption of the original tree.

 Fix serialization exceptions in VectorGroupByOperator
 -

 Key: HIVE-4596
 URL: https://issues.apache.org/jira/browse/HIVE-4596
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor

 Going down the vectorization path
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...



[jira] [Updated] (HIVE-4596) Fix serialization exceptions in VectorGroupByOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4596:
---

Release Note:   (was: This was fixed with HIVE-4599. The exception was 
happening because the non-vector operators were not being cloned appropriately, 
which led to corruption of the original tree.)

 Fix serialization exceptions in VectorGroupByOperator
 -

 Key: HIVE-4596
 URL: https://issues.apache.org/jira/browse/HIVE-4596
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor

 Going down the vectorization path
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...



[jira] [Commented] (HIVE-4596) Fix serialization exceptions in VectorGroupByOperator

2013-06-10 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13680226#comment-13680226
 ] 

Jitendra Nath Pandey commented on HIVE-4596:



This was fixed with HIVE-4599. The exception was happening because the 
non-vector operators were not being cloned appropriately, which led to 
corruption of the original tree.
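
As a rough illustration of how an incomplete clone can corrupt the original tree, here is a minimal sketch. The Op class and the markVectorized pass are hypothetical stand-ins and do not reproduce Hive's Operator classes or the actual HIVE-4599 change. The assumption is that a rewrite pass mutates nodes of a copy that still shares child objects with the original.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: shallow versus deep copies of an operator tree.
final class OperatorCloneSketch {

    /** Stand-in for an operator node; not Hive's Operator class. */
    static final class Op {
        String name;
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        /** Copies only this node; the child objects are shared with the original. */
        Op shallowCopy() {
            Op copy = new Op(name);
            copy.children.addAll(children);
            return copy;
        }

        /** Copies this node and, recursively, every descendant. */
        Op deepCopy() {
            Op copy = new Op(name);
            for (Op child : children) {
                copy.children.add(child.deepCopy());
            }
            return copy;
        }
    }

    /** Builds the chain TS -> FIL -> SEL. */
    static Op chain() {
        Op ts = new Op("TS");
        Op fil = new Op("FIL");
        fil.children.add(new Op("SEL"));
        ts.children.add(fil);
        return ts;
    }

    /** Renames every node below the root in place, as a rewrite pass might. */
    static void markVectorized(Op root) {
        for (Op child : root.children) {
            child.name = "Vector" + child.name;
            markVectorized(child);
        }
    }

    public static void main(String[] args) {
        Op original = chain();
        markVectorized(original.shallowCopy());
        System.out.println(original.children.get(0).name); // VectorFIL: original was modified

        Op safe = chain();
        markVectorized(safe.deepCopy());
        System.out.println(safe.children.get(0).name);     // FIL: original is untouched
    }
}
```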


 Fix serialization exceptions in VectorGroupByOperator
 -

 Key: HIVE-4596
 URL: https://issues.apache.org/jira/browse/HIVE-4596
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor

 Going down the vectorization path
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator
 Continuing ...
 java.lang.Exception: XMLEncoder: discarding statement 
 ArrayList.add(VectorGroupByOperator);
 Continuing ...



[jira] [Assigned] (HIVE-4714) Vectorized Sum of scalar subtract column returns negative result when positive expected

2013-06-11 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4714:
--

Assignee: Jitendra Nath Pandey

 Vectorized Sum of scalar subtract column returns negative result when 
 positive expected
 --

 Key: HIVE-4714
 URL: https://issues.apache.org/jira/browse/HIVE-4714
 Project: Hive
  Issue Type: Sub-task
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Attachments: sum_data.zip


 Actual: -5701157.669591231
 Expected: 5701157.663489044
 {noformat}
 drop table LINEITEM_ORC;
 create external table LINEITEM_ORC(L_DISCOUNT float ) 
 ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
 STORED AS
 INPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat'
 OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
 {noformat}
 {noformat}
 SELECT Sum(1 - l_discount) FROM   Lineitem_orc
 {noformat}
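
The report does not state a root cause. One assumption that fits the sign flip is that the operands were applied in the wrong order, because column - scalar is the exact negation of scalar - column. The sketch below only illustrates that arithmetic; the class and method names are hypothetical and it is not Hive's vectorized expression code.

```java
// Hypothetical sketch: operand order in a scalar-minus-column sum.
final class ScalarSubtractSketch {

    /** Sum of (scalar - value) over the column, i.e. what Sum(1 - l_discount) asks for. */
    static double sumScalarMinusColumn(double scalar, double[] column) {
        double sum = 0.0;
        for (double value : column) {
            sum += scalar - value;
        }
        return sum;
    }

    /** Sum of (value - scalar): the result if the operands were swapped by mistake. */
    static double sumColumnMinusScalar(double[] column, double scalar) {
        double sum = 0.0;
        for (double value : column) {
            sum += value - scalar;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] discount = {0.25, 0.5, 0.0}; // values chosen to be exact in binary
        System.out.println(sumScalarMinusColumn(1.0, discount)); // 2.25
        System.out.println(sumColumnMinusScalar(discount, 1.0)); // -2.25
    }
}
```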



[jira] [Updated] (HIVE-4714) Vectorized Sum of scalar subtract column returns negative result when positive expected

2013-06-12 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4714:
---

Status: Patch Available  (was: Open)

 Vectorized Sum of scalar subtract column returns negative result when 
 positive expected
 --

 Key: HIVE-4714
 URL: https://issues.apache.org/jira/browse/HIVE-4714
 Project: Hive
  Issue Type: Sub-task
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4714.1.patch, sum_data.zip


 Actual: -5701157.669591231
 Expected: 5701157.663489044
 {noformat}
 drop table LINEITEM_ORC;
 create external table LINEITEM_ORC(L_DISCOUNT float ) 
 ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
 STORED AS
 INPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.CommonOrcInputFormat'
 OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
 {noformat}
 {noformat}
 SELECT Sum(1 - l_discount) FROM   Lineitem_orc
 {noformat}



[jira] [Created] (HIVE-4722) MIN on timestamp column gives incorrect result.

2013-06-12 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4722:
--

 Summary: MIN on timestamp column gives incorrect result.
 Key: HIVE-4722
 URL: https://issues.apache.org/jira/browse/HIVE-4722
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Gopal V


MIN on timestamp column gives incorrect result.



[jira] [Assigned] (HIVE-4718) array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 column GROUP BY

2013-06-12 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4718:
--

Assignee: Remus Rusanu

 array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 
 column GROUP BY
 -

 Key: HIVE-4718
 URL: https://issues.apache.org/jira/browse/HIVE-4718
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Remus Rusanu

 select ddate, disinternalmsft16431, count(*) from factsqlengineam_vec_orc 
 where (ddate = '2012-05-19 00:00:00' or ddate = '2012-05-20 00:00:00') and 
 (disinternalmsft16431 = 0 or disinternalmsft16431 = 5)
 group by ddate, disinternalmsft16431;
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
 at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapper.getBytes(VectorHashKeyWrapper.java:226)
 at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:528)
 at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flush(VectorGroupByOperator.java:293)
 at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:423)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196)
 ... 8 more
 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask



[jira] [Assigned] (HIVE-4716) Classcast exception with two group by keys of types string and tinyint.

2013-06-12 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4716:
--

Assignee: Remus Rusanu

 Classcast exception with two group by keys of types string and tinyint.
 ---

 Key: HIVE-4716
 URL: https://issues.apache.org/jira/browse/HIVE-4716
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Remus Rusanu

 Query:
select t,sum(i),s from orcsmall where s  aaa group by t, s;
 t : tinyint
 i : int
 s : string
 Exception:
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing
 at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.map(VectorExecMapper.java:164)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
 at org.apache.hadoop.mapred.Child.main(Child.java:170)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:752)
 at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.map(VectorExecMapper.java:146)
 ... 4 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
 at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:151)
 at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:145)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
 at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:120)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
 at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:91)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
 at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:745)
 ... 5 more



[jira] [Assigned] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4744:
--

Assignee: Jitendra Nath Pandey

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB.
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
  

[jira] [Assigned] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoo

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4745:
--

Assignee: Remus Rusanu

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Remus Rusanu
 Fix For: vectorization-branch


 {noformat}
 SELECT SUM(L_QUANTITY),
(SUM(L_QUANTITY) + -1.3000E+000),
(-2.2002E+000 % (SUM(L_QUANTITY) + 
 -1.3000E+000)),
MIN(L_EXTENDEDPRICE)
 FROM   lineitem_orc
 WHERE  ((L_EXTENDEDPRICE = L_LINENUMBER)
 OR (L_TAX  L_EXTENDEDPRICE));
 {noformat}
 Executed over the TPC-H lineitem table with a scale factor of 1 GB.
 {noformat}
 13/06/15 11:19:17 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_5292@SLAVE23-WIN_201306151119_1652846565.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Starting Job = job_201306142329_0098, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306142329_0098
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill job_201306142329_0098
 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
 2013-06-15 11:19:47,490 Stage-1 map = 0%,  reduce = 0%
 2013-06-15 11:20:29,801 Stage-1 map = 76%,  reduce = 0%
 2013-06-15 11:20:32,849 Stage-1 map = 0%,  reduce = 0%
 2013-06-15 11:20:35,880 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306142329_0098 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201306142329_0098
 Examining task ID: task_201306142329_0098_m_02 (and more) from job job_201306142329_0098
 Task with the most failures(4): 
 -
 Task ID:
   task_201306142329_0098_m_00
 URL:
   http://localhost:50030/taskdetails.jsp?jobid=job_201306142329_0098&tipid=task_201306142329_0098_m_00
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
   at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableDoubleObjectInspector.get(WritableDoubleObjectInspector.java:35)
   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:340)
   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257)
   at org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204)
   at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:245)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flush(VectorGroupByOperator.java:281)
   at 
 

[jira] [Created] (HIVE-4754) OrcInputFormat should be enhanced to provide vectorized input.

2013-06-18 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4754:
--

 Summary: OrcInputFormat should be enhanced to provide vectorized 
input.
 Key: HIVE-4754
 URL: https://issues.apache.org/jira/browse/HIVE-4754
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


This change makes CommonOrcInputFormat redundant: OrcInputFormat will again 
become the default input format for ORC files. The reason for the change is 
to allow existing ORC files to work with the vectorized code path.
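
As a rough illustration of why a single input format can serve both code paths, the sketch below reads the same stored column either one value at a time or in batches, selected by a flag. It is not the HIVE-4754 patch: OrcInputSketch, Reader, open, countCalls, and the batch size of 1024 are hypothetical names and values chosen for this example, and no real ORC or Hadoop API is used.

```java
import java.util.Arrays;

// Hypothetical sketch: one "input format" that can hand out either a
// row-at-a-time reader or a batch (vectorized) reader over the same data.
class OrcInputSketch {
    // Assumed batch size for the example only.
    static final int BATCH_SIZE = 1024;

    // Hypothetical reader abstraction; next() returns null when exhausted.
    interface Reader {
        long[] next();
    }

    // Opens a reader over an in-memory column that stands in for a stored file.
    // The stored data is identical in both modes; only the step size differs.
    static Reader open(final long[] column, boolean vectorized) {
        final int step = vectorized ? BATCH_SIZE : 1;
        return new Reader() {
            private int pos = 0;

            @Override
            public long[] next() {
                if (pos >= column.length) {
                    return null;
                }
                int end = Math.min(pos + step, column.length);
                long[] out = Arrays.copyOfRange(column, pos, end);
                pos = end;
                return out;
            }
        };
    }

    // Counts how many next() calls return data before the reader is exhausted.
    static int countCalls(Reader reader) {
        int calls = 0;
        while (reader.next() != null) {
            calls++;
        }
        return calls;
    }
}
```

With a 2500-value column, countCalls(open(column, true)) returns 3 and countCalls(open(column, false)) returns 2500, while the underlying data is the same in both cases.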

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4718) array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 column GROUP BY

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13687477#comment-13687477
 ] 

Jitendra Nath Pandey commented on HIVE-4718:


I found another instance of this issue:


Query : select s, i, max(b) from orctabwithnulls group by s, i;

Table:
  CREATE TABLE orctabwithnulls (
    t tinyint,
    si smallint,
    i int,
    b bigint,
    f float,
    d double,
    bo boolean,
    s string) STORED AS ORC

Exception:
  
java.lang.RuntimeException: Hive Runtime Error while closing operators
at 
org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at 
org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapper.getBytes(VectorHashKeyWrapper.java:226)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:528)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flush(VectorGroupByOperator.java:293)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:423)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196)
... 8 more


 array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 
 column GROUP BY
 -

 Key: HIVE-4718
 URL: https://issues.apache.org/jira/browse/HIVE-4718
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Remus Rusanu

 select ddate, disinternalmsft16431, count(*) from factsqlengineam_vec_orc 
 where (ddate = '2012-05-19 00:00:00' or ddate = '2012-05-20 00:00:00') and 
 (disinternalmsft16431 = 0 or disinternalmsft16431 = 5)
 group by ddate, disinternalmsft16431;
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapper.getBytes(VectorHashKeyWrapper.java:226)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:528)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flush(VectorGroupByOperator.java:293)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:423)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196)
 ... 8 more
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask


[jira] [Reopened] (HIVE-4718) array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 column GROUP BY

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reopened HIVE-4718:



 array out of bounds exception near VectorHashKeyWrapper.getBytes() with 2 
 column GROUP BY
 -

 Key: HIVE-4718
 URL: https://issues.apache.org/jira/browse/HIVE-4718
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Remus Rusanu

 select ddate, disinternalmsft16431, count(*) from factsqlengineam_vec_orc 
 where (ddate = '2012-05-19 00:00:00' or ddate = '2012-05-20 00:00:00') and 
 (disinternalmsft16431 = 0 or disinternalmsft16431 = 5)
 group by ddate, disinternalmsft16431;
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
 at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapper.getBytes(VectorHashKeyWrapper.java:226)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.getWritableKeyValue(VectorHashKeyWrapperBatch.java:528)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.flush(VectorGroupByOperator.java:293)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.closeOp(VectorGroupByOperator.java:423)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:196)
 ... 8 more
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask



[jira] [Updated] (HIVE-4754) OrcInputFormat should be enhanced to provide vectorized input.

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4754:
---

Attachment: HIVE-4754.1.patch

 OrcInputFormat should be enhanced to provide vectorized input.
 --

 Key: HIVE-4754
 URL: https://issues.apache.org/jira/browse/HIVE-4754
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4754.1.patch


 This change makes CommonOrcInputFormat redundant: OrcInputFormat will again 
 become the default input format for ORC files. The reason for the change is 
 to allow existing ORC files to work with the vectorized code path.



[jira] [Updated] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4744:
---

Status: Patch Available  (was: Open)

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4744.1.patch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=number
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at 

[jira] [Updated] (HIVE-4754) OrcInputFormat should be enhanced to provide vectorized input.

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4754:
---

Status: Patch Available  (was: Open)

 OrcInputFormat should be enhanced to provide vectorized input.
 --

 Key: HIVE-4754
 URL: https://issues.apache.org/jira/browse/HIVE-4754
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4754.1.patch


 This change makes CommonOrcInputFormat redundant: OrcInputFormat will again 
 become the default input format for ORC files. The reason for the change is 
 to allow existing ORC files to work with the vectorized code path.



[jira] [Updated] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4744:
---

Attachment: HIVE-4744.1.patch

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4744.1.patch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=number
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at 

[jira] [Updated] (HIVE-4758) NULLs and record separators broken with vectorization branch intermediate outputs

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4758:
---

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-4160

 NULLs and record separators broken with vectorization branch intermediate 
 outputs
 -

 Key: HIVE-4758
 URL: https://issues.apache.org/jira/browse/HIVE-4758
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-4758-001.patch


 Queries on timestamp columns of partitioned tables return NULL for every row 
 of the column if the first row in the column is NULL.
 This was tracked down to the timestamp columns failing to parse the map 
 output, whose format differs from the unvectorized code's output.
 The output file for the vectorized code contains 
 {code}
 (null)^A
 2013-02-12 21:05:29^A
 {code}
 whereas the unvectorized code outputs
 {code}
 \N
 2013-02-12 21:05:29
 {code}
 The vectorized code passes the (null) string on to the LazyTimestamp 
 parser, which fails to parse it and returns NULL, but is slowed down 
 massively by the IllegalArgumentException.
 The extraneous ^A also prevents the actual timestamp values from being 
 parsed into valid timestamps.
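
The difference between the two NULL encodings can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Hive's LazyTimestamp code: NullFieldSketch, parse, and failedParses are hypothetical names, and java.sql.Timestamp.valueOf stands in for the real parser. A field equal to the \N sentinel is recognized without any parsing, while a field such as (null)^A has to go through the parser and pays for an IllegalArgumentException before it ends up as NULL.

```java
import java.sql.Timestamp;

// Hypothetical sketch of text-field decoding for a timestamp column.
class NullFieldSketch {
    // The two-character sequence backslash + N that marks NULL in text output.
    static final String NULL_SENTINEL = "\\N";

    // Number of fields that reached the parser and were rejected.
    int failedParses = 0;

    // Returns the parsed timestamp, or null for NULL or unparseable fields.
    Timestamp parse(String field) {
        if (NULL_SENTINEL.equals(field)) {
            // Cheap path: the sentinel is recognized by a string comparison.
            return null;
        }
        try {
            return Timestamp.valueOf(field);
        } catch (IllegalArgumentException e) {
            // Expensive path: an exception is created and unwound per bad field.
            failedParses++;
            return null;
        }
    }
}
```

Calling parse("\\N") returns null and leaves failedParses at 0, while parse("(null)\u0001") also returns null but increments failedParses, which is the per-row cost the description refers to. The sketch deliberately does not model how a trailing ^A affects otherwise valid values, since that depends on the real parser.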



[jira] [Updated] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4744:
---

Attachment: HIVE-4744.2.patch

Updated patch with unit tests.

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4744.1.patch, HIVE-4744.2.patch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=number
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 

[jira] [Updated] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4744:
---

Attachment: HIVE-4744.3.patch

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4744.1.patch, HIVE-4744.2.patch, HIVE-4744.3.patch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=number
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 

[jira] [Commented] (HIVE-4744) Unary Minus Expression Throwing java.lang.NullPointerException

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688550#comment-13688550
 ] 

Jitendra Nath Pandey commented on HIVE-4744:


Updated the patch to remove the commented-out code.

 Unary Minus Expression Throwing java.lang.NullPointerException
 --

 Key: HIVE-4744
 URL: https://issues.apache.org/jira/browse/HIVE-4744
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4744.1.patch, HIVE-4744.2.patch, HIVE-4744.3.patch


 {noformat}
 SELECT   L_QUANTITY,
  L_RETURNFLAG,
  (L_QUANTITY * -2),
  (L_QUANTITY % L_SUPPKEY),
  (-(L_TAX))
 FROM lineitem_orc
 WHERE((L_QUANTITY  L_TAX)
   OR (L_TAX  L_ORDERKEY))
 ORDER BY L_QUANTITY;
 {noformat}
 Executed over the TPC-H lineitem table generated at a scale factor of 1 GB
 {noformat}
 13/06/15 03:27:21 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_4280@SLAVE23-WIN_201306150327_1960387810.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getUnaryMinusExpression(VectorizationContext.java:327)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:440)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:397)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:248)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.initializeOp(VectorSelectOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.initializeOp(VectorFilterOperator.java:76)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:187)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorOperator(ExecDriver.java:580)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.validateVectorPath(ExecDriver.java:568)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:145)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 

[jira] [Updated] (HIVE-4754) OrcInputFormat should be enhanced to provide vectorized input.

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4754:
---

Attachment: HIVE-4754.2.patch

 OrcInputFormat should be enhanced to provide vectorized input.
 --

 Key: HIVE-4754
 URL: https://issues.apache.org/jira/browse/HIVE-4754
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4754.1.patch, HIVE-4754.2.patch


 This change will make CommonOrcInputFormat redundant. OrcInputFormat will 
 again become the default input format for ORC files. The reason for this 
 change is to allow existing ORC files to work with the vectorized code path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4754) OrcInputFormat should be enhanced to provide vectorized input.

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688574#comment-13688574
 ] 

Jitendra Nath Pandey commented on HIVE-4754:


Removed CommonOrcInputFormat in the latest patch.

 OrcInputFormat should be enhanced to provide vectorized input.
 --

 Key: HIVE-4754
 URL: https://issues.apache.org/jira/browse/HIVE-4754
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4754.1.patch, HIVE-4754.2.patch


 This change will make CommonOrcInputFormat redundant. OrcInputFormat will 
 again become the default input format for ORC files. The reason for this 
 change is to allow existing ORC files to work with the vectorized code path.



[jira] [Updated] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4745:
---

Attachment: HIVE-4745.2.patch

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4745.2.patch


 {noformat}
 SELECT SUM(L_QUANTITY),
(SUM(L_QUANTITY) + -1.3000E+000),
(-2.2002E+000 % (SUM(L_QUANTITY) + 
 -1.3000E+000)),
MIN(L_EXTENDEDPRICE)
 FROM   lineitem_orc
 WHERE  ((L_EXTENDEDPRICE = L_LINENUMBER)
 OR (L_TAX  L_EXTENDEDPRICE));
 {noformat}
 Executed over the TPC-H lineitem table with a scale factor of 1 GB
 {noformat}
 13/06/15 11:19:17 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Logging initialized using configuration in 
 file:/C:/Hadoop/hive-0.9.0/conf/hive-log4j.properties
 Hive history 
 file=c:\hadoop\hive-0.9.0\logs\history/hive_job_log_jenkinsuser_5292@SLAVE23-WIN_201306151119_1652846565.txt
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
   set hive.exec.reducers.bytes.per.reducer=<number>
 In order to limit the maximum number of reducers:
   set hive.exec.reducers.max=<number>
 In order to set a constant number of reducers:
   set mapred.reduce.tasks=<number>
 Starting Job = job_201306142329_0098, Tracking URL = 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306142329_0098
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill 
 job_201306142329_0098
 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
 1
 2013-06-15 11:19:47,490 Stage-1 map = 0%,  reduce = 0%
 2013-06-15 11:20:29,801 Stage-1 map = 76%,  reduce = 0%
 2013-06-15 11:20:32,849 Stage-1 map = 0%,  reduce = 0%
 2013-06-15 11:20:35,880 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201306142329_0098 with errors
 Error during job, obtaining debugging information...
 Job Tracking URL: 
 http://localhost:50030/jobdetails.jsp?jobid=job_201306142329_0098
 Examining task ID: task_201306142329_0098_m_02 (and more) from job 
 job_201306142329_0098
 Task with the most failures(4): 
 -
 Task ID:
   task_201306142329_0098_m_00
 URL:
   
 http://localhost:50030/taskdetails.jsp?jobid=job_201306142329_0098&tipid=task_201306142329_0098_m_00
 -
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorExecMapper.close(VectorExecMapper.java:229)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable 
 cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableDoubleObjectInspector.get(WritableDoubleObjectInspector.java:35)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:340)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:257)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:204)
   at 
 org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:245)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at 
 

[jira] [Updated] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4745:
---

Attachment: (was: HIVE-4754.2.patch)

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4745.2.patch



[jira] [Assigned] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoo

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4745:
--

Assignee: Jitendra Nath Pandey  (was: Remus Rusanu)

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4745.2.patch



[jira] [Updated] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4745:
---

Attachment: HIVE-4754.2.patch

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4745.2.patch



[jira] [Commented] (HIVE-4745) java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hado

2013-06-19 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688720#comment-13688720
 ] 

Jitendra Nath Pandey commented on HIVE-4745:


This patch effectively reverts the HIVE-4688 change. The NPE is fixed in 
VectorizedRowBatch by HIVE-4758.
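
For readers unfamiliar with the failure mode in the issue title, here is a 
minimal, self-contained sketch of how a null-marker object reaching an 
unchecked cast produces this kind of ClassCastException. The types below are 
hypothetical stand-ins named after the Hadoop classes in the trace, not the 
real ones, and the guarded variant is only an illustration. It is not what 
the patch does (the patch reverts the HIVE-4688 change).

```java
// Hypothetical stand-ins for the writable types named in the stack trace.
interface Writable {}

final class NullWritable implements Writable {
    static final NullWritable INSTANCE = new NullWritable();
    private NullWritable() {}
}

final class DoubleWritable implements Writable {
    final double value;
    DoubleWritable(double value) { this.value = value; }
}

// Illustrative only: not the real WritableDoubleObjectInspector.
final class DoubleInspectorSketch {
    // Unchecked cast: throws ClassCastException when handed the null marker.
    static double getUnchecked(Object o) {
        return ((DoubleWritable) o).value;
    }

    // Guarded variant: maps a Java null or the null marker to a null result.
    static Double getOrNull(Object o) {
        if (o == null || o instanceof NullWritable) {
            return null;
        }
        return ((DoubleWritable) o).value;
    }
}
```

Calling getUnchecked(NullWritable.INSTANCE) throws the same exception type as 
in the trace, while getOrNull returns null for that input.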

 java.lang.RuntimeException: Hive Runtime Error while closing operators: 
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
 -

 Key: HIVE-4745
 URL: https://issues.apache.org/jira/browse/HIVE-4745
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4745.2.patch



[jira] [Commented] (HIVE-4770) java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row

2013-06-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690562#comment-13690562
 ] 

Jitendra Nath Pandey commented on HIVE-4770:


Judging from the exception trace, the query did not take the vectorized code 
path (the trace goes through ExecMapper rather than VectorExecMapper). I think 
this is because LIKE expression support has not been committed yet. 
Does the query also fail when vectorization is disabled? If not, we have a 
bug in the validation for vectorization.
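
To make the idea of "validation for vectorization" concrete, here is a 
rough sketch of a check that walks an expression tree and falls back to 
row-mode execution when any operator has no vectorized implementation. All 
names (Expr, VectorizationValidatorSketch, the SUPPORTED set) are 
hypothetical and chosen for illustration. This is not Hive's actual 
validation code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical expression node: an operator name plus child expressions.
final class Expr {
    final String op;
    final List<Expr> children;

    Expr(String op, Expr... children) {
        this.op = op;
        this.children = Arrays.asList(children);
    }
}

final class VectorizationValidatorSketch {
    // Assumed set of operators with vectorized implementations.
    // LIKE is deliberately absent, mirroring the situation in the comment.
    private static final Set<String> SUPPORTED =
        new HashSet<>(Arrays.asList("COLUMN", "CONSTANT", "=", "OR", "+", "NEGATE"));

    // Returns true only if every node in the tree is supported.
    static boolean canVectorize(Expr e) {
        if (!SUPPORTED.contains(e.op)) {
            return false;
        }
        for (Expr child : e.children) {
            if (!canVectorize(child)) {
                return false;
            }
        }
        return true;
    }

    // Picks the execution path for the whole expression.
    static String choosePath(Expr e) {
        return canVectorize(e) ? "vectorized" : "row-mode";
    }
}
```

Under these assumptions, a filter such as (1 = cfloat) OR (cstring2 LIKE '%b') 
is routed to the row-mode path because of the LIKE node, whereas the same 
filter without LIKE would be vectorized.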

 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row 
 --

 Key: HIVE-4770
 URL: https://issues.apache.org/jira/browse/HIVE-4770
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Tony Murphy
 Fix For: vectorization-branch

 Attachments: output.txt, tableAndData.zip


 Table and data attached.
 {noformat}
 SELECT   cfloat,
  csmallint,
  cint,
  ctimestamp,
  (cfloat + 10),
  STDDEV_SAMP(cfloat),
  (-((cfloat + 10))),
  (cint / cfloat),
  MAX(cint),
  (-(cint)),
  (cint * STDDEV_SAMP(cfloat)),
  STDDEV_SAMP(cint),
  VAR_SAMP(cint),
  (-(MAX(cint))),
  ((-(MAX(cint))) / 0.E+000)
 FROM alltypes_orc
 WHERE(((1 = cfloat)
OR (cstring2 LIKE '%b'))
   OR ((cint = csmallint)
   OR (cstring2 LIKE '%ss')))
 GROUP BY cfloat, csmallint, cint, ctimestamp
 ORDER BY cint, cfloat;
 {noformat}
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {ctinyint:null,csmallint:-3806,cint:-66533315,cbigint:null,cdouble:null,cfloat:152.95706,cstring1:null,cstring2:null,ctimestamp:9131-01-01 16:52:03.53,cboolean:null}
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
   at org.apache.hadoop.mapred.Child.main(Child.java:265)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {ctinyint:null,csmallint:-3806,cint:-66533315,cbigint:null,cdouble:null,cfloat:152.95706,cstring1:null,cstring2:null,ctimestamp:9131-01-01 16:52:03.53,cboolean:null}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:796)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:136)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:502)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:832)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:652)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.shouldBeFlushed(GroupByOperator.java:941)
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:836)
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:723)
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:791)
   ... 21 more
 {noformat}


[jira] [Commented] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-01 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697207#comment-13697207
 ] 

Jitendra Nath Pandey commented on HIVE-4684:


 All the expressions generated in getVectorBinaryComparisonFilterExpression are 
filter expressions, so we don't need to check the opType in this method. 
Boolean expressions outside the WHERE clause, e.g. in projections, are not 
handled right now. That should be addressed separately in a different 
jira.

 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Sarvesh Sakalanaga
 Attachments: Hive-4684.0.patch


 select dmachineid from factsqlengineam_vec_orc where 1073 = dmachineid + 1;
 Does not go down the vectorization path.
 Output:
 hive> select dmachineid from factsqlengineam_vec_orc where 1073 = dmachineid + 1;
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 Validating if vectorized execution is applicable
 Cannot vectorize the plan: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterLongScalarEqualLongColumn
 Starting Job = job_201306061504_0038, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0038
 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job  -kill job_201306061504_0038
 Hadoop job information for Stage-1: number of mappers: 8; number of reducers:  0
 2013-06-07 10:25:30,932 Stage-1 map = 0%,  reduce = 0%
 2013-06-07 10:25:39,953 Stage-1 map = 25%,  reduce = 0%
 2013-06-07 10:25:42,959 Stage-1 map = 49%,  reduce = 0%, Cumulative CPU 8.172 sec
 2013-06-07 10:25:43,962 Stage-1 map = 49%,  reduce = 0%, Cumulative CPU 8.172 sec
 ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4160) Vectorized Query Execution in Hive

2013-07-11 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706089#comment-13706089
 ] 

Jitendra Nath Pandey commented on HIVE-4160:


Dmitry, Vinod
  A significant amount of the vectorization work is in expression evaluation, 
for example arithmetic expressions, logical expressions, and aggregations. 
Many of these expressions are fairly generic, and different systems are likely 
to have similar semantics for them. It should be possible to re-use this code 
with little change in Pig or other systems. Re-using these expressions requires 
the same vectorized representation of data in the processing engine, but that 
part of the code is also generic and re-usable. I think that could be a good 
starting point.
  However, a large part of the vectorization work is in operator code, where we 
have vectorized versions of the Hive operators. These operators are closely tied 
to Hive semantics and implementation. Therefore, some restructuring of the Hive 
code base will also be needed to generalize these operators for re-use in other 
projects. Also, at this point we should be thinking more generally about a 
common physical layer shared between Pig and Hive. These languages can continue 
to have different logical plans, but it would be desirable for them to share a 
common physical plan structure because they both use the same MapReduce runtime.

 Vectorized Query Execution in Hive
 --

 Key: HIVE-4160
 URL: https://issues.apache.org/jira/browse/HIVE-4160
 Project: Hive
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: Hive-Vectorized-Query-Execution-Design.docx, 
 Hive-Vectorized-Query-Execution-Design-rev2.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.docx, 
 Hive-Vectorized-Query-Execution-Design-rev3.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev4.docx, 
 Hive-Vectorized-Query-Execution-Design-rev4.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev5.docx, 
 Hive-Vectorized-Query-Execution-Design-rev5.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev6.docx, 
 Hive-Vectorized-Query-Execution-Design-rev6.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev7.docx, 
 Hive-Vectorized-Query-Execution-Design-rev8.docx, 
 Hive-Vectorized-Query-Execution-Design-rev8.pdf, 
 Hive-Vectorized-Query-Execution-Design-rev9.docx, 
 Hive-Vectorized-Query-Execution-Design-rev9.pdf


 The Hive query execution engine currently processes one row at a time. A 
 single row of data goes through all the operators before the next row can be 
 processed. This mode of processing is very inefficient in terms of CPU usage. 
 Research has demonstrated that this yields very low instructions per cycle 
 [MonetDB X100]. Also currently Hive heavily relies on lazy deserialization 
 and data columns go through a layer of object inspectors that identify column 
 type, deserialize data and determine appropriate expression routines in the 
 inner loop. These layers of virtual method calls further slow down the 
 processing. 
 This work will add support for vectorized query execution to Hive, where, 
 instead of individual rows, batches of about a thousand rows at a time are 
 processed. Each column in the batch is represented as a vector of a primitive 
 data type. The inner loop of execution scans these vectors very fast, 
 avoiding method calls, deserialization, unnecessary if-then-else, etc. This 
 substantially reduces CPU time used, and gives excellent instructions per 
 cycle (i.e. improved processor pipeline utilization). See the attached design 
 specification for more details.
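
The batch-at-a-time model described above can be sketched roughly as follows. This is a minimal illustration only, not the design from the attached specification: the names BatchFilterSketch, Batch, selected and filterGreaterThan are hypothetical, and a single long column stands in for a full row batch.

```java
import java.util.Arrays;

final class BatchFilterSketch {
    /** Hypothetical batch: one primitive array per column plus a selection vector. */
    static final class Batch {
        final long[] column;   // values of a single long column
        final int[] selected;  // indices of the rows that are still live
        int size;              // number of live rows

        Batch(long[] column) {
            this.column = column;
            this.selected = new int[column.length];
            for (int i = 0; i < column.length; i++) {
                selected[i] = i;
            }
            this.size = column.length;
        }
    }

    /** Keeps only the rows whose value is greater than the threshold. */
    static void filterGreaterThan(Batch batch, long threshold) {
        int kept = 0;
        // Tight loop over a primitive array: no per-row object inspection,
        // no virtual calls, no allocation.
        for (int j = 0; j < batch.size; j++) {
            int row = batch.selected[j];
            if (batch.column[row] > threshold) {
                batch.selected[kept++] = row;
            }
        }
        batch.size = kept;
    }

    public static void main(String[] args) {
        Batch batch = new Batch(new long[] {3, 8, 1, 9});
        filterGreaterThan(batch, 5);
        // prints [1, 3]
        System.out.println(Arrays.toString(Arrays.copyOf(batch.selected, batch.size)));
    }
}
```

The filter narrows the selection vector in place instead of copying surviving rows, so later operators in this sketch would only visit the indices that are still selected.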



[jira] [Created] (HIVE-4859) String column comparison classes should be renamed.

2013-07-15 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4859:
--

 Summary: String column comparison classes should be renamed.
 Key: HIVE-4859
 URL: https://issues.apache.org/jira/browse/HIVE-4859
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


FilterStringColEqualStringCol should be renamed to 
FilterStringColEqualStringColumn. Similarly, all string comparison classes 
should be renamed.



[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.

2013-07-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4859:
---

Attachment: HIVE-4859.1.patch

 String column comparison classes should be renamed.
 ---

 Key: HIVE-4859
 URL: https://issues.apache.org/jira/browse/HIVE-4859
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4859.1.patch


 FilterStringColEqualStringCol should be renamed to 
 FilterStringColEqualStringColumn. Similarly, all string comparison classes 
 should be renamed.



[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.

2013-07-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4859:
---

Status: Patch Available  (was: Open)

 String column comparison classes should be renamed.
 ---

 Key: HIVE-4859
 URL: https://issues.apache.org/jira/browse/HIVE-4859
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4859.1.patch


 FilterStringColEqualStringCol should be renamed to 
 FilterStringColEqualStringColumn. Similarly, all string comparison classes 
 should be renamed.



[jira] [Commented] (HIVE-4859) String column comparison classes should be renamed.

2013-07-15 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708996#comment-13708996
 ] 

Jitendra Nath Pandey commented on HIVE-4859:


Patch uploaded.
https://reviews.apache.org/r/12560/

 String column comparison classes should be renamed.
 ---

 Key: HIVE-4859
 URL: https://issues.apache.org/jira/browse/HIVE-4859
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4859.1.patch


 FilterStringColEqualStringCol should be renamed to 
 FilterStringColEqualStringColumn. Similarly, all string comparison classes 
 should be renamed.



[jira] [Updated] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4684:
---

Attachment: HIVE-4684.1.patch

 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Sarvesh Sakalanaga
 Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch





[jira] [Assigned] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HIVE-4684:
--

Assignee: Jitendra Nath Pandey  (was: Sarvesh Sakalanaga)

 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch, 
 HIVE-4684.2.patch





[jira] [Updated] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4684:
---

Attachment: HIVE-4684.2.patch

 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch, 
 HIVE-4684.2.patch





[jira] [Commented] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13712999#comment-13712999
 ] 

Jitendra Nath Pandey commented on HIVE-4684:


There are two issues here:
1) If the left expression is a constant and the right expression is a generic 
function, the query doesn't vectorize because the corresponding vector 
expressions are missing.
2) If the left expression is a constant and the right is a column expression, 
the query vectorizes to an incorrect expression with the column on the left, 
which won't work for non-commutative expressions.

The latest patch adds the missing expressions, which addresses (1), and a 
one-line fix in VectorizationContext that fixes (2).
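
To illustrate why operand order matters in issue (2), here is a small sketch. It is not the VectorizationContext fix from the patch; OperandOrderSketch, Op, mirrored, scalarOpColumn and columnOpScalar are hypothetical names used only to show that swapping the operands of a non-commutative comparison changes the result unless the operator is mirrored as well.

```java
import java.util.Arrays;

final class OperandOrderSketch {
    /** Hypothetical comparison operators; these are not Hive types. */
    enum Op {
        EQUAL, LESS, GREATER;

        boolean test(long left, long right) {
            switch (this) {
                case EQUAL: return left == right;
                case LESS: return left < right;
                default: return left > right;
            }
        }

        /** The operator that keeps the meaning when the operands are swapped. */
        Op mirrored() {
            switch (this) {
                case LESS: return GREATER;
                case GREATER: return LESS;
                default: return EQUAL;
            }
        }
    }

    /** Evaluates "scalar OP column[i]" for every row. */
    static boolean[] scalarOpColumn(long scalar, Op op, long[] column) {
        boolean[] out = new boolean[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = op.test(scalar, column[i]);
        }
        return out;
    }

    /** Evaluates "column[i] OP scalar" for every row. */
    static boolean[] columnOpScalar(long[] column, Op op, long scalar) {
        boolean[] out = new boolean[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = op.test(column[i], scalar);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] col = {5, 10, 15};
        // 10 < col: the intended filter, prints [false, false, true]
        System.out.println(Arrays.toString(scalarOpColumn(10, Op.LESS, col)));
        // col < 10: operands swapped naively, prints [true, false, false]
        System.out.println(Arrays.toString(columnOpScalar(col, Op.LESS, 10)));
        // col > 10: swapped with the operator mirrored, prints [false, false, true]
        System.out.println(Arrays.toString(columnOpScalar(col, Op.LESS.mirrored(), 10)));
    }
}
```

For EQUAL the naive swap happens to give the right answer, which is why a query such as the one in this issue can hide the problem until a non-commutative operator is used.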


 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch, 
 HIVE-4684.2.patch





[jira] [Updated] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize

2013-07-22 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4684:
---

Attachment: HIVE-4684.3.patch

 Query with filter constant on left of = and column expression on right does 
 not vectorize
 ---

 Key: HIVE-4684
 URL: https://issues.apache.org/jira/browse/HIVE-4684
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch, 
 HIVE-4684.2.patch, HIVE-4684.3.patch





[jira] [Commented] (HIVE-4822) implement vectorized math functions

2013-07-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718932#comment-13718932
 ] 

Jitendra Nath Pandey commented on HIVE-4822:


bq. How does explain work with the vectorization engine?
  'explain' continues to work as before and returns the same plan as in 
non-vector mode.
Vectorization executes exactly the same query plan; only the implementation of 
the operators and expressions has been changed to run in vectorized fashion.
  However, we do plan to enhance 'explain' to also show which operators will be 
executed in vectorized mode. We will start working on it very soon and file a 
jira.

  In the current implementation, we don't need the 'explain' annotations on 
vectorized UDFs, because the vectorized UDFs are used only at run time. In the 
query planning stage only row-mode UDFs are used; at query execution time, if 
vectorization is possible, we switch to the corresponding vectorized UDFs. We 
adopted this approach to avoid any changes to the query planner for vectorization.

bq. Could we somehow hybrid some of our existing UDFS to work from both engines?
  We will surely have to support the hybrid approach you are suggesting 
for UDFs that users have implemented, even though we will recommend that users 
re-implement their UDFs in vectorized fashion. However, for built-in Hive UDFs, 
it will almost always be better to have a vectorized implementation for 
performance. Eventually, we do want to have vectorized implementations for all 
built-in UDFs.
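
A rough sketch of what such a run-time switch with a hybrid fallback could look like is given below. It is purely illustrative and makes no claim about how Hive implements this: UdfDispatchSketch, VectorUdf, VECTORIZED, adapt and resolve are hypothetical names, and a plain DoubleUnaryOperator stands in for a row-mode UDF.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

final class UdfDispatchSketch {
    /** Hypothetical vectorized UDF: processes n values per call. */
    interface VectorUdf {
        void evaluate(double[] input, double[] output, int n);
    }

    /** Hypothetical registry of hand-written vectorized implementations, keyed by name. */
    static final Map<String, VectorUdf> VECTORIZED = new HashMap<>();
    static {
        VECTORIZED.put("sqrt", (input, output, n) -> {
            for (int i = 0; i < n; i++) {
                output[i] = Math.sqrt(input[i]);
            }
        });
    }

    /** Hybrid fallback: drives a row-mode function over a whole batch, one value at a time. */
    static VectorUdf adapt(DoubleUnaryOperator rowUdf) {
        return (input, output, n) -> {
            for (int i = 0; i < n; i++) {
                output[i] = rowUdf.applyAsDouble(input[i]);
            }
        };
    }

    /** Resolved at execution time; the plan itself only names the row-mode function. */
    static VectorUdf resolve(String name, DoubleUnaryOperator rowUdf) {
        VectorUdf vectorized = VECTORIZED.get(name);
        return vectorized != null ? vectorized : adapt(rowUdf);
    }

    public static void main(String[] args) {
        double[] out = new double[2];
        // A registered function uses its vectorized implementation.
        resolve("sqrt", Math::sqrt).evaluate(new double[] {4.0, 9.0}, out, 2);
        System.out.println(out[0] + " " + out[1]);
        // An unregistered user function falls back to the row-mode adapter.
        resolve("user_double", x -> x * 2).evaluate(new double[] {4.0, 9.0}, out, 2);
        System.out.println(out[0] + " " + out[1]);
    }
}
```

The adapter keeps user UDFs working inside a vectorized pipeline, at the cost of one call per value, which is why a dedicated vectorized implementation is still preferable for built-in functions.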

bq. Are we sure that functions that operate on doubles and floats are going to 
round exactly the same way? 
  We use the same underlying Java libraries, so our results should 
match. In our testing we compare the results with the non-vector results to make 
sure.

bq. Do we have a wiki page or something where we are keeping track of what is 
currently supported using vectorization?
  That's a good idea. I agree we should track this so that the community is aware. 
It will also help and encourage folks to identify areas to contribute.


 implement vectorized math functions
 ---

 Key: HIVE-4822
 URL: https://issues.apache.org/jira/browse/HIVE-4822
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: vectorization-branch

 Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, 
 HIVE-4822.5-vectorization.patch


 Implement vectorized support for all the built-in math functions. This 
 includes implementing the vectorized operation, and tying it all together in 
 VectorizationContext so it runs end-to-end. These functions include:
 round(Col)
 Round(Col, N)
 Floor(Col)
 Ceil(Col)
 Rand(), Rand(seed)
 Exp(Col)
 Ln(Col)
 Log10(Col)
 Log2(Col)
 Log(base, Col)
 Pow(col, p), Power(col, p)
 Sqrt(Col)
 Bin(Col)
 Hex(Col)
 Unhex(Col)
 Conv(Col, from_base, to_base)
 Abs(Col)
 Pmod(arg1, arg2)
 Sin(Col)
 Asin(Col)
 Cos(Col)
 ACos(Col)
 Atan(Col)
 Degrees(Col)
 Radians(Col)
 Positive(Col)
 Negative(Col)
 Sign(Col)
 E()
 Pi()
 To reduce the total code volume, do an implicit type cast from non-double 
 input types to double. 
 Also, POSITIVE and NEGATIVE are syntactic sugar for unary + and unary -, so 
 reuse code for those as appropriate.
 Try to call the function directly in the inner loop and avoid new() or 
 expensive operations, as appropriate.
 Templatize the code where appropriate, e.g. all the unary functions of the form 
 DOUBLE func(DOUBLE)
 can probably be done with a template.
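
As a rough illustration of the shared shape of the DOUBLE func(DOUBLE) functions, the sketch below uses a java.util.function.DoubleUnaryOperator (Java 8 or later) in place of template-based code generation, so it is a stand-in technique rather than the approach proposed here. UnaryMathSketch, applyToDoubles and applyToLongs are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

final class UnaryMathSketch {
    /** Applies func to the first n inputs, writing into a caller-supplied output array. */
    static void applyToDoubles(double[] input, double[] output, int n, DoubleUnaryOperator func) {
        for (int i = 0; i < n; i++) {
            output[i] = func.applyAsDouble(input[i]);
        }
    }

    /** Same loop for a long column: each value is implicitly widened to double first. */
    static void applyToLongs(long[] input, double[] output, int n, DoubleUnaryOperator func) {
        for (int i = 0; i < n; i++) {
            output[i] = func.applyAsDouble(input[i]);
        }
    }

    public static void main(String[] args) {
        double[] out = new double[3];  // allocated once, outside the inner loop
        applyToLongs(new long[] {1, 4, 9}, out, 3, Math::sqrt);
        // prints [1.0, 2.0, 3.0]
        System.out.println(Arrays.toString(out));
        applyToDoubles(new double[] {-1.5, 0.0, 2.5}, out, 3, Math::abs);
        // prints [1.5, 0.0, 2.5]
        System.out.println(Arrays.toString(out));
    }
}
```

A generated class per function would call the math routine directly in the loop, which avoids the indirect call through the functional interface; the sketch only shows that the loop body is otherwise identical across functions.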


