[jira] [Updated] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-28 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4975:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and branch. Thanks [~kevinwilfong]!

 Reading orc file throws exception after adding new column
 -

 Key: HIVE-4975
 URL: https://issues.apache.org/jira/browse/HIVE-4975
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0
 Environment: hive 0.11.0 hadoop 1.0.0
Reporter: cyril liao
Assignee: Kevin Wilfong
Priority: Critical
  Labels: orcfile
 Fix For: 0.13.0

 Attachments: HIVE-4975.1.patch.txt, HIVE-4975.2.patch


 ORC file read fails after a column is added to the table.
 Create a table with three columns (a string, b string, c string).
 Add a new column after c by executing ALTER TABLE table ADD COLUMNS (d 
 string).
 Execute the HiveQL query select d from table; the following exception is thrown:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row [Error getting row data with 
 exception java.lang.ArrayIndexOutOfBoundsException: 4
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row [Error getting row data with exception 
 java.lang.ArrayIndexOutOfBoundsException: 4
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
  ]
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 d
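
 The failure mode in this report can be illustrated with a minimal Python
 sketch. It is not the committed patch and does not use Hive's classes: the
 function names and the row layout are hypothetical stand-ins. It assumes that
 rows written before the ALTER TABLE carry only the original three fields, and
 shows how an unguarded positional lookup fails while a guarded lookup reads
 the added column as NULL.

```python
# Hypothetical sketch: a row stored before ALTER TABLE has three fields,
# while the table schema now lists four columns (a, b, c, d).
SCHEMA = ["a", "b", "c", "d"]


def get_struct_field_data_unguarded(row, field_index):
    # Positional lookup with no bounds check; fails for columns added later.
    return row[field_index]


def get_struct_field_data(row, field_index):
    # Guarded lookup: a column missing from an older row is read as NULL (None).
    if field_index >= len(row):
        return None
    return row[field_index]


old_row = ["x", "y", "z"]      # written with the three-column schema
d_index = SCHEMA.index("d")    # position of the newly added column

try:
    get_struct_field_data_unguarded(old_row, d_index)
    unguarded_failed = False
except IndexError:
    unguarded_failed = True

guarded_value = get_struct_field_data(old_row, d_index)
```

 Under these assumptions the unguarded lookup raises an index error, the
 Python analogue of the ArrayIndexOutOfBoundsException in the trace, and the
 guarded lookup yields None for column d.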
  

[jira] [Commented] (HIVE-6771) Update WebHCat E2E tests now that comments is reported correctly in describe table output

2014-03-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950424#comment-13950424
 ] 

Ashutosh Chauhan commented on HIVE-6771:


+1

 Update WebHCat E2E tests now that comments is reported correctly in describe 
 table output
 ---

 Key: HIVE-6771
 URL: https://issues.apache.org/jira/browse/HIVE-6771
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.13.0

 Attachments: HIVE-6771.patch


 HIVE-6681 corrected the comments in the describe table output; earlier it 
 would show "from deserializer" in the comments.
 Some WebHCat E2E tests still check for the string "from deserializer", even 
 where it overshadowed the actual comments.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6447) Bucket map joins in hive-tez

2014-03-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6447:
-

   Resolution: Fixed
Fix Version/s: 0.14.0
   0.13.0
   Status: Resolved  (was: Patch Available)

 Bucket map joins in hive-tez
 

 Key: HIVE-6447
 URL: https://issues.apache.org/jira/browse/HIVE-6447
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, 
 HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, 
 HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, 
 HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, 
 HIVE-6447.WIP.patch


 Support bucket map joins in tez.





[jira] [Commented] (HIVE-6447) Bucket map joins in hive-tez

2014-03-28 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950427#comment-13950427
 ] 

Vikram Dixit K commented on HIVE-6447:
--

Committed to trunk and branch-0.13. Thanks for the reviews [~sseth], 
[~hagleitn], [~rhbutani]. Thanks [~alangates] [~thejas] for the test runs.

 Bucket map joins in hive-tez
 

 Key: HIVE-6447
 URL: https://issues.apache.org/jira/browse/HIVE-6447
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, 
 HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, 
 HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, 
 HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, 
 HIVE-6447.WIP.patch


 Support bucket map joins in tez.





[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view

2014-03-28 Thread Adrian Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Wang updated HIVE-6765:
--

Component/s: (was: Query Processor)

 ASTNodeOrigin unserializable leads to fail when join with view
 --

 Key: HIVE-6765
 URL: https://issues.apache.org/jira/browse/HIVE-6765
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Adrian Wang
 Attachments: HIVE-6765.patch.1


 When a view contains a UDF and the view is used in a JOIN operation, Hive 
 fails with a stack trace like:
 Caused by: java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
   at java.lang.Class.newInstance0(Class.java:359)
   at java.lang.Class.newInstance(Class.java:327)
   at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)





[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950531#comment-13950531
 ] 

Justin Coffey commented on HIVE-6757:
-

Owen, the solution you're proposing means that there is no seamless upgrade path 
for existing parquet-hive users and that somewhere on the hive wiki there will 
have to be a call-out: "Attention existing parquet users, you must include the 
parquet-hive.jar when upgrading to hive 13. We're sorry, but this is the price 
you have to pay for being an early adopter and driving functionality."

One of the goals of the #HIVE-5783 patch was to make the lives of parquet users 
easier (there were of course many other reasons, but ease of use is a good goal 
in and of itself).  The classes as they are do no harm and it's hard to see how 
they pollute the code base of Hive in any significant way.  This patch kinda 
sorta seems a tiny bit punitive if you ask me.

Please don't take any of this the wrong way, but I believe this is what a fair 
chunk of the parquet-hive community might think if this patch is committed.

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950853#comment-13950853
 ] 

Brock Noland commented on HIVE-6757:


Great points Justin. Many folks in the Hive community want this code, which is 
not against any Apache or Hive policy.

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950911#comment-13950911
 ] 

Owen O'Malley commented on HIVE-6757:
-

Justin,
   They already have parquet-hive.jar that they've manually added to their 
installation. Giving them an upgraded jar to work with Hive 0.13 is a better 
answer than making conflicting classes in Hive itself.

In fact, the way that HIVE-5783 was done imposes a significant chance that 
class conflicts will occur for users that have manually installed the parquet 
jars. I'm not trying to force reverting HIVE-5783 out of Hive 0.13, but leaving 
these classes in the parquet jars and not in Hive is a much better answer. 

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Updated] (HIVE-6767) Golden file updates for hadoop-2

2014-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6767:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 Golden file updates for hadoop-2
 

 Key: HIVE-6767
 URL: https://issues.apache.org/jira/browse/HIVE-6767
 Project: Hive
  Issue Type: Task
  Components: Tests
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6767.patch








[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950921#comment-13950921
 ] 

Xuefu Zhang commented on HIVE-6757:
---

If removing the code helped Hive functionally or performance-wise, I might be 
convinced by the proposal to remove this small piece of code. Based on what 
we gain from this removal, it's hard to be convinced that it benefits 
anything at all, while it discourages some hive/parquet users who really care. 
As for most other Hive users, who cares about two extra classes they don't 
need to bother with?

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951057#comment-13951057
 ] 

Eric Hanson commented on HIVE-6633:
---

[~thejas] Can you commit this to 0.13 please?

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.
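
 As a rough illustration of what "passing the -D arguments through" means, the
 Python sketch below collects -Dkey=value arguments into a configuration map
 that could then be handed to an embedded metastore. It is not the code in the
 attached patch; parse_define_args and uses_embedded_metastore are hypothetical
 helpers, and the argument values are the ones quoted in this description.

```python
def parse_define_args(argv):
    """Collect -Dkey=value arguments into a dict; other arguments are ignored."""
    overrides = {}
    for arg in argv:
        if arg.startswith("-D") and "=" in arg:
            key, _, value = arg[2:].partition("=")
            overrides[key] = value
    return overrides


def uses_embedded_metastore(conf):
    # Per the description, an empty hive.metastore.uris selects an embedded metastore.
    return conf.get("hive.metastore.uris", "") == ""


argv = [
    "-useHCatalog",
    "-Dhive.metastore.uris=",
    "-Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ",
]
conf = parse_define_args(argv)
```

 With the arguments above, conf carries both properties, so code that creates
 the embedded metastore could read the connection password from it instead of
 falling back to defaults.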





[jira] [Created] (HIVE-6772) Virtual columns when used with Lateral View Explode results in SemanticException [Error 10004]

2014-03-28 Thread Steve Ogden (JIRA)
Steve Ogden created HIVE-6772:
-

 Summary: Virtual columns when used with Lateral View Explode 
results in SemanticException [Error 10004]
 Key: HIVE-6772
 URL: https://issues.apache.org/jira/browse/HIVE-6772
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.9.0
 Environment: Red Hat Enterprise Linux Server release 6.3 (Santiago)
Hadoop 2.0.0-cdh4.1.2
Hive 0.9.0
Reporter: Steve Ogden
Priority: Minor


When using the virtual columns with 'lateral view explode', I get the following 
error:

FAILED: SemanticException [Error 10004]: Line 3:22 Invalid table alias or 
column reference 'INPUT__FILE__NAME': (possible column names are: _col0, _col1, 
_col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, 
_col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22)

Here is the query:
select
  newMd5(concat(INPUT__FILE__NAME,BLOCK__OFFSET__INSIDE__FILE)) ukey,
  flat_ric_cd as ric_cd
from edwpoc.ts_rtd_gs_stg
lateral view explode(split(ric_cd,',')) subView as flat_ric_cd
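
For readers unfamiliar with the construct, the intended result of the query can
be sketched in Python. This only models the expected semantics (one output row
per element of the split column, with the source row's virtual columns still
available); it does not reproduce the SemanticException. The helper
lateral_view_explode and the sample values are hypothetical, and plain string
concatenation stands in for the reporter's newMd5(concat(...)) call.

```python
def lateral_view_explode(rows, column, sep=","):
    """Yield one output row per element of row[column].split(sep)."""
    for row in rows:
        for element in row[column].split(sep):
            out = dict(row)                   # keep all source columns
            out["flat_" + column] = element   # add the exploded value
            yield out


# Made-up sample rows; the two virtual columns are modelled as ordinary fields.
rows = [
    {"INPUT__FILE__NAME": "part-0", "BLOCK__OFFSET__INSIDE__FILE": 0, "ric_cd": "A,B"},
    {"INPUT__FILE__NAME": "part-0", "BLOCK__OFFSET__INSIDE__FILE": 17, "ric_cd": "C"},
]

result = [
    (r["INPUT__FILE__NAME"] + str(r["BLOCK__OFFSET__INSIDE__FILE"]), r["flat_ric_cd"])
    for r in lateral_view_explode(rows, "ric_cd")
]
```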






[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13950996#comment-13950996
 ] 

Justin Coffey commented on HIVE-6757:
-

I guess my point is simply that early adopters are penalized for life whereas 
new users get the full benefit of the patch.  I agree that the penalty is 
pretty small, but the two classes kicking around in the parquet package are 
even less of a penalty to the hive code base.  Thus I remain against pulling 
them out.

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Updated] (HIVE-6771) Update WebHCat E2E tests now that comments is reported correctly in describe table output

2014-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6771:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk. Thanks, Deepesh!

 Update WebHCat E2E tests now that comments is reported correctly in describe 
 table output
 ---

 Key: HIVE-6771
 URL: https://issues.apache.org/jira/browse/HIVE-6771
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.13.0

 Attachments: HIVE-6771.patch


 HIVE-6681 corrected the comments in the describe table output; earlier it 
 would show "from deserializer" in the comments.
 Some WebHCat E2E tests still check for the string "from deserializer", even 
 where it overshadowed the actual comments.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Open  (was: Patch Available)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Patch Available  (was: Open)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





Review Request 19789: HIVE-6739 Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits

2014-03-28 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19789/
---

Review request for hive, Gunther Hagleitner and Vikram Dixit Kumaraswamy.


Repository: hive-git


Description
---

See jira


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java c247030 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java 
720b8d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 5f0f353 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 5dd8f98 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java fdbd996 
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 38c4c11 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java e1cc3f4 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TezWork.java f974c57 
  ql/src/java/org/apache/hadoop/hive/ql/plan/UnionWork.java 60781e6 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 78f1a8f 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java 
d2c332c 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java 
5ad4250 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 859b5ad 

Diff: https://reviews.apache.org/r/19789/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Resolved] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-6758.
---

Resolution: Cannot Reproduce

Closed as unreproducible. Feel free to reopen it if a repro case can be provided.

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0, 0.12.0
 Environment: CDH4.5
Reporter: Johndee Burks

 In the hive CLI you could easily integrate its use into a script and background 
 the process like this: 
 hive -e some query 
 Beeline does not run when you do the same, even with the -f switch. 





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Attachment: HIVE-6752.3.patch

Patch updated with a small bug fix identified in testing.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6734) DDL locking too course grained in new db txn manager

2014-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6734:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk.

 DDL locking too course grained in new db txn manager
 

 Key: HIVE-6734
 URL: https://issues.apache.org/jira/browse/HIVE-6734
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6734.patch, HIVE-6734.patch


 All DDL operations currently acquire an exclusive lock.  This is too coarse 
 grained, as some operations like alter table add partition shouldn't get an 
 exclusive lock on the entire table.
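
 The distinction being drawn can be sketched in Python as follows. This is a
 generic shared/exclusive compatibility check, not Hive's lock manager: the
 operation-to-lock mapping is a hypothetical illustration and does not claim
 to match what the patch assigns to each DDL statement.

```python
SHARED, EXCLUSIVE = "SHARED", "EXCLUSIVE"

# Hypothetical mapping, for illustration only; not Hive's actual lock table.
LOCK_FOR_OPERATION = {
    "ALTER TABLE ADD PARTITION": SHARED,
    "DROP TABLE": EXCLUSIVE,
}


def compatible(held, requested):
    # Two locks on the same table can coexist only when both are shared.
    return held == SHARED and requested == SHARED


def can_run_concurrently(op_a, op_b):
    return compatible(LOCK_FOR_OPERATION[op_a], LOCK_FOR_OPERATION[op_b])
```

 If every DDL operation mapped to EXCLUSIVE, as the description says is
 currently the case, no two operations on the same table could overlap; with a
 finer mapping, two add-partition operations could proceed together.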





[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan

2014-03-28 Thread Selina Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951150#comment-13951150
 ] 

Selina Zhang commented on HIVE-6492:


[~leftylev] Thanks for the reminder! We can put: 
"This controls how many partitions can be scanned for each partitioned table. 
The default value -1 means no limit."  
What do you think? 

 limit partition number involved in a table scan
 ---

 Key: HIVE-6492
 URL: https://issues.apache.org/jira/browse/HIVE-6492
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Selina Zhang
Assignee: Selina Zhang
 Fix For: 0.13.0

 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, 
 HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, 
 HIVE-6492.5.patch.txt, HIVE-6492.6.patch.txt, HIVE-6492.7.parch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 To protect the cluster, a new configuration variable 
 hive.limit.query.max.table.partition is added to the Hive configuration to
 limit the number of table partitions involved in a table scan. 
 The default value will be set to -1, which means there is no limit by default. 
 This variable will not affect metadata-only queries.
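
 The rule described here can be sketched in Python. This is a minimal model of
 the stated behaviour, not the patch itself: check_partition_limit, its
 parameters and the exception type are hypothetical names, and only the -1
 default and the metadata-only exemption come from the description.

```python
class PartitionLimitExceeded(Exception):
    """Raised when a table scan touches more partitions than allowed."""


def check_partition_limit(num_partitions, max_partitions=-1, metadata_only=False):
    # -1 (the default) means no limit; metadata-only queries are not affected.
    if metadata_only or max_partitions == -1:
        return
    if num_partitions > max_partitions:
        raise PartitionLimitExceeded(
            "scan touches %d partitions, limit is %d" % (num_partitions, max_partitions)
        )
```

 For example, check_partition_limit(1000) passes with the default of -1, while
 check_partition_limit(6, max_partitions=5) raises.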





[jira] [Commented] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.

2014-03-28 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951156#comment-13951156
 ] 

Harish Butani commented on HIVE-6686:
-

+1 for 0.13

 webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of 
 log4j.properties file on local filesystem.
 --

 Key: HIVE-6686
 URL: https://issues.apache.org/jira/browse/HIVE-6686
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.13.0

 Attachments: HIVE-6686.patch








[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951162#comment-13951162
 ] 

Sergey Shelukhin commented on HIVE-6188:


Will also document a bunch of other settings in this JIRA.

 Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6188:
---

Fix Version/s: 0.13.0

 Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951169#comment-13951169
 ] 

Eric Hanson commented on HIVE-6633:
---

[~rhbutani] Can you approve this to go into 0.13 please?

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Updated] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego

2014-03-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6697:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Patch committed to 0.13 branch and trunk.
Thanks for the contribution Dilli!


 HiveServer2 secure thrift/http authentication needs to support SPNego 
 --

 Key: HIVE-6697
 URL: https://issues.apache.org/jira/browse/HIVE-6697
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Dilli Arumugam
Assignee: Dilli Arumugam
 Fix For: 0.13.0

 Attachments: HIVE-6697.1.patch, HIVE-6697.2.patch, HIVE-6697.3.patch, 
 HIVE-6697.4.patch, hive-6697-req-impl-verify.md


 Looking to integrate Apache Knox with HiveServer2 secure 
 thrift/http.
 Found that thrift/http uses some form of Kerberos authentication that is not 
 SPNego. Since it goes over the HTTP protocol, I expected it to use the SPNego 
 protocol.
 Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase 
 Stargate using SPNego for authentication.
 Requesting that HiveServer2 secure thrift/http authentication support SPNego.
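
 For readers unfamiliar with SPNego over HTTP: the client sends its token in 
 an "Authorization: Negotiate <base64 token>" header (RFC 4559). The sketch 
 below only formats and parses that header value. It is not HiveServer2 or 
 Knox code; the class name NegotiateHeader is hypothetical, and the token 
 bytes are a placeholder, since a real token would come from a GSS-API 
 exchange that is out of scope here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Hive or Knox.
final class NegotiateHeader {
    static final String SCHEME = "Negotiate";

    // Builds the Authorization header value for an already obtained token.
    static String format(byte[] token) {
        return SCHEME + " " + Base64.getEncoder().encodeToString(token);
    }

    // Returns the decoded token, or null if the header is absent or does not
    // use the Negotiate scheme.
    static byte[] parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        String value = headerValue.trim();
        String prefix = SCHEME + " ";
        if (!value.regionMatches(true, 0, prefix, 0, prefix.length())) {
            return null;
        }
        return Base64.getDecoder().decode(value.substring(prefix.length()).trim());
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for a real SPNego token.
        byte[] token = "placeholder-token".getBytes(StandardCharsets.UTF_8);
        String header = format(token);
        System.out.println("Authorization: " + header);
        System.out.println("round trip ok: " + Arrays.equals(token, parse(header)));
    }
}
```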





[jira] [Commented] (HIVE-5835) Null pointer exception in DeleteDelegator in templeton code

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951177#comment-13951177
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-5835:
-

The errors are not related to my change and I have locally verified it. cc-ing 
[~thejas] for reviewing this.

 Null pointer exception in DeleteDelegator in templeton code 
 

 Key: HIVE-5835
 URL: https://issues.apache.org/jira/browse/HIVE-5835
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-5835.1.patch


 The following NPE is possible with the current implementation:
 ERROR | 13 Nov 2013 08:01:04,292 | 
 org.apache.hcatalog.templeton.CatchallExceptionMapper | 
 java.lang.NullPointerException
 at org.apache.hcatalog.templeton.tool.JobState.getChildren(JobState.java:180)
 at org.apache.hcatalog.templeton.DeleteDelegator.run(DeleteDelegator.java:51)
 at org.apache.hcatalog.templeton.Server.deleteJobId(Server.java:849)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
 at 
 com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
 at 
 com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
 at 
 com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
 at 
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at 
 com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
 at 
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at 
 com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
 at 
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480)
 at 
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411)
 at 
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360)
 at 
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350)
 at 
 com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
 at 
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
 at 
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
 at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360)
 at 
 org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:382)
 at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:85)
 at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477)
 at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
 at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
 at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
 at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
 at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
 at org.eclipse.jetty.server.Server.handle(Server.java:349)
 at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
 at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:910)
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
 at 
 org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
 at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
 at 
 org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
 at 
 

[jira] [Updated] (HIVE-6710) Deadlocks seen in transaction handler using mysql

2014-03-28 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6710:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk & 0.13. Thanks, Alan!

 Deadlocks seen in transaction handler using mysql
 -

 Key: HIVE-6710
 URL: https://issues.apache.org/jira/browse/HIVE-6710
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6710.patch


 When multiple clients attempt to obtain locks, a deadlock on the MySQL 
 database occasionally occurs.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951190#comment-13951190
 ] 

Harish Butani commented on HIVE-6633:
-

+1 for 0.13



[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951218#comment-13951218
 ] 

Owen O'Malley commented on HIVE-6757:
-

The point is that these files are *CREATING* a new *PUBLIC* API for Hive. That 
API starts out deprecated, which just creates confusion and noise. The users 
already need to update their manually installed parquet jars. This is the time 
that imposes the *LEAST* disruption on the users of Apache Hive. If we release 
them, there will be user confusion over duplicated classes. Hive users won't 
expect to see classes in parquet.* in the hive-exec jar. *THAT* will create 
brand new user confusion.

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6188:
---

Attachment: HIVE-6188.patch

Doc patch

 Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6188.patch


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql & hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6188:
---

Status: Patch Available  (was: Open)



[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary

2014-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951257#comment-13951257
 ] 

Thejas M Nair commented on HIVE-6738:
-

Comments on the patch-
- I think it is better to log at debug level instead of info for these 
messages, as it is logged for every request.
- If the proxy user is already set in SessionManager through the url, I think 
we can skip the check in sessionconf.


 HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying 
 intermediary
 

 Key: HIVE-6738
 URL: https://issues.apache.org/jira/browse/HIVE-6738
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Dilli Arumugam
Assignee: Dilli Arumugam
 Attachments: HIVE-6738.patch, hive-6738-req-impl-verify-rev1.md, 
 hive-6738-req-impl-verify.md


 See the already implemented JIRA
  https://issues.apache.org/jira/browse/HIVE-5155
 Support secure proxy user access to HiveServer2
 That fix expects the hive.server2.proxy.user parameter to come in the Thrift body.
 When an intermediary gateway like Apache Knox authenticates the end 
 client and then proxies the request to HiveServer2, it is not practical for 
 the intermediary to modify the Thrift content.
 An intermediary like Apache Knox should be able to assert doAs in a query 
 parameter. This paradigm is already established by other Hadoop ecosystem 
 components like WebHDFS, WebHCat, Oozie and HBase, and Hive needs to be 
 aligned with them.
 The doAs asserted in the query parameter should override any doAs specified in 
 the Thrift body.
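
 The precedence rule requested above can be sketched as follows. This is an 
 illustration under stated assumptions, not the HIVE-6738 patch: the class 
 DoAsResolver and its methods are hypothetical, the query string is parsed 
 by hand rather than through any servlet or Hive API, and thriftBodyUser 
 stands for whatever value hive.server2.proxy.user carried in the Thrift 
 body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HiveServer2.
final class DoAsResolver {

    // Splits a raw query string such as "doAs=alice&x=1" into decoded pairs.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // A non-empty doAs query parameter wins; otherwise fall back to the
    // proxy user taken from the Thrift body (which may be null).
    static String effectiveUser(String query, String thriftBodyUser) {
        String fromQuery = parseQuery(query).get("doAs");
        if (fromQuery != null && !fromQuery.isEmpty()) {
            return fromQuery;
        }
        return thriftBodyUser;
    }

    public static void main(String[] args) {
        System.out.println(effectiveUser("doAs=alice", "bob"));
        System.out.println(effectiveUser("other=1", "bob"));
    }
}
```

 Whether the server should trust the asserted doAs at all is a separate 
 authorization question (the intermediary must itself be an allowed proxy 
 user) that this sketch does not address.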





[jira] [Created] (HIVE-6773) Update readme for ptest2 framework

2014-03-28 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-6773:
---

 Summary: Update readme for ptest2 framework
 Key: HIVE-6773
 URL: https://issues.apache.org/jira/browse/HIVE-6773
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Szehon Ho
Priority: Minor
 Attachments: HIVE-6773.patch

Approvals dependency is needed for testing.  Need to add instructions.





[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary

2014-03-28 Thread Dilli Arumugam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951261#comment-13951261
 ] 

Dilli Arumugam commented on HIVE-6738:
--

Thanks Thejas for the review.
I will revise the code to address both comments
and then attach a new patch.




[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework

2014-03-28 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6773:


Description: 
Approvals dependency is needed for testing.  Need to add instructions.

NO PRECOMMIT TESTS

  was:Approvals dependency is needed for testing.  Need to add instructions.


 Update readme for ptest2 framework
 --

 Key: HIVE-6773
 URL: https://issues.apache.org/jira/browse/HIVE-6773
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Szehon Ho
Priority: Minor
 Attachments: HIVE-6773.patch


 Approvals dependency is needed for testing.  Need to add instructions.
 NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework

2014-03-28 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6773:


Attachment: HIVE-6773.patch



[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework

2014-03-28 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6773:


Status: Patch Available  (was: Open)

Hi [~brocknoland], adding some missing instructions as we discussed.



[jira] [Commented] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego

2014-03-28 Thread Dilli Arumugam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951299#comment-13951299
 ] 

Dilli Arumugam commented on HIVE-6697:
--

[~thejas]
Thanks for committing the patch



[jira] [Commented] (HIVE-6763) HiveServer2 in http mode might send same kerberos client ticket in case of concurrent requests resulting in server throwing a replay exception

2014-03-28 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951306#comment-13951306
 ] 

Vaibhav Gumashta commented on HIVE-6763:


[~alangates] Thanks a lot for running the tests. The issue is unrelated but I 
need to incorporate some feedback.

 HiveServer2 in http mode might send same kerberos client ticket in case of 
 concurrent requests resulting in server throwing a replay exception
 --

 Key: HIVE-6763
 URL: https://issues.apache.org/jira/browse/HIVE-6763
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6763.1.patch








[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951310#comment-13951310
 ] 

Hive QA commented on HIVE-6752:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12637458/HIVE-6752.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5499 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testBetweenFilters
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2017/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2017/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12637458

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: merge3.q.out
louter_join_ppr.q.out
load_dyn_part8.q.out
join_map_ppr.q.out
join26.q.out
join32_lessSize.q.out
join33.q.out
join9.q.out
join32.q.out
input42.q.out
input_part1.q.out
input_part2.q.out
input_part7.q.out
input_part9.q.out
input23.q.out
groupby_sort_6.q.out
groupby_ppr.q.out
groupby_map_ppr_multi_distinct.q.out
groupby_map_ppr_multi_distinct.q.out
groupby_map_ppr.q.out
filter_join_breaktask.q.out
columnstats_partlvl.q.out
bucketmapjoin8.q.out
bucketmapjoin9.q.out
bucketmapjoin_negative.q.out
bucketmapjoin_negative2.q.out
bucket3.q.out
auto_sortmerge_join_2.q.out
auto_sortmerge_join_3.q.out
annotate_stats_part.q.out

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, annotate_stats_part.q.out, auto_sortmerge_join_2.q.out, 
 auto_sortmerge_join_3.q.out, bucket3.q.out, bucketmapjoin8.q.out, 
 bucketmapjoin9.q.out, bucketmapjoin_negative.q.out, 
 bucketmapjoin_negative2.q.out, columnstats_partlvl.q.out, 
 filter_join_breaktask.q.out, groupby_map_ppr.q.out, 
 groupby_map_ppr_multi_distinct.q.out, groupby_map_ppr_multi_distinct.q.out, 
 groupby_ppr.q.out, groupby_sort_6.q.out, input23.q.out, input42.q.out, 
 input_part1.q.out, input_part2.q.out, input_part7.q.out, input_part9.q.out, 
 join26.q.out, join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, 
 join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
   

[jira] [Created] (HIVE-6774) Not a valid JAR errors from TestExecDriver

2014-03-28 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6774:


 Summary: Not a valid JAR errors from TestExecDriver
 Key: HIVE-6774
 URL: https://issues.apache.org/jira/browse/HIVE-6774
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere


If I wipe out my local Maven repository and run the command:
mvn clean install -Dtest=TestExecDriver -Phadoop-1

All of the TestExecDriver tests fail with the following errors:
{noformat}
Not a valid JAR: 
/Users/jdere/.m2/repository/org/apache/hive/hive-exec/0.14.0-SNAPSHOT/hive-exec-0.14.0-SNAPSHOT.jar
Execution failed with exit status: 255
Obtaining error information

Task failed!
Task ID:
  null

Logs:

/Users/jdere/dev/hive.git/ql/target/tmp/log/hive.log
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.session.SessionState.addLocalMapRedErrors(SessionState.java:919)
at 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:282)
at 
org.apache.hadoop.hive.ql.exec.TestExecDriver.executePlan(TestExecDriver.java:460)
at 
org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapPlan1(TestExecDriver.java:474)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at 
org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{noformat}





[jira] [Commented] (HIVE-6774) Not a valid JAR errors from TestExecDriver

2014-03-28 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951324#comment-13951324
 ] 

Jason Dere commented on HIVE-6774:
--

Looks like the test is relying on the hive-exec JAR being installed to the 
local maven repository, and it's not there yet during the maven test phase.  
[~brocknoland], would it be more appropriate for TestExecDriver to be added to 
itests? Or is it expected to run mvn clean install -DskipTests before 
actually running any tests?



[jira] [Commented] (HIVE-6774) Not a valid JAR errors from TestExecDriver

2014-03-28 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951326#comment-13951326
 ] 

Brock Noland commented on HIVE-6774:


You must run mvn install before running any tests, not as part of the install.

 Not a valid JAR errors from TestExecDriver
 

 Key: HIVE-6774
 URL: https://issues.apache.org/jira/browse/HIVE-6774
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere

 If I wipe out my local Maven repository and run the command:
 mvn clean install -Dtest=TestExecDriver -Phadoop-1
 All of the TestExecDriver tests fail with the following errors:
 {noformat}
 Not a valid JAR: 
 /Users/jdere/.m2/repository/org/apache/hive/hive-exec/0.14.0-SNAPSHOT/hive-exec-0.14.0-SNAPSHOT.jar
 Execution failed with exit status: 255
 Obtaining error information
 Task failed!
 Task ID:
   null
 Logs:
 /Users/jdere/dev/hive.git/ql/target/tmp/log/hive.log
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.session.SessionState.addLocalMapRedErrors(SessionState.java:919)
 at 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:282)
 at 
 org.apache.hadoop.hive.ql.exec.TestExecDriver.executePlan(TestExecDriver.java:460)
 at 
 org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapPlan1(TestExecDriver.java:474)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:168)
 at junit.framework.TestCase.runBare(TestCase.java:134)
 at junit.framework.TestResult$1.protect(TestResult.java:110)
 at junit.framework.TestResult.runProtected(TestResult.java:128)
 at junit.framework.TestResult.run(TestResult.java:113)
 at junit.framework.TestCase.run(TestCase.java:124)
 at junit.framework.TestSuite.runTest(TestSuite.java:243)
 at junit.framework.TestSuite.run(TestSuite.java:238)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
 {noformat}





[jira] [Updated] (HIVE-6314) The logging (progress reporting) is too verbose

2014-03-28 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6314:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to trunk and 0.13
thanks Navis

 The logging (progress reporting) is too verbose
 ---

 Key: HIVE-6314
 URL: https://issues.apache.org/jira/browse/HIVE-6314
 Project: Hive
  Issue Type: Bug
Reporter: Sam
Assignee: Navis
  Labels: logger
 Attachments: HIVE-6314.1.patch.txt, HIVE-6314.2.patch


 The progress report is issued every second even when no progress has been 
 made:
 {code}
 2014-01-27 10:35:55,209 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.68 
 sec
 2014-01-27 10:35:56,678 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.68 
 sec
 2014-01-27 10:35:59,344 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.68 
 sec
 2014-01-27 10:36:01,268 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 8.67 sec
 2014-01-27 10:36:03,149 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
 8.67 sec
 {code}
 This pollutes the logs and the screen, and people do not appreciate it as 
 much as the designers might have thought 
 ([http://stackoverflow.com/questions/20849289/how-do-i-limit-log-verbosity-of-hive],
  
 [http://stackoverflow.com/questions/14121543/controlling-the-level-of-verbosity-in-hive]).
 It would be nice to be able to control the level of verbosity (but *not* by 
 the {{-v}} switch!):
 # Make sure that the progress report is only issued where there is something 
 new to report; or
 # Remove all the progress messages; or
 # Make sure that progress is reported only every X sec (instead of every 1 
 second)
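
 A minimal sketch of how options 1 and 3 above could be combined, assuming a hypothetical helper class (ProgressThrottle and shouldPrint are illustrative names, not part of Hive): a progress line is printed only when its text changes, or when a configurable quiet interval has passed since the last print.

```java
import java.util.Objects;

/** Hypothetical helper, not Hive's actual reporter: decides whether a progress line is worth printing. */
final class ProgressThrottle {
    private final long maxSilenceMillis; // the "X sec" interval, in milliseconds
    private String lastLine;             // last progress text that was printed
    private long lastPrintMillis;        // timestamp of that print

    ProgressThrottle(long maxSilenceMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
    }

    /** Returns true if the line differs from the last printed one, or the quiet interval has elapsed. */
    boolean shouldPrint(String line, long nowMillis) {
        boolean changed = !Objects.equals(line, lastLine);
        boolean stale = lastLine != null && nowMillis - lastPrintMillis >= maxSilenceMillis;
        if (changed || stale) {
            lastLine = line;
            lastPrintMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

 A caller would build the progress text without the timestamp, pass it with the current time to shouldPrint, and log the line only when the result is true.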





[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-28 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951347#comment-13951347
 ] 

Anthony Hsu commented on HIVE-6570:
---

What concerns does [~appodictic] have?

 Hive variable substitution does not work with the source command
 --

 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6570.1.patch


 The following does not work:
 {code}
 source ${hivevar:test-dir}/test.q;
 {code}
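
 To illustrate the expected behaviour, here is a small sketch of hivevar substitution applied to the argument of source before the file is opened. It is a stand-in for illustration only; the class and method names are hypothetical and this is not Hive's actual substitution code.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a stand-in for Hive's variable substitution, not the actual implementation. */
final class HivevarSubstitution {
    // Matches ${hivevar:name}; the name may contain any character except '}'.
    private static final Pattern HIVEVAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    /** Replaces each ${hivevar:name} with its value; unknown names are left untouched. */
    static String substitute(String text, Map<String, String> hivevars) {
        Matcher m = HIVEVAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = hivevars.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

 With test-dir defined as a hivevar, the path in the source command above would be resolved before the script file is read.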





[jira] [Commented] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Johndee Burks (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951356#comment-13951356
 ] 

Johndee Burks commented on HIVE-6758:
-

The process stays stopped until it is brought to the foreground. 

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0, 0.12.0
 Environment: CDH4.5
Reporter: Johndee Burks

 In the Hive CLI you could easily integrate its use into a script and background 
 the process like this: 
 hive -e "some query" &
 Beeline does not run when you do the same, even with the -f switch. 





[jira] [Commented] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Johndee Burks (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951354#comment-13951354
 ] 

Johndee Burks commented on HIVE-6758:
-

[~xuefuz] It is as you say: when you run it without backgrounding, it works. But if I 
read your attempt correctly, you did not background it with '&' at the end. 

This works

{code}
[root@jrepo1-1 ~]# beeline -u jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n 
johndee -e show tables;
scan complete in 10ms
Connecting to jdbc:hive2://jrepo1-2.ent.cloudera.com:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.9.0-cdh4.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
+---+
| tab_name  |
+---+
| j1|
+---+
1 row selected (0.499 seconds)
Hive version 0.9.0-cdh4.1.2 by Apache
Closing: org.apache.hive.jdbc.HiveConnection
{code}

This does not: 

{code}
[root@jrepo1-1 ~]# beeline -u jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n 
johndee -e show tables; 
[1] 32040
[root@jrepo1-1 ~]#

[1]+  Stopped beeline -u 
jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n johndee -e show tables;
{code}

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0, 0.12.0
 Environment: CDH4.5
Reporter: Johndee Burks

 In the Hive CLI you could easily integrate its use into a script and background 
 the process like this: 
 hive -e "some query" &
 Beeline does not run when you do the same, even with the -f switch. 





Review Request 19801: review request for HIVE-6738, HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary

2014-03-28 Thread dilli dorai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19801/
---

Review request for hive, Thejas Nair and Vaibhav Gumashta.


Bugs: HIVE-6738
https://issues.apache.org/jira/browse/HIVE-6738


Repository: hive-git


Description
---

see the jira HIVE-6738


Diffs
-

  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
7f6687e 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
58f3e3b 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
c579db5 

Diff: https://reviews.apache.org/r/19801/diff/


Testing
---

see the attachment to Jira HIVE-6738
https://issues.apache.org/jira/secure/attachment/12637059/hive-6738-req-impl-verify.md


Thanks,

dilli dorai



[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Attachment: HIVE-6752.4.patch

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Open  (was: Patch Available)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951396#comment-13951396
 ] 

Jitendra Nath Pandey commented on HIVE-6752:


Latest patch fixes the test.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6752:
---

Status: Patch Available  (was: Open)

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Status: Open  (was: Patch Available)

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Attachment: HIVE-6662.2.patch

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6662:
---

Status: Patch Available  (was: Open)

Submitting same patch again for pre-commit tests.

 Vector Join operations with DATE columns fail
 -

 Key: HIVE-6662
 URL: https://issues.apache.org/jira/browse/HIVE-6662
 Project: Hive
  Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 0.13.0

 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch


 Trying to generate a DATE column as part of a JOIN's output throws an 
 exception
 {code}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible 
 Long vector column and primitive category DATE
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 {code}





[jira] [Commented] (HIVE-6726) Hcat cli does not close SessionState

2014-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951433#comment-13951433
 ] 

Sushanth Sowmyan commented on HIVE-6726:


[~thejas]/[~hagleitn], can I bother either of you for a review for this?

 Hcat cli does not close SessionState
 

 Key: HIVE-6726
 URL: https://issues.apache.org/jira/browse/HIVE-6726
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6726.patch


 When running HCat E2E tests, it was observed that hcat cli left Tez sessions 
 on the RM which ultimately die upon timeout. Expected behavior is to clean up 
 the Tez sessions immediately upon exit. This is causing slowness in system 
 tests, as over time a lot of orphaned Tez sessions hang around.
 On looking through the code, the cause seems obvious in retrospect: HCatCli 
 starts a SessionState but does not explicitly call close on it, exiting 
 the JVM through System.exit instead. This needs to be changed to explicitly 
 call SessionState.close() before exiting.
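
 The close-before-exit pattern described above might look roughly like the sketch below. CliSession and CliExit are hypothetical stand-ins so the example is self-contained; Hive's real SessionState API and the actual HCatCli code are not reproduced here.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for a CLI session; Hive's real SessionState API is not reproduced here. */
final class CliSession implements Closeable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() throws IOException {
        closed = true; // a real session would release remote resources (for example Tez sessions) here
    }
}

final class CliExit {
    /** Runs the command, closes the session even if the command fails, and returns an exit code. */
    static int runAndClose(CliSession session, Runnable command) {
        int exitCode = 0;
        try {
            command.run();
        } catch (RuntimeException e) {
            exitCode = 1; // command failed
        } finally {
            try {
                session.close();
            } catch (IOException e) {
                if (exitCode == 0) {
                    exitCode = 2; // command succeeded but cleanup failed
                }
            }
        }
        return exitCode;
    }

    public static void main(String[] args) {
        int code = runAndClose(new CliSession(), () -> System.out.println("command ran"));
        System.exit(code); // exit only after the session has been closed
    }
}
```

 The point of the design is that System.exit is reached on a single path, after the finally block, so no early exit can skip the cleanup.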





[jira] [Commented] (HIVE-5677) Beeline warns about unavailable files if HIVE_OPTS is set

2014-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951436#comment-13951436
 ] 

Sushanth Sowmyan commented on HIVE-5677:


This patch works for me, +1. 

[~xuefuz], I'm afraid I don't know about the remote debugging aspect, how do 
you normally do that?

 Beeline warns about unavailable files if HIVE_OPTS is set
 -

 Key: HIVE-5677
 URL: https://issues.apache.org/jira/browse/HIVE-5677
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.12.0
Reporter: Sushanth Sowmyan
Assignee: Navis
 Attachments: HIVE-5677.1.patch.txt


 NO PRECOMMIT TESTS
 This is similar to HIVE-5085.
 Beeline complains about files not existing if HIVE_OPTS are set.
 In the Beeline command-line script as well, we should see whether setting HIVE_OPTS to 
 ''  makes sense.





[jira] [Updated] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6633:
---

Fix Version/s: 0.13.0

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database: 
 the command line -D arguments passed to pig are not passed on to 
 the metastore when the embedded metastore is created. Setting 
 hive.metastore.uris to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.
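
 As a rough illustration of the idea (not the attached patch, whose contents are not shown in this message), command line -D values could be copied into the configuration handed to an embedded metastore. The class name, method name and prefix list below are assumptions made for the example.

```java
import java.util.Properties;

/** Hypothetical helper: forwards selected -D style properties into a fresh configuration. */
final class MetastoreArgs {
    /** Copies every property whose name starts with one of the prefixes from source into a new Properties. */
    static Properties forward(Properties source, String... prefixes) {
        Properties conf = new Properties();
        for (String name : source.stringPropertyNames()) {
            for (String prefix : prefixes) {
                if (name.startsWith(prefix)) {
                    conf.setProperty(name, source.getProperty(name));
                    break; // copy each property once, even if several prefixes match
                }
            }
        }
        return conf;
    }
}
```

 Called as MetastoreArgs.forward(System.getProperties(), "hive.", "javax.jdo."), this would pick up both properties from the pig command line above while leaving unrelated JVM properties out.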





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951449#comment-13951449
 ] 

Sushanth Sowmyan commented on HIVE-6633:


Committed to 0.13. Thanks Eric and Harish!

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database: 
 the command line -D arguments passed to pig are not passed on to 
 the metastore when the embedded metastore is created. Setting 
 hive.metastore.uris to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Updated] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HIVE-6758:
--

Environment: (was: CDH4.5)

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0, 0.12.0
Reporter: Johndee Burks

 In the Hive CLI you could easily integrate its use into a script and background 
 the process like this: 
 hive -e "some query" &
 Beeline does not run when you do the same, even with the -f switch. 





[jira] [Updated] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HIVE-6758:
--

Affects Version/s: (was: 0.12.0)

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0
Reporter: Johndee Burks

 In the Hive CLI you could easily integrate its use into a script and background 
 the process like this: 
 hive -e "some query" &
 Beeline does not run when you do the same, even with the -f switch. 





Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/
---

(Updated March 28, 2014, 9:56 p.m.)


Review request for hive and Eric Hanson.


Bugs: HIVE-6752
https://issues.apache.org/jira/browse/HIVE-6752


Repository: hive-git


Description
---

Vectorized Between and IN expressions don't work with decimal, date types.


Diffs (updated)
-

  ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
  ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
2229079 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
96e74a9 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
c2240c0 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 
5ebab70 
  ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
  ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/19718/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary

2014-03-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951470#comment-13951470
 ] 

Thejas M Nair commented on HIVE-6738:
-

+1

 HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying 
 intermediary
 

 Key: HIVE-6738
 URL: https://issues.apache.org/jira/browse/HIVE-6738
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Dilli Arumugam
Assignee: Dilli Arumugam
 Attachments: HIVE-6738.1.patch, HIVE-6738.patch, 
 hive-6738-req-impl-verify-rev1.md, hive-6738-req-impl-verify.md


 See the already implemented JIRA
  https://issues.apache.org/jira/browse/HIVE-5155
 Support secure proxy user access to HiveServer2
 That fix expects the hive.server2.proxy.user parameter to come in the Thrift body.
 When an intermediary gateway like Apache Knox authenticates the end 
 client and then proxies the request to HiveServer2, it is not practical for 
 the intermediary to modify the Thrift content.
 An intermediary like Apache Knox should be able to assert doAs in a query 
 parameter. This paradigm is already established by other Hadoop ecosystem 
 components like WebHDFS, WebHCat, Oozie and HBase, and Hive needs to be 
 aligned with them.
 The doAs asserted in the query parameter should override any doAs specified in 
 the Thrift body.
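
 The precedence rule requested above can be sketched as follows. This is an illustration of the rule only: DoAsResolver and its methods are hypothetical names, and the code does not reflect how ThriftHttpServlet actually reads the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Sketch of the precedence rule only; class and method names are hypothetical. */
final class DoAsResolver {
    /** Returns the decoded value of the first doAs query parameter, or null if absent or empty. */
    static String doAsFromQuery(String queryString) {
        if (queryString == null) {
            return null;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0 || !"doAs".equals(pair.substring(0, eq))) {
                continue;
            }
            try {
                String value = URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                return value.isEmpty() ? null : value;
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always available on the JVM
            }
        }
        return null;
    }

    /** The doAs from the query parameter wins over the proxy user carried in the Thrift body. */
    static String effectiveProxyUser(String queryString, String proxyUserFromThriftBody) {
        String fromQuery = doAsFromQuery(queryString);
        return (fromQuery != null) ? fromQuery : proxyUserFromThriftBody;
    }
}
```

 When the intermediary sends no doAs parameter, the value from the Thrift body (the HIVE-5155 behaviour) is used unchanged.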





[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951480#comment-13951480
 ] 

Hive QA commented on HIVE-6188:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12637478/HIVE-6188.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5498 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2018/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2018/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12637478

 Document hive.metastore.try.direct.sql  hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6188.patch


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951481#comment-13951481
 ] 

Sergey Shelukhin commented on HIVE-6752:


+1 on most recent changes 

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951483#comment-13951483
 ] 

Jitendra Nath Pandey commented on HIVE-6188:


+1

 Document hive.metastore.try.direct.sql  hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6188.patch


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl

2014-03-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951488#comment-13951488
 ] 

Sergey Shelukhin commented on HIVE-6188:


Since it's a doc patch I will just commit later today

 Document hive.metastore.try.direct.sql  hive.metastore.try.direct.sql.ddl
 --

 Key: HIVE-6188
 URL: https://issues.apache.org/jira/browse/HIVE-6188
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lefty Leverenz
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6188.patch


 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl 
 configuration properties need to be documented in hive-default.xml.template 
 and the wiki.





[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs

2014-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951486#comment-13951486
 ] 

Sushanth Sowmyan commented on HIVE-6592:


Looks good to me from reading man curl:

{noformat}
   -k/--insecure
      (SSL) This option explicitly allows curl to perform insecure SSL
      connections and transfers. All SSL connections are attempted to be
      made secure by using the CA certificate bundle installed by default.
      This makes all connections considered insecure fail unless
      -k/--insecure is used.

      See this online resource for further details:
      http://curl.haxx.se/docs/sslcerts.html
{noformat}

+1.
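
For readers who want the same behaviour outside curl, here is a minimal Python sketch of what -k/--insecure amounts to: certificate and hostname verification are switched off for the connection. It is an illustration only and is not taken from the HIVE-6592 patch; the helper names make_ssl_context and fetch are hypothetical.

```python
import ssl
import urllib.request


def make_ssl_context(insecure: bool = False) -> ssl.SSLContext:
    """Return a TLS context; insecure=True mirrors curl's -k/--insecure."""
    # The default context verifies the peer against the installed CA bundle.
    context = ssl.create_default_context()
    if insecure:
        # Order matters: hostname checking must be disabled before
        # verify_mode can be set to CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch(url: str, insecure: bool = False) -> bytes:
    """Fetch a URL, optionally skipping certificate verification."""
    with urllib.request.urlopen(url, context=make_ssl_context(insecure)) as resp:
        return resp.read()
```

As with curl, skipping verification is only reasonable for test clusters with self-signed certificates, which is the E2E scenario discussed here.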

 WebHCat E2E test abort when pointing to https url of webhdfs
 

 Key: HIVE-6592
 URL: https://issues.apache.org/jira/browse/HIVE-6592
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.13.0

 Attachments: HIVE-6592.patch


 WebHCat E2E tests fail when run against an SSL-enabled webhdfs URL.
 NO PRECOMMIT TESTS





[jira] [Assigned] (HIVE-6758) Beeline only works in interactive mode

2014-03-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-6758:
-

Assignee: Xuefu Zhang

 Beeline only works in interactive mode
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0
Reporter: Johndee Burks
Assignee: Xuefu Zhang

 In the Hive CLI you could easily integrate it into a script and background 
 the process like this: 
 hive -e some query &
 Beeline does not run when you do the same, even with the -f switch. 
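
 A minimal Python sketch of launching a one-shot command detached from the terminal follows. It assumes, without confirmation from this issue, that the stall comes from the CLI trying to read the controlling terminal; the run_detached helper and the placeholder command are hypothetical and stand in for the real hive or beeline invocation.

```python
import subprocess
import sys


def run_detached(argv):
    """Start argv with no terminal on stdin and return the Popen handle."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # the child sees EOF instead of the TTY
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )


if __name__ == "__main__":
    # Placeholder command; substitute the real CLI invocation here.
    proc = run_detached([sys.executable, "-c", "print('query done')"])
    output, _ = proc.communicate()
    print(proc.returncode, output.decode().strip())
```

 Whether redirecting stdin is enough for Beeline is not established in this thread; treat it as something to try rather than a fix.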





[jira] [Updated] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs

2014-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6592:
---

Fix Version/s: (was: 0.13.0)
   0.14.0

 WebHCat E2E test abort when pointing to https url of webhdfs
 

 Key: HIVE-6592
 URL: https://issues.apache.org/jira/browse/HIVE-6592
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-6592.patch


 WebHCat E2E tests fail when run against an SSL-enabled webhdfs URL.
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs

2014-03-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951492#comment-13951492
 ] 

Sushanth Sowmyan commented on HIVE-6592:


Committed to trunk. Thanks, Deepesh! Setting the fix version to 0.14.

[~rhbutani], Deepesh would like to get this included in 0.13 as well. I think 
it makes sense for inclusion, since it's needed to allow our E2E tests to run 
in a secure environment. Could we backport this?

 WebHCat E2E test abort when pointing to https url of webhdfs
 

 Key: HIVE-6592
 URL: https://issues.apache.org/jira/browse/HIVE-6592
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-6592.patch


 WebHCat E2E tests fail when run against an SSL-enabled webhdfs URL.
 NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-6758) Beeline doesn't work with -e option when started in background

2014-03-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6758:
--

Summary: Beeline doesn't work with -e option when started in background  
(was: Beeline only works in interactive mode)

 Beeline doesn't work with -e option when started in background
 --

 Key: HIVE-6758
 URL: https://issues.apache.org/jira/browse/HIVE-6758
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.11.0
Reporter: Johndee Burks
Assignee: Xuefu Zhang

 In the Hive CLI you could easily integrate it into a script and background 
 the process like this: 
 hive -e some query &
 Beeline does not run when you do the same, even with the -f switch. 





[jira] [Updated] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs

2014-03-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6592:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 WebHCat E2E test abort when pointing to https url of webhdfs
 

 Key: HIVE-6592
 URL: https://issues.apache.org/jira/browse/HIVE-6592
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-6592.patch


 WebHCat E2E tests fail when run against an SSL-enabled webhdfs URL.
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-03-28 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951499#comment-13951499
 ] 

Harish Butani commented on HIVE-6757:
-

Hi Justin, Brock,

A couple of questions/thoughts:

1.
What if we include the parquet-hive.jar in the hive-exec shaded jar? Would this 
mitigate the upgrade issues for existing users? 

2. 
If they choose to, how will existing users migrate to the new classes? Do we 
provide metadata upgrade scripts? Do we have to support their existing SQL 
code, e.g. by adding checks in the Hive parsing layer to replace old parquet 
class references with the new classes?
So the migration process when we remove (now or in the future) the deprecated 
classes is not clear. Can you please help me understand how this will play 
out?
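
The idea in point 2 of replacing old class references can be sketched in Python as below. This is only an illustration of the rewrite step, not anything from the attached patches. The class names in OLD_TO_NEW are assumptions and must be checked against the actual Hive release, and rewrite_class_refs is a hypothetical helper.

```python
import re

# Assumed mapping from deprecated class names to org.apache replacements.
# Verify every name against the Hive release in use before relying on it.
OLD_TO_NEW = {
    "parquet.hive.DeprecatedParquetInputFormat":
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "parquet.hive.DeprecatedParquetOutputFormat":
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    "parquet.hive.serde.ParquetHiveSerDe":
        "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
}

# Class names appear in DDL as single-quoted string literals.
_QUOTED = re.compile(r"'([^']*)'")


def rewrite_class_refs(statement, mapping=OLD_TO_NEW):
    """Replace quoted class names found in mapping; leave everything else alone."""
    def swap(match):
        name = match.group(1)
        return "'" + mapping.get(name, name) + "'"
    return _QUOTED.sub(swap, statement)


if __name__ == "__main__":
    ddl = ("CREATE TABLE t (a int) "
           "ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'")
    print(rewrite_class_refs(ddl))
```

Matching only whole quoted literals keeps the rewrite idempotent, so the same step could be applied to stored metadata and to incoming statements without double substitution.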

 Remove deprecated parquet classes from outside of org.apache package
 

 Key: HIVE-6757
 URL: https://issues.apache.org/jira/browse/HIVE-6757
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6757.patch, parquet-hive.patch


 Apache shouldn't release projects with files outside of the org.apache 
 namespace.





[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs

2014-03-28 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951517#comment-13951517
 ] 

Harish Butani commented on HIVE-6592:
-

+1 for 0.13

 WebHCat E2E test abort when pointing to https url of webhdfs
 

 Key: HIVE-6592
 URL: https://issues.apache.org/jira/browse/HIVE-6592
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-6592.patch


 WebHCat E2E tests fail when run against an SSL-enabled webhdfs URL.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5677) Beeline warns about unavailable files if HIVE_OPTS is set

2014-03-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951523#comment-13951523
 ] 

Xuefu Zhang commented on HIVE-5677:
---

[~sushanth] I was talking about debug options that are usually set in HADOOP_OPTS. 
In the hive script, HIVE_OPTS takes what HADOOP_OPTS sets, so debug can be enabled.

Nevertheless, this patch is no longer necessary after HIVE-6652. Please try and 
let me know if the problem remains. 

 Beeline warns about unavailable files if HIVE_OPTS is set
 -

 Key: HIVE-5677
 URL: https://issues.apache.org/jira/browse/HIVE-5677
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.12.0
Reporter: Sushanth Sowmyan
Assignee: Navis
 Attachments: HIVE-5677.1.patch.txt


 NO PRECOMMIT TESTS
 This is similar to HIVE-5085.
 Beeline complains about files not existing if HIVE_OPTS is set.
 In the Beeline command-line shell script as well, we should see whether 
 setting HIVE_OPTS to '' makes sense.





[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: annotate_stats_part.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, bucket3.q.out, bucketmapjoin8.q.out, bucketmapjoin9.q.out, 
 bucketmapjoin_negative.q.out, bucketmapjoin_negative2.q.out, 
 columnstats_partlvl.q.out, filter_join_breaktask.q.out, 
 groupby_map_ppr.q.out, groupby_map_ppr_multi_distinct.q.out, 
 groupby_map_ppr_multi_distinct.q.out, groupby_ppr.q.out, 
 groupby_sort_6.q.out, input23.q.out, input42.q.out, input_part1.q.out, 
 input_part2.q.out, input_part7.q.out, input_part9.q.out, join26.q.out, 
 join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, 
 join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, 
 merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, 
 ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, 
 rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, 
 router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, 
 smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, 
 stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), 
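
 The type mismatch described above can be illustrated with a short Python sketch. It assumes partition values arrive as text (as in the spec ds=2011) and must be cast to the declared column type before they can match an int join key. It is not Hive's vectorization code, and parse_partition_spec, typed_partition_values and CASTS are hypothetical names.

```python
# Hypothetical casts for a few declared column types.
CASTS = {"int": int, "bigint": int, "double": float, "string": str}


def parse_partition_spec(spec):
    """Split 'ds=2011/hr=3' into {'ds': '2011', 'hr': '3'}; values stay strings."""
    return dict(part.split("=", 1) for part in spec.split("/"))


def typed_partition_values(spec, declared_types):
    """Cast each partition value to the type declared for its column."""
    raw = parse_partition_spec(spec)
    return {name: CASTS[declared_types[name]](value) for name, value in raw.items()}


if __name__ == "__main__":
    raw = parse_partition_spec("ds=2011")
    typed = typed_partition_values("ds=2011", {"ds": "int"})
    cint = 2011  # stands in for a value of the int column t2.cint
    print(raw["ds"] == cint)    # string vs int: no match
    print(typed["ds"] == cint)  # int vs int: match
```

 With the cast applied, both join keys have type int, which is what the expected plan shows for _col0 and cint.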

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: auto_sortmerge_join_2.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, bucket3.q.out, bucketmapjoin8.q.out, bucketmapjoin9.q.out, 
 bucketmapjoin_negative.q.out, bucketmapjoin_negative2.q.out, 
 columnstats_partlvl.q.out, filter_join_breaktask.q.out, 
 groupby_map_ppr.q.out, groupby_map_ppr_multi_distinct.q.out, 
 groupby_map_ppr_multi_distinct.q.out, groupby_ppr.q.out, 
 groupby_sort_6.q.out, input23.q.out, input42.q.out, input_part1.q.out, 
 input_part2.q.out, input_part7.q.out, input_part9.q.out, join26.q.out, 
 join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, 
 join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, 
 merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, 
 ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, 
 rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, 
 router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, 
 smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, 
 stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: 

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: bucket3.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: join_map_ppr.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: columnstats_partlvl.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: input_part9.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: bucketmapjoin8.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out



[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: groupby_map_ppr_multi_distinct.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out



[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: filter_join_breaktask.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out



[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: groupby_map_ppr.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out



[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: outer_join_ppr.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, 
 push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, 
 router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, 
 smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, 
 stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   Reduce Output Operator
 key expressions: _col9 (type: timestamp)

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: input_part2.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out



[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: groupby_ppr.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
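
 To check the plan for this query with vectorization on, a session could look roughly like the sketch below. It assumes that the `hive.vectorized.execution.enabled` session property is how vectorization is switched on in the build under test; the table and column names are the ones from the description above.

```sql
-- Assumption: vectorized execution is toggled per session with this property.
set hive.vectorized.execution.enabled=true;

-- Re-run the EXPLAIN from the description; ds is the int partition column.
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;
```

 Running the same EXPLAIN with the property set to false gives a baseline plan to compare against.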
   

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: bucketmapjoin_negative2.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: bucketmapjoin9.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: input23.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: join32_lessSize.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: groupby_sort_6.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: groupby_map_ppr_multi_distinct.q.out)


[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: input_part7.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because (select ds from alltypesorc_part) 
 t1 returns a string column and the join equality on t2 is performed on an int 
 column. The correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias - Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: join32.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


 drop table if exists alltypesorc_part;
 CREATE TABLE alltypesorc_part (
 ctinyint tinyint,
 csmallint smallint,
 cint int,
 cbigint bigint,
 cfloat float,
 cdouble double,
 cstring1 string,
 cstring2 string,
 ctimestamp1 timestamp,
 ctimestamp2 timestamp,
 cboolean1 boolean,
 cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
 insert overwrite table alltypesorc_part partition (ds=2011) select * from 
 alltypesorc limit 100;
 insert overwrite table alltypesorc_part partition (ds=2012) select * from 
 alltypesorc limit 200;
 explain select *
 from (select ds from alltypesorc_part) t1,
  alltypesorc t2
 where t1.ds = t2.cint
 order by t2.ctimestamp1
 limit 100;
 The above query fails to vectorize because the subquery (select ds from 
 alltypesorc_part) t1 returns ds as a string column, even though ds is declared 
 as int, while the join equality on t2 is performed on the int column cint. The 
 correct output when vectorization is turned on should be:
 STAGE DEPENDENCIES:
   Stage-5 is a root stage
   Stage-2 depends on stages: Stage-5
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-5
 Map Reduce Local Work
   Alias -> Map Local Tables:
 t1:alltypesorc_part
   Fetch Operator
 limit: -1
   Alias -> Map Local Operator Tree:
 t1:alltypesorc_part
   TableScan
 alias: alltypesorc_part
 Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Select Operator
   expressions: ds (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE 
 Column stats: COMPLETE
   HashTable Sink Operator
 condition expressions:
   0 {_col0}
   1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} 
 {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} 
 {cboolean2}
 keys:
   0 _col0 (type: int)
   1 cint (type: int)
   Stage: Stage-2
 Map Reduce
   Map Operator Tree:
   TableScan
 alias: t2
 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: 
 COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col0}
 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} 
 {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
   keys:
 0 _col0 (type: int)
 1 cint (type: int)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, 
 _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 3889 Data size: 1244882 Basic stats: 
 COMPLETE Column stats: NONE
   Filter Operator
 predicate: (_col0 = _col3) (type: boolean)
 Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
 Select Operator
   expressions: _col0 (type: int), _col1 (type: tinyint), 
 _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: 
 float), _col6 (type: double), _col7 (type: string), _col8 (type: string), 
 _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 
 (type: boolean)
   outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
 _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
   Statistics: Num rows: 1944 Data size: 622280 Basic stats: 
 COMPLETE Column stats: NONE
   Reduce 
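
 For reference, the expected plan above presumes that vectorization was switched 
 on in the session before running EXPLAIN. A minimal sketch of that step, 
 assuming the standard session property hive.vectorized.execution.enabled (the 
 report itself does not show the setting it used):

```sql
-- Assumption: vectorization is enabled per session through this property;
-- the original report does not show this step.
set hive.vectorized.execution.enabled=true;

-- Check the declared type of the partition column ds (declared as int above).
describe alltypesorc_part;

-- Re-run the query from the report; in the expected plan both join keys
-- (_col0, derived from ds, and cint) are typed int.
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;
```

 Comparing the type shown for ds in the Select Operator against the declared 
 type is one way to tell whether the partition column is still being treated as 
 a string.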

[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression

2014-03-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:


Attachment: (was: bucketmapjoin_negative.q.out)

 Query fails to vectorize when a non string partition column is part of the 
 query expression
 ---

 Key: HIVE-6642
 URL: https://issues.apache.org/jira/browse/HIVE-6642
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.13.0

 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, 
 HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, 
 metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, 
 ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, 
 rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, 
 sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, 
 sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, 
 transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out


