[jira] [Updated] (HIVE-4975) Reading orc file throws exception after adding new column
[ https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4975: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk and branch. Thanks [~kevinwilfong]! Reading orc file throws exception after adding new column - Key: HIVE-4975 URL: https://issues.apache.org/jira/browse/HIVE-4975 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0 Environment: hive 0.11.0 hadoop 1.0.0 Reporter: cyril liao Assignee: Kevin Wilfong Priority: Critical Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-4975.1.patch.txt, HIVE-4975.2.patch Reading an ORC file fails after a column is added to the table. Create a table with three columns (a string, b string, c string). Add a new column after c by executing ALTER TABLE table ADD COLUMNS (d string). Then execute the HiveQL query select d from table; the following exception is thrown: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 4 at org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206) at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128) at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371) at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236) at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) ] at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ArrayIndexOutOfBoundsException: 4 at org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206) at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128) at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371) at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236) at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) ] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating d
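The failure above arises when a row stored with the old three-column layout is asked for a field that only exists in the newer table schema. A minimal sketch of the defensive pattern, assuming a reader that treats missing trailing columns as NULL; the class and method names are hypothetical illustrations and are not taken from the HIVE-4975 patch or from OrcStruct:

```java
/** Hypothetical sketch: reading a row written before a column was added. */
final class SchemaEvolutionSketch {

    /**
     * Returns the stored value for fieldIndex, or null when the row was
     * written before that column existed (assumed NULL-for-missing policy).
     */
    static Object getFieldData(Object[] storedFields, int fieldIndex) {
        if (fieldIndex < 0) {
            throw new IllegalArgumentException("negative field index: " + fieldIndex);
        }
        if (fieldIndex >= storedFields.length) {
            // The file has fewer columns than the table schema: report NULL
            // instead of indexing past the end of the array.
            return null;
        }
        return storedFields[fieldIndex];
    }

    public static void main(String[] args) {
        Object[] oldRow = {"a1", "b1", "c1"};         // written with columns a, b, c
        System.out.println(getFieldData(oldRow, 2));  // prints c1
        System.out.println(getFieldData(oldRow, 3));  // prints null (column d)
    }
}
```

Without the bounds check, the second call would throw java.lang.ArrayIndexOutOfBoundsException, which is the same exception class reported in the trace.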
[jira] [Commented] (HIVE-6771) Update WebHCat E2E tests now that comments is reported correctly in describe table output
[ https://issues.apache.org/jira/browse/HIVE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950424#comment-13950424 ] Ashutosh Chauhan commented on HIVE-6771: +1 Update WebHCat E2E tests now that comments is reported correctly in describe table output --- Key: HIVE-6771 URL: https://issues.apache.org/jira/browse/HIVE-6771 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.13.0 Attachments: HIVE-6771.patch HIVE-6681 corrected the comments in the describe table output; earlier it would show "from deserializer" in the comments. Some WebHCat E2E tests still check for the string "from deserializer", which used to overshadow the actual comments. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6447) Bucket map joins in hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6447: - Resolution: Fixed Fix Version/s: 0.14.0 0.13.0 Status: Resolved (was: Patch Available) Bucket map joins in hive-tez Key: HIVE-6447 URL: https://issues.apache.org/jira/browse/HIVE-6447 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, HIVE-6447.WIP.patch Support bucket map joins in tez.
[jira] [Commented] (HIVE-6447) Bucket map joins in hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950427#comment-13950427 ] Vikram Dixit K commented on HIVE-6447: -- Committed to trunk and branch-0.13. Thanks for the reviews [~sseth], [~hagleitn], [~rhbutani]. Thanks [~alangates] [~thejas] for the test runs. Bucket map joins in hive-tez Key: HIVE-6447 URL: https://issues.apache.org/jira/browse/HIVE-6447 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6447.1.patch, HIVE-6447.10.patch, HIVE-6447.11.patch, HIVE-6447.12.patch, HIVE-6447.13.patch, HIVE-6447.2.patch, HIVE-6447.3.patch, HIVE-6447.4.patch, HIVE-6447.5.patch, HIVE-6447.6.patch, HIVE-6447.7.patch, HIVE-6447.8.patch, HIVE-6447.9.patch, HIVE-6447.WIP.patch Support bucket map joins in tez.
[jira] [Updated] (HIVE-6765) ASTNodeOrigin unserializable leads to fail when join with view
[ https://issues.apache.org/jira/browse/HIVE-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Wang updated HIVE-6765: -- Component/s: (was: Query Processor) ASTNodeOrigin unserializable leads to fail when join with view -- Key: HIVE-6765 URL: https://issues.apache.org/jira/browse/HIVE-6765 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Adrian Wang Attachments: HIVE-6765.patch.1 When a view contains a UDF and the view is used in a JOIN operation, Hive fails with a stack trace like: Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin at java.lang.Class.newInstance0(Class.java:359) at java.lang.Class.newInstance(Class.java:327) at sun.reflect.GeneratedMethodAccessor84.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616)
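For context on the trace above: Class.newInstance() throws java.lang.InstantiationException when the class has no no-argument constructor, which is a common way for reflection-based serialization to fail. Whether that is the exact cause in HIVE-6765 is an assumption here. The sketch below uses two hypothetical stand-in classes, not the real ASTNodeOrigin:

```java
/** Hypothetical sketch: reflective instantiation with and without a no-arg constructor. */
final class NewInstanceSketch {

    /** Stand-in class that only offers a parameterized constructor. */
    public static final class OriginWithoutDefaultCtor {
        final String name;
        public OriginWithoutDefaultCtor(String name) { this.name = name; }
    }

    /** Stand-in class that also offers a no-argument constructor. */
    public static final class OriginWithDefaultCtor {
        String name;
        public OriginWithDefaultCtor() { }
    }

    /** Returns true if the class can be created through Class.newInstance(). */
    @SuppressWarnings("deprecation") // Class.newInstance() is deprecated since Java 9
    static boolean canInstantiate(Class<?> type) {
        try {
            type.newInstance();
            return true;
        } catch (InstantiationException | IllegalAccessException e) {
            // No usable no-argument constructor.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(OriginWithoutDefaultCtor.class)); // prints false
        System.out.println(canInstantiate(OriginWithDefaultCtor.class));    // prints true
    }
}
```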
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950531#comment-13950531 ] Justin Coffey commented on HIVE-6757: - Owen, the solution you're proposing means that there is no seamless upgrade path for existing parquet-hive users, and that somewhere on the Hive wiki there will have to be a callout: "Attention existing parquet users: you must include the parquet-hive.jar when upgrading to Hive 13. We're sorry, but this is the price you have to pay for being an early adopter and driving functionality." One of the goals of the HIVE-5783 patch was to make the lives of parquet users easier (there were of course many other reasons, but ease of use is a good goal in and of itself). The classes as they are do no harm and it's hard to see how they pollute the code base of Hive in any significant way. This patch kinda sorta seems a tiny bit punitive if you ask me. Please don't take any of this the wrong way, but I believe this is what a fair chunk of the parquet-hive community might think if this patch is committed. Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace.
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950853#comment-13950853 ] Brock Noland commented on HIVE-6757: Great points Justin. Many folks in the Hive community want this code, which is not against any Apache or Hive policy. Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace.
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950911#comment-13950911 ] Owen O'Malley commented on HIVE-6757: - Justin, they already have a parquet-hive.jar that they've manually added to their installation. Giving them an upgraded jar that works with Hive 0.13 is a better answer than putting conflicting classes in Hive itself. In fact, the way that HIVE-5783 was done creates a significant chance that class conflicts will occur for users who have manually installed the parquet jars. I'm not trying to force reverting HIVE-5783 out of Hive 0.13, but leaving these classes in the parquet jars and not in Hive is a much better answer. Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace.
[jira] [Updated] (HIVE-6767) Golden file updates for hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6767: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to 0.13 trunk. Golden file updates for hadoop-2 Key: HIVE-6767 URL: https://issues.apache.org/jira/browse/HIVE-6767 Project: Hive Issue Type: Task Components: Tests Affects Versions: 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-6767.patch
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950921#comment-13950921 ] Xuefu Zhang commented on HIVE-6757: --- If removing the code helped Hive functionally or performance-wise, I might be convinced by the proposal to remove this small piece of code. Based on what we gain from the removal, it's hard to be convinced that it benefits anything at all, while it discourages some Hive/Parquet users who really care. As for most other Hive users, who cares about two extra classes they don't need to bother with? Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace.
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951057#comment-13951057 ] Eric Hanson commented on HIVE-6633: --- [~thejas] Can you commit this to 0.13 please? pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6633.01.patch This fails because the embedded metastore can't connect to the database: the command line -D arguments passed to pig are not passed on to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore.
[jira] [Created] (HIVE-6772) Virtual columns when used with Lateral View Explode results in SemanticException [Error 10004]
Steve Ogden created HIVE-6772: - Summary: Virtual columns when used with Lateral View Explode results in SemanticException [Error 10004] Key: HIVE-6772 URL: https://issues.apache.org/jira/browse/HIVE-6772 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Environment: Red Hat Enterprise Linux Server release 6.3 (Santiago) Hadoop 2.0.0-cdh4.1.2 Hive 0.9.0 Reporter: Steve Ogden Priority: Minor When using the virtual columns with 'lateral view explode', I get the following error: FAILED: SemanticException [Error 10004]: Line 3:22 Invalid table alias or column reference 'INPUT__FILE__NAME': (possible column names are: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22) Here is the query: select newMd5(concat(INPUT__FILE__NAME,BLOCK__OFFSET__INSIDE__FILE)) ukey, flat_ric_cd as ric_cd from edwpoc.ts_rtd_gs_stg lateral view explode(split(ric_cd,',')) subView as flat_ric_cd
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950996#comment-13950996 ] Justin Coffey commented on HIVE-6757: - I guess my point is simply that early adopters are penalized for life whereas new users get the full benefit of the patch. I agree that the penalty is pretty small, but the two classes kicking around in the parquet package are even less of a penalty to the hive code base. Thus I remain against pulling them out. Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace.
[jira] [Updated] (HIVE-6771) Update WebHCat E2E tests now that comments is reported correctly in describe table output
[ https://issues.apache.org/jira/browse/HIVE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6771: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to 0.13 trunk. Thanks, Deepesh! Update WebHCat E2E tests now that comments is reported correctly in describe table output --- Key: HIVE-6771 URL: https://issues.apache.org/jira/browse/HIVE-6771 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.13.0 Attachments: HIVE-6771.patch HIVE-6681 corrected the comments in the describe table output; earlier it would show "from deserializer" in the comments. Some WebHCat E2E tests still check for the string "from deserializer", which used to overshadow the actual comments.
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Open (was: Patch Available) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch Vectorized Between and IN expressions don't work with decimal, date types.
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Patch Available (was: Open) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch Vectorized Between and IN expressions don't work with decimal, date types.
Review Request 19789: HIVE-6739 Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19789/ --- Review request for hive, Gunther Hagleitner and Vikram Dixit Kumaraswamy. Repository: hive-git Description --- See jira Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java c247030 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java 720b8d5 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 5f0f353 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 5dd8f98 ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java fdbd996 ql/src/java/org/apache/hadoop/hive/ql/plan/BaseWork.java 38c4c11 ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java e1cc3f4 ql/src/java/org/apache/hadoop/hive/ql/plan/TezWork.java f974c57 ql/src/java/org/apache/hadoop/hive/ql/plan/UnionWork.java 60781e6 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 78f1a8f ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java d2c332c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java 5ad4250 ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 859b5ad Diff: https://reviews.apache.org/r/19789/diff/ Testing --- Thanks, Sergey Shelukhin
[jira] [Resolved] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-6758. --- Resolution: Cannot Reproduce Closed as unreproducible. Feel free to reopen it if a repro case can be provided. Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.12.0 Environment: CDH4.5 Reporter: Johndee Burks With the Hive CLI you could easily integrate it into a script and background the process like this: hive -e some query. Beeline does not run when you do the same, even with the -f switch.
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Attachment: HIVE-6752.3.patch Patch updated with a small bug fix identified in testing. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch Vectorized Between and IN expressions don't work with decimal, date types.
[jira] [Updated] (HIVE-6734) DDL locking too course grained in new db txn manager
[ https://issues.apache.org/jira/browse/HIVE-6734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6734: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to 0.13 trunk. DDL locking too course grained in new db txn manager Key: HIVE-6734 URL: https://issues.apache.org/jira/browse/HIVE-6734 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: HIVE-6734.patch, HIVE-6734.patch All DDL operations currently acquire an exclusive lock. This is too coarse-grained, as some operations, such as alter table add partition, shouldn't take an exclusive lock on the entire table.
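The idea of finer-grained DDL locking can be sketched as a mapping from operation to lock mode. The enum names and the particular mapping below are hypothetical and only illustrate the principle stated in the issue; they are not the lock types or rules used by the HIVE-6734 patch:

```java
/** Hypothetical sketch: choosing a lock mode per DDL operation instead of always EXCLUSIVE. */
final class DdlLockSketch {

    /** Illustrative lock modes (assumed, simplified). */
    enum LockMode { SHARED, EXCLUSIVE }

    /** Illustrative DDL operations. */
    enum DdlOperation { DROP_TABLE, ADD_PARTITION }

    /** Returns the table-level lock mode assumed for the given operation. */
    static LockMode tableLockFor(DdlOperation op) {
        switch (op) {
            case ADD_PARTITION:
                // Adding a partition does not disturb readers of existing
                // partitions, so a shared table lock is assumed to suffice.
                return LockMode.SHARED;
            default:
                // Anything else keeps the conservative exclusive lock.
                return LockMode.EXCLUSIVE;
        }
    }

    public static void main(String[] args) {
        System.out.println(tableLockFor(DdlOperation.ADD_PARTITION)); // prints SHARED
        System.out.println(tableLockFor(DdlOperation.DROP_TABLE));    // prints EXCLUSIVE
    }
}
```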
[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan
[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951150#comment-13951150 ] Selina Zhang commented on HIVE-6492: [~leftylev] Thanks for the reminder! We can put: "This controls how many partitions can be scanned for each partitioned table. The default value -1 means no limit." What do you think? limit partition number involved in a table scan --- Key: HIVE-6492 URL: https://issues.apache.org/jira/browse/HIVE-6492 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: Selina Zhang Assignee: Selina Zhang Fix For: 0.13.0 Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, HIVE-6492.3.patch.txt, HIVE-6492.4.patch.txt, HIVE-6492.4.patch_suggestion, HIVE-6492.5.patch.txt, HIVE-6492.6.patch.txt, HIVE-6492.7.parch.txt Original Estimate: 24h Remaining Estimate: 24h To protect the cluster, a new configuration variable, hive.limit.query.max.table.partition, is added to the Hive configuration to limit the number of table partitions involved in a table scan. The default value will be set to -1, which means there is no limit by default. This variable will not affect metadata-only queries.
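The semantics described for hive.limit.query.max.table.partition (a cap on scanned partitions, -1 meaning no limit, metadata-only queries exempt) can be sketched as a simple check. The class and method names are hypothetical, and treating the limit as inclusive is an assumption; this is not the code from the HIVE-6492 patch:

```java
/** Hypothetical sketch of a per-table partition scan limit. */
final class PartitionLimitSketch {

    /** Value described in the issue as "no limit". */
    static final int NO_LIMIT = -1;

    /**
     * Returns true if the scan may proceed.
     *
     * @param partitionsToScan number of partitions the scan would touch
     * @param maxPartitions    configured limit, or NO_LIMIT
     * @param metadataOnly     true for metadata-only queries, which are exempt
     */
    static boolean isScanAllowed(int partitionsToScan, int maxPartitions, boolean metadataOnly) {
        if (metadataOnly || maxPartitions == NO_LIMIT) {
            return true;
        }
        // Assumption: the limit is inclusive.
        return partitionsToScan <= maxPartitions;
    }

    public static void main(String[] args) {
        System.out.println(isScanAllowed(500, NO_LIMIT, false)); // prints true
        System.out.println(isScanAllowed(500, 100, false));      // prints false
        System.out.println(isScanAllowed(500, 100, true));       // prints true
    }
}
```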
[jira] [Commented] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.
[ https://issues.apache.org/jira/browse/HIVE-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951156#comment-13951156 ] Harish Butani commented on HIVE-6686: - +1 for 0.13 webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem. -- Key: HIVE-6686 URL: https://issues.apache.org/jira/browse/HIVE-6686 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-6686.patch
[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951162#comment-13951162 ] Sergey Shelukhin commented on HIVE-6188: Will also document a bunch of other settings in this JIRA. Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki.
[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6188: --- Fix Version/s: 0.13.0 Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki.
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951169#comment-13951169 ] Eric Hanson commented on HIVE-6633: --- [~rhbutani] Can you approve this to go into 0.13 please? pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6633.01.patch This fails because the embedded metastore can't connect to the database: the command line -D arguments passed to pig are not passed on to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore.
[jira] [Updated] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego
[ https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6697: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to 0.13 branch and trunk. Thanks for the contribution Dilli! HiveServer2 secure thrift/http authentication needs to support SPNego -- Key: HIVE-6697 URL: https://issues.apache.org/jira/browse/HIVE-6697 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Fix For: 0.13.0 Attachments: HIVE-6697.1.patch, HIVE-6697.2.patch, HIVE-6697.3.patch, HIVE-6697.4.patch, hive-6697-req-impl-verify.md Looking to integrate Apache Knox with HiveServer2 secure thrift/http. Found that thrift/http uses some form of Kerberos authentication that is not SPNego. Since it goes over the http protocol, I expected it to use the SPNego protocol. Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase Stargate using SPNego for authentication. Requesting that HiveServer2 secure thrift/http authentication support SPNego.
[jira] [Commented] (HIVE-5835) Null pointer exception in DeleteDelegator in templeton code
[ https://issues.apache.org/jira/browse/HIVE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951177#comment-13951177 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-5835: - The errors are not related to my change and I have locally verified it. cc-ing [~thejas] for reviewing this. Null pointer exception in DeleteDelegator in templeton code Key: HIVE-5835 URL: https://issues.apache.org/jira/browse/HIVE-5835 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.13.0 Attachments: HIVE-5835.1.patch The following NPE is possible with the current implementation: ERROR | 13 Nov 2013 08:01:04,292 | org.apache.hcatalog.templeton.CatchallExceptionMapper | java.lang.NullPointerException at org.apache.hcatalog.templeton.tool.JobState.getChildren(JobState.java:180) at org.apache.hcatalog.templeton.DeleteDelegator.run(DeleteDelegator.java:51) at org.apache.hcatalog.templeton.Server.deleteJobId(Server.java:849) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:382) at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:85) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117) at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47) at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111) at org.eclipse.jetty.server.Server.handle(Server.java:349) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:910) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45) at
[jira] [Updated] (HIVE-6710) Deadlocks seen in transaction handler using mysql
[ https://issues.apache.org/jira/browse/HIVE-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6710: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk 0.13. Thanks, Alan! Deadlocks seen in transaction handler using mysql - Key: HIVE-6710 URL: https://issues.apache.org/jira/browse/HIVE-6710 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: HIVE-6710.patch When multiple clients attempt to obtain locks a deadlock on the mysql database occasionally occurs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951190#comment-13951190 ] Harish Butani commented on HIVE-6633: - +1 for 0.13 pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6633.01.patch This fails because the embedded metastore can't connect to the database: the command line -D arguments passed to pig are not passed on to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951218#comment-13951218 ] Owen O'Malley commented on HIVE-6757: - The point is that these files are *CREATING* a new *PUBLIC* api for Hive. That API is starting deprecated. That just creates confusion and noise. The users already need to update their manually installed parquet jars. This is the time that imposes the *LEAST* disruption on the users of Apache Hive. If we release them then there is user confusion over duplicated classes. Hive users won't expect to see classes in parquet.* in the hive-exec jar. *THAT* will create brand new user confusion. Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6188: --- Attachment: HIVE-6188.patch Doc patch Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6188: --- Status: Patch Available (was: Open) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary
[ https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951257#comment-13951257 ] Thejas M Nair commented on HIVE-6738: - Comments on the patch: - I think it is better to log these messages at debug level instead of info, as they are logged for every request. - If the proxy user is already set in SessionManager through the URL, I think we can skip the check in sessionconf. HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary Key: HIVE-6738 URL: https://issues.apache.org/jira/browse/HIVE-6738 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Attachments: HIVE-6738.patch, hive-6738-req-impl-verify-rev1.md, hive-6738-req-impl-verify.md See the already implemented JIRA https://issues.apache.org/jira/browse/HIVE-5155 (Support secure proxy user access to HiveServer2). That fix expects the hive.server2.proxy.user parameter to come in the Thrift body. When an intermediary gateway like Apache Knox authenticates the end client and then proxies the request to HiveServer2, it is not practical for the intermediary to modify the Thrift content. An intermediary like Apache Knox should be able to assert doAs in a query parameter. This paradigm is already established by other Hadoop ecosystem components like WebHDFS, WebHCat, Oozie and HBase, and Hive needs to be aligned with them. The doAs asserted in the query parameter should override any doAs specified in the Thrift body. -- This message was sent by Atlassian JIRA (v6.2#6252)
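The precedence rule requested in HIVE-6738 (a doAs asserted in the query parameter wins over a proxy user sent in the Thrift body) could be sketched roughly as follows. This is only an illustration, not the code from HIVE-6738.patch: the class and method names are hypothetical, and the query parameter name "doAs" is assumed from the issue text; only hive.server2.proxy.user comes from the description itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HiveServer2 code: illustrates the requested precedence only.
final class ProxyUserResolver {
    // Assumed query parameter name, taken from the wording of the issue.
    static final String DOAS_QUERY_PARAM = "doAs";
    // Session configuration key named in the issue description.
    static final String PROXY_USER_CONF = "hive.server2.proxy.user";

    // Returns the user to impersonate, or null when no proxy user was requested.
    static String resolveProxyUser(Map<String, String> queryParams,
                                   Map<String, String> sessionConf) {
        String fromQuery = queryParams.get(DOAS_QUERY_PARAM);
        if (fromQuery != null && !fromQuery.isEmpty()) {
            // Asserted by the intermediary in the URL: skip the session conf check.
            return fromQuery;
        }
        return sessionConf.get(PROXY_USER_CONF);
    }

    public static void main(String[] args) {
        Map<String, String> query = new HashMap<>();
        Map<String, String> conf = new HashMap<>();
        conf.put(PROXY_USER_CONF, "bob");
        System.out.println(resolveProxyUser(query, conf)); // only the Thrift body value is set
        query.put(DOAS_QUERY_PARAM, "alice");
        System.out.println(resolveProxyUser(query, conf)); // query parameter overrides
    }
}
```

Returning early when the query parameter is present also matches the review comment that the sessionconf check can be skipped once the proxy user is set through the URL.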
[jira] [Created] (HIVE-6773) Update readme for ptest2 framework
Szehon Ho created HIVE-6773: --- Summary: Update readme for ptest2 framework Key: HIVE-6773 URL: https://issues.apache.org/jira/browse/HIVE-6773 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-6773.patch Approvals dependency is needed for testing. Need to add instructions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary
[ https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951261#comment-13951261 ] Dilli Arumugam commented on HIVE-6738: -- Thanks, Thejas, for the review. I will revise the code to accommodate both comments and then attach a new patch. HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary Key: HIVE-6738 URL: https://issues.apache.org/jira/browse/HIVE-6738 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Attachments: HIVE-6738.patch, hive-6738-req-impl-verify-rev1.md, hive-6738-req-impl-verify.md See the already implemented JIRA https://issues.apache.org/jira/browse/HIVE-5155 (Support secure proxy user access to HiveServer2). That fix expects the hive.server2.proxy.user parameter to come in the Thrift body. When an intermediary gateway like Apache Knox authenticates the end client and then proxies the request to HiveServer2, it is not practical for the intermediary to modify the Thrift content. An intermediary like Apache Knox should be able to assert doAs in a query parameter. This paradigm is already established by other Hadoop ecosystem components like WebHDFS, WebHCat, Oozie and HBase, and Hive needs to be aligned with them. The doAs asserted in the query parameter should override any doAs specified in the Thrift body. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework
[ https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6773: Description: Approvals dependency is needed for testing. Need to add instructions. NO PRECOMMIT TESTS was:Approvals dependency is needed for testing. Need to add instructions. Update readme for ptest2 framework -- Key: HIVE-6773 URL: https://issues.apache.org/jira/browse/HIVE-6773 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-6773.patch Approvals dependency is needed for testing. Need to add instructions. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework
[ https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6773: Attachment: HIVE-6773.patch Update readme for ptest2 framework -- Key: HIVE-6773 URL: https://issues.apache.org/jira/browse/HIVE-6773 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-6773.patch Approvals dependency is needed for testing. Need to add instructions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6773) Update readme for ptest2 framework
[ https://issues.apache.org/jira/browse/HIVE-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6773: Status: Patch Available (was: Open) Hi [~brocknoland], adding some missing instruction as we discussed. Update readme for ptest2 framework -- Key: HIVE-6773 URL: https://issues.apache.org/jira/browse/HIVE-6773 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-6773.patch Approvals dependency is needed for testing. Need to add instructions. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego
[ https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951299#comment-13951299 ] Dilli Arumugam commented on HIVE-6697: -- [~thejas] Thanks for committing the patch. HiveServer2 secure thrift/http authentication needs to support SPNego -- Key: HIVE-6697 URL: https://issues.apache.org/jira/browse/HIVE-6697 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Fix For: 0.13.0 Attachments: HIVE-6697.1.patch, HIVE-6697.2.patch, HIVE-6697.3.patch, HIVE-6697.4.patch, hive-6697-req-impl-verify.md We are looking to integrate Apache Knox with HiveServer2 secure thrift/http. We found that thrift/http uses some form of Kerberos authentication that is not SPNego. Since it goes over the HTTP protocol, we expected it to use the SPNego protocol. Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase Stargate using SPNego for authentication. Requesting that HiveServer2 secure thrift/http authentication support SPNego. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6763) HiveServer2 in http mode might send same kerberos client ticket in case of concurrent requests resulting in server throwing a replay exception
[ https://issues.apache.org/jira/browse/HIVE-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951306#comment-13951306 ] Vaibhav Gumashta commented on HIVE-6763: [~alangates] Thanks a lot for running the tests. The issue is unrelated but I need to incorporate some feedback. HiveServer2 in http mode might send same kerberos client ticket in case of concurrent requests resulting in server throwing a replay exception -- Key: HIVE-6763 URL: https://issues.apache.org/jira/browse/HIVE-6763 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6763.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951310#comment-13951310 ] Hive QA commented on HIVE-6752: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12637458/HIVE-6752.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5499 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testBetweenFilters {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2017/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2017/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12637458 Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-6642: Attachment: merge3.q.out louter_join_ppr.q.out load_dyn_part8.q.out join_map_ppr.q.out join26.q.out join32_lessSize.q.out join33.q.out join9.q.out join32.q.out input42.q.out input_part1.q.out input_part2.q.out input_part7.q.out input_part9.q.out input23.q.out groupby_sort_6.q.out groupby_ppr.q.out groupby_map_ppr_multi_distinct.q.out groupby_map_ppr_multi_distinct.q.out groupby_map_ppr.q.out filter_join_breaktask.q.out columnstats_partlvl.q.out bucketmapjoin8.q.out bucketmapjoin9.q.out bucketmapjoin_negative.q.out bucketmapjoin_negative2.q.out bucket3.q.out auto_sortmerge_join_2.q.out auto_sortmerge_join_3.q.out annotate_stats_part.q.out Query fails to vectorize when a non string partition column is part of the query expression --- Key: HIVE-6642 URL: https://issues.apache.org/jira/browse/HIVE-6642 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.13.0 Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, annotate_stats_part.q.out, auto_sortmerge_join_2.q.out, auto_sortmerge_join_3.q.out, bucket3.q.out, bucketmapjoin8.q.out, bucketmapjoin9.q.out, bucketmapjoin_negative.q.out, bucketmapjoin_negative2.q.out, columnstats_partlvl.q.out, filter_join_breaktask.q.out, groupby_map_ppr.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_ppr.q.out, groupby_sort_6.q.out, input23.q.out, input42.q.out, input_part1.q.out, input_part2.q.out, input_part7.q.out, input_part9.q.out, join26.q.out, join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out drop table if exists alltypesorc_part; CREATE TABLE alltypesorc_part ( ctinyint tinyint, csmallint smallint, cint int, cbigint 
bigint, cfloat float, cdouble double, cstring1 string, cstring2 string, ctimestamp1 timestamp, ctimestamp2 timestamp, cboolean1 boolean, cboolean2 boolean) partitioned by (ds int) STORED AS ORC; insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100; insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200; explain select * from (select ds from alltypesorc_part) t1, alltypesorc t2 where t1.ds = t2.cint order by t2.ctimestamp1 limit 100; The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column. The correct output when vectorization is turned on should be: STAGE DEPENDENCIES: Stage-5 is a root stage Stage-2 depends on stages: Stage-5 Stage-0 is a root stage STAGE PLANS: Stage: Stage-5 Map Reduce Local Work Alias - Map Local Tables: t1:alltypesorc_part Fetch Operator limit: -1 Alias - Map Local Operator Tree: t1:alltypesorc_part TableScan alias: alltypesorc_part Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ds (type: int) outputColumnNames: _col0 Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE HashTable Sink Operator condition expressions: 0 {_col0} 1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2} keys: 0 _col0 (type: int) 1 cint (type: int) Stage: Stage-2 Map Reduce Map Operator Tree: TableScan alias: t2 Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col0}
[jira] [Created] (HIVE-6774) Not a valid JAR errors from TestExecDriver
Jason Dere created HIVE-6774: Summary: Not a valid JAR errors from TestExecDriver Key: HIVE-6774 URL: https://issues.apache.org/jira/browse/HIVE-6774 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere If I wipe out my local Maven repository and run the command: mvn clean install -Dtest=TestExecDriver -Phadoop-1 All of the TestExecDriver tests fail with the following errors: {noformat} Not a valid JAR: /Users/jdere/.m2/repository/org/apache/hive/hive-exec/0.14.0-SNAPSHOT/hive-exec-0.14.0-SNAPSHOT.jar Execution failed with exit status: 255 Obtaining error information Task failed! Task ID: null Logs: /Users/jdere/dev/hive.git/ql/target/tmp/log/hive.log java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.SessionState.addLocalMapRedErrors(SessionState.java:919) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:282) at org.apache.hadoop.hive.ql.exec.TestExecDriver.executePlan(TestExecDriver.java:460) at org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapPlan1(TestExecDriver.java:474) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at junit.framework.TestSuite.run(TestSuite.java:238) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6774) Not a valid JAR errors from TestExecDriver
[ https://issues.apache.org/jira/browse/HIVE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951324#comment-13951324 ] Jason Dere commented on HIVE-6774: -- Looks like the test is relying on the hive-exec JAR being installed to the local maven repository, and it's not there yet during the maven test phase. [~brocknoland], would it be more appropriate for TestExecDriver to be added to itests? Or is it expected to run mvn clean install -DskipTests before actually running any tests? Not a valid JAR errors from TestExecDriver Key: HIVE-6774 URL: https://issues.apache.org/jira/browse/HIVE-6774 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere If I wipe out my local Maven repository and run the command: mvn clean install -Dtest=TestExecDriver -Phadoop-1 All of the TestExecDriver tests fail with the following errors: {noformat} Not a valid JAR: /Users/jdere/.m2/repository/org/apache/hive/hive-exec/0.14.0-SNAPSHOT/hive-exec-0.14.0-SNAPSHOT.jar Execution failed with exit status: 255 Obtaining error information Task failed! 
Task ID: null Logs: /Users/jdere/dev/hive.git/ql/target/tmp/log/hive.log java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.SessionState.addLocalMapRedErrors(SessionState.java:919) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:282) at org.apache.hadoop.hive.ql.exec.TestExecDriver.executePlan(TestExecDriver.java:460) at org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapPlan1(TestExecDriver.java:474) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at junit.framework.TestSuite.run(TestSuite.java:238) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6774) Not a valid JAR errors from TestExecDriver
[ https://issues.apache.org/jira/browse/HIVE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951326#comment-13951326 ] Brock Noland commented on HIVE-6774: You must run mvn install before running any tests, not run the tests as part of the install. Not a valid JAR errors from TestExecDriver Key: HIVE-6774 URL: https://issues.apache.org/jira/browse/HIVE-6774 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere If I wipe out my local Maven repository and run the command: mvn clean install -Dtest=TestExecDriver -Phadoop-1 All of the TestExecDriver tests fail with the following errors: {noformat} Not a valid JAR: /Users/jdere/.m2/repository/org/apache/hive/hive-exec/0.14.0-SNAPSHOT/hive-exec-0.14.0-SNAPSHOT.jar Execution failed with exit status: 255 Obtaining error information Task failed! Task ID: null Logs: /Users/jdere/dev/hive.git/ql/target/tmp/log/hive.log java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.SessionState.addLocalMapRedErrors(SessionState.java:919) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:282) at org.apache.hadoop.hive.ql.exec.TestExecDriver.executePlan(TestExecDriver.java:460) at org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapPlan1(TestExecDriver.java:474) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at 
junit.framework.TestSuite.run(TestSuite.java:238) at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6314) The logging (progress reporting) is too verbose
[ https://issues.apache.org/jira/browse/HIVE-6314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6314: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk and 0.13. Thanks, Navis! The logging (progress reporting) is too verbose --- Key: HIVE-6314 URL: https://issues.apache.org/jira/browse/HIVE-6314 Project: Hive Issue Type: Bug Reporter: Sam Assignee: Navis Labels: logger Attachments: HIVE-6314.1.patch.txt, HIVE-6314.2.patch The progress report is issued every second even when no progress has been made: {code} 2014-01-27 10:35:55,209 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec 2014-01-27 10:35:56,678 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec 2014-01-27 10:35:59,344 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec 2014-01-27 10:36:01,268 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 8.67 sec 2014-01-27 10:36:03,149 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 8.67 sec {code} This pollutes the logs and the screen, and people do not appreciate it as much as the designers might have thought ([http://stackoverflow.com/questions/20849289/how-do-i-limit-log-verbosity-of-hive], [http://stackoverflow.com/questions/14121543/controlling-the-level-of-verbosity-in-hive]). It would be nice to be able to control the level of verbosity (but *not* by the {{-v}} switch!): # Make sure that the progress report is only issued when there is something new to report; or # Remove all the progress messages; or # Make sure that progress is reported only every X sec (instead of every 1 second) -- This message was sent by Atlassian JIRA (v6.2#6252)
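Two of the options listed in HIVE-6314 (report only when something changed, and otherwise report at most every X seconds) can be combined in a small throttle. The sketch below is an assumption-laden illustration and not the committed patch: the class name, the heartbeat parameter, and the idea of passing the clock in explicitly are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hive code: emits a progress line only when it
// differs from the last emitted one, or when a heartbeat interval has elapsed.
final class ThrottledProgressReporter {
    private final long heartbeatMillis;
    private final List<String> emitted = new ArrayList<>();
    private String lastProgress;   // null until the first report
    private long lastEmitMillis;

    ThrottledProgressReporter(long heartbeatMillis) {
        this.heartbeatMillis = heartbeatMillis;
    }

    // Returns true when the line was emitted, false when it was suppressed.
    boolean report(String progress, long nowMillis) {
        boolean changed = !progress.equals(lastProgress);
        boolean heartbeatDue = nowMillis - lastEmitMillis >= heartbeatMillis;
        if (!changed && !heartbeatDue) {
            return false;
        }
        lastProgress = progress;
        lastEmitMillis = nowMillis;
        emitted.add(progress);
        return true;
    }

    List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ThrottledProgressReporter reporter = new ThrottledProgressReporter(60_000L);
        // Progress values from the log excerpt in the issue, sampled once per second here.
        String[] samples = {
            "Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec",
            "Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec",
            "Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.68 sec",
            "Stage-1 map = 100%, reduce = 100%, Cumulative CPU 8.67 sec",
            "Stage-1 map = 100%, reduce = 100%, Cumulative CPU 8.67 sec",
        };
        long now = 0L;
        for (String sample : samples) {
            reporter.report(sample, now);
            now += 1_000L;
        }
        // Only the first line and the line where reduce reaches 100% are printed.
        for (String line : reporter.emitted()) {
            System.out.println(line);
        }
    }
}
```

Keeping a heartbeat means a long-running stage still shows signs of life, while repeated identical lines within the interval are dropped.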
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951347#comment-13951347 ] Anthony Hsu commented on HIVE-6570: --- What concerns does [~appodictic] have? Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
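For readers unfamiliar with the expected behaviour in HIVE-6570, the sketch below shows what substituting a hivevar reference into the argument of a source command would produce. It is a standalone illustration using a regular expression, not Hive's own substitution code and not the attached patch; the class name and the choice to leave unknown variables untouched are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Hive code: expands ${hivevar:name} references in a command string.
final class HivevarSubstitution {
    private static final Pattern HIVEVAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    static String substitute(String command, Map<String, String> hivevars) {
        Matcher matcher = HIVEVAR.matcher(command);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = hivevars.get(matcher.group(1));
            // Assumption: an undefined variable is left as written rather than removed.
            String replacement = (value != null) ? value : matcher.group(0);
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("test-dir", "/tmp/queries"); // illustrative value only
        System.out.println(substitute("source ${hivevar:test-dir}/test.q;", vars));
    }
}
```

With the illustrative value above, the command from the report would expand to a source command pointing at /tmp/queries/test.q.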
[jira] [Commented] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951356#comment-13951356 ] Johndee Burks commented on HIVE-6758: - The process will stay stopped until it is brought to the foreground. Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.12.0 Environment: CDH4.5 Reporter: Johndee Burks In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951354#comment-13951354 ] Johndee Burks commented on HIVE-6758: - [~xuefuz] It is as you say: when you run it without backgrounding, it works. But if I read your attempt correctly, you did not background the command with '&' at the end. This works: {code} [root@jrepo1-1 ~]# beeline -u jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n johndee -e show tables; scan complete in 10ms Connecting to jdbc:hive2://jrepo1-2.ent.cloudera.com:1 Connected to: Hive (version 0.10.0) Driver: Hive (version 0.9.0-cdh4.1.2) Transaction isolation: TRANSACTION_REPEATABLE_READ +---+ | tab_name | +---+ | j1| +---+ 1 row selected (0.499 seconds) Hive version 0.9.0-cdh4.1.2 by Apache Closing: org.apache.hive.jdbc.HiveConnection {code} This does not: {code} [root@jrepo1-1 ~]# beeline -u jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n johndee -e show tables; [1] 32040 [root@jrepo1-1 ~]# [1]+ Stopped beeline -u jdbc:hive2://jrepo1-2.ent.cloudera.com:1 -n johndee -e show tables; {code} Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.12.0 Environment: CDH4.5 Reporter: Johndee Burks In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19801: review request for HIVE-6738, HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19801/ --- Review request for hive, Thejas Nair and Vaibhav Gumashta. Bugs: HIVE-6738 https://issues.apache.org/jira/browse/HIVE-6738 Repository: hive-git Description --- see the jira HIVE-6738 Diffs - service/src/java/org/apache/hive/service/cli/session/SessionManager.java 7f6687e service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 58f3e3b service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java c579db5 Diff: https://reviews.apache.org/r/19801/diff/ Testing --- see the attachment to Jira HIVE-6738 https://issues.apache.org/jira/secure/attachment/12637059/hive-6738-req-impl-verify.md Thanks, dilli dorai
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Attachment: HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Open (was: Patch Available) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951396#comment-13951396 ] Jitendra Nath Pandey commented on HIVE-6752: Latest patch fixes the test. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6752: --- Status: Patch Available (was: Open) Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Status: Open (was: Patch Available) Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Attachment: HIVE-6662.2.patch Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6662) Vector Join operations with DATE columns fail
[ https://issues.apache.org/jira/browse/HIVE-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6662: --- Status: Patch Available (was: Open) Submitting same patch again for pre-commit tests. Vector Join operations with DATE columns fail - Key: HIVE-6662 URL: https://issues.apache.org/jira/browse/HIVE-6662 Project: Hive Issue Type: Bug Reporter: Gopal V Assignee: Gopal V Fix For: 0.13.0 Attachments: HIVE-6662.1.patch, HIVE-6662.2.patch, HIVE-6662.2.patch Trying to generate a DATE column as part of a JOIN's output throws an exception {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible Long vector column and primitive category DATE at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildObjectAssign(VectorColumnAssignFactory.java:306) at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory.buildAssigners(VectorColumnAssignFactory.java:414) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:235) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:229) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:292) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6726) Hcat cli does not close SessionState
[ https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951433#comment-13951433 ] Sushanth Sowmyan commented on HIVE-6726: [~thejas]/[~hagleitn], can I bother either of you for a review of this? Hcat cli does not close SessionState Key: HIVE-6726 URL: https://issues.apache.org/jira/browse/HIVE-6726 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6726.patch When running HCat E2E tests, it was observed that the hcat cli left Tez sessions on the RM, which ultimately die upon timeout. The expected behavior is to clean up the Tez sessions immediately upon exit. This is causing slowness in system tests because, over time, a lot of orphaned Tez sessions hang around. Looking through the code, the cause seems obvious in retrospect: HCatCli starts a SessionState but does not explicitly call close on it, exiting the JVM through System.exit instead. This needs to be changed to explicitly call SessionState.close() before exiting. -- This message was sent by Atlassian JIRA (v6.2#6252)
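The change described in HIVE-6726 amounts to "close the session, then exit". The sketch below shows that ordering in a language-neutral way; it is not the HCatCli patch. The names `exit_cleanly` and `exit_fn` are hypothetical, and `session` is any object with a `close()` method standing in for SessionState.

```python
import sys
from typing import Callable


def exit_cleanly(session, exit_code: int,
                 exit_fn: Callable[[int], None] = sys.exit) -> None:
    """Release session resources before terminating the process.

    The try/finally ensures the process still exits with the intended
    code even if close() itself raises.
    """
    try:
        session.close()  # e.g. lets remote sessions be torn down right away
    finally:
        exit_fn(exit_code)
```

Every exit path in a CLI would then go through one such helper instead of calling the exit routine directly, so no path can skip the cleanup.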
[jira] [Commented] (HIVE-5677) Beeline warns about unavailable files if HIVE_OPTS is set
[ https://issues.apache.org/jira/browse/HIVE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951436#comment-13951436 ] Sushanth Sowmyan commented on HIVE-5677: This patch works for me, +1. [~xuefuz], I'm afraid I don't know about the remote debugging aspect, how do you normally do that? Beeline warns about unavailable files if HIVE_OPTS is set - Key: HIVE-5677 URL: https://issues.apache.org/jira/browse/HIVE-5677 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Navis Attachments: HIVE-5677.1.patch.txt NO PRECOMMIT TESTS This is similar to HIVE-5085. Beeline complains about files not existing if HIVE_OPTS are set. In the Beeline commandline sh as well, we should see if setting HIVE_OPTS to '' makes sense. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6633: --- Fix Version/s: 0.13.0 pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6633.01.patch This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
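The behaviour HIVE-6633 asks for can be pictured as follows: collect the `-Dkey=value` arguments from the command line and overlay them on the configuration handed to the embedded metastore. This is a sketch under assumptions, not the attached patch; the function names `parse_define_args`, `build_metastore_conf` and `uses_embedded_metastore` are hypothetical.

```python
from typing import Dict, List


def parse_define_args(argv: List[str]) -> Dict[str, str]:
    """Collect -Dkey=value arguments; other arguments are ignored."""
    props: Dict[str, str] = {}
    for arg in argv:
        if arg.startswith("-D") and "=" in arg:
            key, value = arg[2:].split("=", 1)
            props[key] = value
    return props


def build_metastore_conf(defaults: Dict[str, str],
                         argv: List[str]) -> Dict[str, str]:
    """Overlay command-line properties on the default configuration."""
    conf = dict(defaults)
    conf.update(parse_define_args(argv))
    return conf


def uses_embedded_metastore(conf: Dict[str, str]) -> bool:
    """Per the issue text, an empty hive.metastore.uris means embedded."""
    return conf.get("hive.metastore.uris", "") == ""
```

For the command line quoted in the issue, the resulting configuration would carry both the empty `hive.metastore.uris` and the `javax.jdo.option.ConnectionPassword` value through to the embedded metastore.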
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951449#comment-13951449 ] Sushanth Sowmyan commented on HIVE-6633: Committed to 0.13. Thanks Eric and Harish! pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6633.01.patch This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HIVE-6758: -- Environment: (was: CDH4.5) Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.12.0 Reporter: Johndee Burks In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HIVE-6758: -- Affects Version/s: (was: 0.12.0) Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0 Reporter: Johndee Burks In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 28, 2014, 9:56 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types. Diffs (updated) - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6738) HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary
[ https://issues.apache.org/jira/browse/HIVE-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951470#comment-13951470 ] Thejas M Nair commented on HIVE-6738: - +1 HiveServer2 secure Thrift/HTTP needs to accept doAs parameter from proxying intermediary Key: HIVE-6738 URL: https://issues.apache.org/jira/browse/HIVE-6738 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Attachments: HIVE-6738.1.patch, HIVE-6738.patch, hive-6738-req-impl-verify-rev1.md, hive-6738-req-impl-verify.md See already implemented JIra https://issues.apache.org/jira/browse/HIVE-5155 Support secure proxy user access to HiveServer2 That fix expects the hive.server2.proxy.user parameter to come in Thrift body. When an intermediary gateway like Apache Knox is authenticating the end client and then proxying the request to HiveServer2, it is not practical for the intermediary like Apache Knox to modify thrift content. Intermediary like Apache Knox should be able to assert doAs in a query parameter. This paradigm is already established by other Hadoop ecosystem components like WebHDFS, WebHCat, Oozie and HBase and Hive needs to be aligned with them. The doAs asserted in query parameter should override if doAs specified in Thrift body. -- This message was sent by Atlassian JIRA (v6.2#6252)
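The precedence rule stated in HIVE-6738 (a `doAs` query parameter overrides the proxy user carried in the Thrift body) can be sketched as below. This is not the HiveServer2 implementation: the function name `effective_proxy_user` and the example URL are assumptions, and the Thrift-body configuration is represented as a plain dictionary.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def effective_proxy_user(request_url: str,
                         thrift_conf: Dict[str, str]) -> Optional[str]:
    """Pick the user to impersonate for one request.

    A non-empty doAs query parameter wins; otherwise fall back to the
    hive.server2.proxy.user value from the Thrift body, if any.
    """
    query = parse_qs(urlsplit(request_url).query)  # blank values are dropped
    do_as = query.get("doAs", [])
    if do_as:
        return do_as[0]
    return thrift_conf.get("hive.server2.proxy.user")
```

With a hypothetical gateway URL such as `http://hs2.example.com:10001/cliservice?doAs=alice` and a Thrift body that names `bob`, this returns `alice`, so an intermediary like Apache Knox never has to rewrite the Thrift content.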
[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951480#comment-13951480 ] Hive QA commented on HIVE-6188: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12637478/HIVE-6188.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5498 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2018/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2018/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12637478 Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951481#comment-13951481 ] Sergey Shelukhin commented on HIVE-6752: +1 on most recent changes Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951483#comment-13951483 ] Jitendra Nath Pandey commented on HIVE-6188: +1 Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6188) Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl
[ https://issues.apache.org/jira/browse/HIVE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951488#comment-13951488 ] Sergey Shelukhin commented on HIVE-6188: Since it's a doc patch I will just commit later today Document hive.metastore.try.direct.sql hive.metastore.try.direct.sql.ddl -- Key: HIVE-6188 URL: https://issues.apache.org/jira/browse/HIVE-6188 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Lefty Leverenz Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6188.patch The hive.metastore.try.direct.sql and hive.metastore.try.direct.sql.ddl configuration properties need to be documented in hive-default.xml.template and the wiki. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs
[ https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951486#comment-13951486 ] Sushanth Sowmyan commented on HIVE-6592: Looks good to me from reading up man curl : {noformat} -k/--insecure (SSL) This option explicitly allows curl to perform insecure SSL connections and transfers. All SSL connections are attempted to be made secure by using the CA certificate bundle installed by default. This makes all connections considered insecure fail unless -k/--insecure is used. See this online resource for further details: http://curl.haxx.se/docs/sslcerts.html {noformat} +1. WebHCat E2E test abort when pointing to https url of webhdfs Key: HIVE-6592 URL: https://issues.apache.org/jira/browse/HIVE-6592 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.13.0 Attachments: HIVE-6592.patch WebHCat E2E tests when running against a ssl enabled webhdfs url fails. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6758) Beeline only works in interactive mode
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-6758: - Assignee: Xuefu Zhang Beeline only works in interactive mode -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0 Reporter: Johndee Burks Assignee: Xuefu Zhang In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs
[ https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6592: --- Fix Version/s: (was: 0.13.0) 0.14.0 WebHCat E2E test abort when pointing to https url of webhdfs Key: HIVE-6592 URL: https://issues.apache.org/jira/browse/HIVE-6592 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-6592.patch WebHCat E2E tests when running against a ssl enabled webhdfs url fails. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs
[ https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951492#comment-13951492 ] Sushanth Sowmyan commented on HIVE-6592: Committed to trunk, Thanks, Deepesh! Setting the fix version to 0.14. [~rhbutani], Deepesh would like to get this included in 0.13 as well. I think it makes sense for inclusion, since it's needed to allow our E2E tests to run in a secure environment. Could we backport this? WebHCat E2E test abort when pointing to https url of webhdfs Key: HIVE-6592 URL: https://issues.apache.org/jira/browse/HIVE-6592 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-6592.patch WebHCat E2E tests when running against a ssl enabled webhdfs url fails. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6758) Beeline doesn't work with -e option when started in background
[ https://issues.apache.org/jira/browse/HIVE-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6758: -- Summary: Beeline doesn't work with -e option when started in background (was: Beeline only works in interactive mode) Beeline doesn't work with -e option when started in background -- Key: HIVE-6758 URL: https://issues.apache.org/jira/browse/HIVE-6758 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0 Reporter: Johndee Burks Assignee: Xuefu Zhang In hive CLI you could easily integrate its use into a script and back ground the process like this: hive -e some query Beeline does not run when you do the same even with the -f switch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs
[ https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6592: --- Resolution: Fixed Status: Resolved (was: Patch Available) WebHCat E2E test abort when pointing to https url of webhdfs Key: HIVE-6592 URL: https://issues.apache.org/jira/browse/HIVE-6592 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-6592.patch WebHCat E2E tests when running against a ssl enabled webhdfs url fails. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package
[ https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951499#comment-13951499 ] Harish Butani commented on HIVE-6757: - Hi Justin, Brock, A couple of questions/thoughts: 1. What if we include the parquet-hive.jar in the hive-exec shaded jar? Does this mitigate the upgrade issues for existing users? 2. If they choose to, how will existing users migrate to the new classes? Do we provide metadata upgrade scripts? Do we have to support their existing SQL code, e.g. by adding checks in the Hive parsing layer to replace old parquet class references with new classes? So the migration process when we remove (now or in the future) the deprecated classes is not clear. Can you please help me understand how this will play out? Remove deprecated parquet classes from outside of org.apache package Key: HIVE-6757 URL: https://issues.apache.org/jira/browse/HIVE-6757 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6757.patch, parquet-hive.patch Apache shouldn't release projects with files outside of the org.apache namespace. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6592) WebHCat E2E test abort when pointing to https url of webhdfs
[ https://issues.apache.org/jira/browse/HIVE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951517#comment-13951517 ] Harish Butani commented on HIVE-6592: - +1 for 0.13 WebHCat E2E test abort when pointing to https url of webhdfs Key: HIVE-6592 URL: https://issues.apache.org/jira/browse/HIVE-6592 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-6592.patch WebHCat E2E tests when running against a ssl enabled webhdfs url fails. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5677) Beeline warns about unavailable files if HIVE_OPTS is set
[ https://issues.apache.org/jira/browse/HIVE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951523#comment-13951523 ]

Xuefu Zhang commented on HIVE-5677:
-----------------------------------

[~sushanth] I was talking about the debug options that are usually set in HADOOP_OPTS. In the hive script, HIVE_OPTS takes what HADOOP_OPTS sets, so debugging can be enabled. Nevertheless, this patch is no longer necessary after HIVE-6652. Please try it and let me know if the problem remains.

Beeline warns about unavailable files if HIVE_OPTS is set
    Key: HIVE-5677
    URL: https://issues.apache.org/jira/browse/HIVE-5677
    Project: Hive
    Issue Type: Bug
    Components: CLI
    Affects Versions: 0.12.0
    Reporter: Sushanth Sowmyan
    Assignee: Navis
    Attachments: HIVE-5677.1.patch.txt

NO PRECOMMIT TESTS

This is similar to HIVE-5085. Beeline complains about files not existing if HIVE_OPTS is set. In the Beeline command-line sh script as well, we should see whether setting HIVE_OPTS to '' makes sense.
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: annotate_stats_part.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, bucket3.q.out, bucketmapjoin8.q.out, bucketmapjoin9.q.out, bucketmapjoin_negative.q.out, bucketmapjoin_negative2.q.out, columnstats_partlvl.q.out, filter_join_breaktask.q.out, groupby_map_ppr.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_ppr.q.out, groupby_sort_6.q.out, input23.q.out, input42.q.out, input_part1.q.out, input_part2.q.out, input_part7.q.out, input_part9.q.out, join26.q.out, join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out

drop table if exists alltypesorc_part;
CREATE TABLE alltypesorc_part (
  ctinyint tinyint,
  csmallint smallint,
  cint int,
  cbigint bigint,
  cfloat float,
  cdouble double,
  cstring1 string,
  cstring2 string,
  ctimestamp1 timestamp,
  ctimestamp2 timestamp,
  cboolean1 boolean,
  cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100;
insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200;
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;

The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column.

The correct output when vectorization is turned on should be:

STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-2 depends on stages: Stage-5
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1:alltypesorc_part
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1:alltypesorc_part
          TableScan
            alias: alltypesorc_part
            Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: ds (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                condition expressions:
                  0 {_col0}
                  1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
                keys:
                  0 _col0 (type: int)
                  1 cint (type: int)

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
              keys:
                0 _col0 (type: int)
                1 cint (type: int)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
              Statistics: Num rows: 3889 Data size: 1244882 Basic stats: COMPLETE Column stats: NONE
              Filter Operator
                predicate: (_col0 = _col3) (type: boolean)
                Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: int),
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: auto_sortmerge_join_2.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, bucket3.q.out, bucketmapjoin8.q.out, bucketmapjoin9.q.out, bucketmapjoin_negative.q.out, bucketmapjoin_negative2.q.out, columnstats_partlvl.q.out, filter_join_breaktask.q.out, groupby_map_ppr.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_map_ppr_multi_distinct.q.out, groupby_ppr.q.out, groupby_sort_6.q.out, input23.q.out, input42.q.out, input_part1.q.out, input_part2.q.out, input_part7.q.out, input_part9.q.out, join26.q.out, join32.q.out, join32_lessSize.q.out, join33.q.out, join9.q.out, join_map_ppr.q.out, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: bucket3.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out

drop table if exists alltypesorc_part;
CREATE TABLE alltypesorc_part (
  ctinyint tinyint,
  csmallint smallint,
  cint int,
  cbigint bigint,
  cfloat float,
  cdouble double,
  cstring1 string,
  cstring2 string,
  ctimestamp1 timestamp,
  ctimestamp2 timestamp,
  cboolean1 boolean,
  cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100;
insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200;
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;

The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column.

The correct output when vectorization is turned on should be:

STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-2 depends on stages: Stage-5
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1:alltypesorc_part
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1:alltypesorc_part
          TableScan
            alias: alltypesorc_part
            Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: ds (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                condition expressions:
                  0 {_col0}
                  1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
                keys:
                  0 _col0 (type: int)
                  1 cint (type: int)

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
              keys:
                0 _col0 (type: int)
                1 cint (type: int)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
              Statistics: Num rows: 3889 Data size: 1244882 Basic stats: COMPLETE Column stats: NONE
              Filter Operator
                predicate: (_col0 = _col3) (type: boolean)
                Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: tinyint), _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: float), _col6 (type: double), _col7 (type: string), _col8 (type: string), _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 (type: boolean)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
                  Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: join_map_ppr.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: columnstats_partlvl.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: input_part9.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
---------------------------------------------------
    Attachment: (was: bucketmapjoin8.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: groupby_map_ppr_multi_distinct.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
---

Key: HIVE-6642
URL: https://issues.apache.org/jira/browse/HIVE-6642
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Fix For: 0.13.0
Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out

drop table if exists alltypesorc_part;
CREATE TABLE alltypesorc_part (
  ctinyint tinyint,
  csmallint smallint,
  cint int,
  cbigint bigint,
  cfloat float,
  cdouble double,
  cstring1 string,
  cstring2 string,
  ctimestamp1 timestamp,
  ctimestamp2 timestamp,
  cboolean1 boolean,
  cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100;
insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200;
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;

The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column.

The correct output when vectorization is turned on should be:

STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-2 depends on stages: Stage-5
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1:alltypesorc_part
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1:alltypesorc_part
          TableScan
            alias: alltypesorc_part
            Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: ds (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                condition expressions:
                  0 {_col0}
                  1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
                keys:
                  0 _col0 (type: int)
                  1 cint (type: int)

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
              keys:
                0 _col0 (type: int)
                1 cint (type: int)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
              Statistics: Num rows: 3889 Data size: 1244882 Basic stats: COMPLETE Column stats: NONE
              Filter Operator
                predicate: (_col0 = _col3) (type: boolean)
                Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: tinyint), _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: float), _col6 (type: double), _col7 (type: string), _col8 (type: string), _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 (type: boolean)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
                  Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
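
The report does not show how vectorization was switched on before running the EXPLAIN. A minimal sketch of the session setup follows, assuming the standard hive.vectorized.execution.enabled property (available from Hive 0.13) and the alltypesorc_part table created in the report. The property setting is an assumption added for illustration; the query itself is taken unchanged from the description.

```sql
-- Assumption: vectorized execution is controlled by this session property;
-- the original report does not state how it was enabled.
set hive.vectorized.execution.enabled=true;

-- Query from the report, unchanged apart from line breaks.
explain
select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;
```

Running the same EXPLAIN with the property set to false gives a baseline plan to compare against. In Hive releases that annotate the plan, a vectorized map stage is typically marked with a line such as "Execution mode: vectorized".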
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: filter_join_breaktask.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: groupby_map_ppr.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: outer_join_ppr.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
---

Key: HIVE-6642
URL: https://issues.apache.org/jira/browse/HIVE-6642
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Fix For: 0.13.0
Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out

drop table if exists alltypesorc_part;
CREATE TABLE alltypesorc_part (
  ctinyint tinyint,
  csmallint smallint,
  cint int,
  cbigint bigint,
  cfloat float,
  cdouble double,
  cstring1 string,
  cstring2 string,
  ctimestamp1 timestamp,
  ctimestamp2 timestamp,
  cboolean1 boolean,
  cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100;
insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200;
explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;

The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column.

The correct output when vectorization is turned on should be:

STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-2 depends on stages: Stage-5
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1:alltypesorc_part
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1:alltypesorc_part
          TableScan
            alias: alltypesorc_part
            Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: ds (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                condition expressions:
                  0 {_col0}
                  1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
                keys:
                  0 _col0 (type: int)
                  1 cint (type: int)

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
              keys:
                0 _col0 (type: int)
                1 cint (type: int)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
              Statistics: Num rows: 3889 Data size: 1244882 Basic stats: COMPLETE Column stats: NONE
              Filter Operator
                predicate: (_col0 = _col3) (type: boolean)
                Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: tinyint), _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: float), _col6 (type: double), _col7 (type: string), _col8 (type: string), _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 (type: boolean)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
                  Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: _col9 (type: timestamp)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: input_part2.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: groupby_ppr.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: bucketmapjoin_negative2.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:

    Attachment: (was: bucketmapjoin9.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: input23.q.out)

Query fails to vectorize when a non string partition column is part of the query expression
---
    Key: HIVE-6642
    URL: https://issues.apache.org/jira/browse/HIVE-6642
    Project: Hive
    Issue Type: Bug
    Reporter: Hari Sankar Sivarama Subramaniyan
    Assignee: Hari Sankar Sivarama Subramaniyan
    Fix For: 0.13.0
    Attachments: HIVE-6642-2.patch, HIVE-6642-3.patch, HIVE-6642-4.patch, HIVE-6642.1.patch, load_dyn_part8.q.out, louter_join_ppr.q.out, merge3.q.out, metadataonly1.q.out, outer_join_ppr.q.out, pcr.q.out, ppd_vc.q.out, ppr_allchildsarenull.q.out, push_or.q.out, rand_partitionpruner2.q.out, rand_partitionpruner3.q.out, router_join_ppr.q.out, sample1.q.out, sample10.q.out, sample8.q.out, smb_mapjoin_11.q.out, sort_merge_join_desc_5.q.out, stats12.q.out, stats13.q.out, transform_ppr1.q.out, transform_ppr2.q.out, union_ppr.q.out

drop table if exists alltypesorc_part;
CREATE TABLE alltypesorc_part (
  ctinyint tinyint,
  csmallint smallint,
  cint int,
  cbigint bigint,
  cfloat float,
  cdouble double,
  cstring1 string,
  cstring2 string,
  ctimestamp1 timestamp,
  ctimestamp2 timestamp,
  cboolean1 boolean,
  cboolean2 boolean) partitioned by (ds int) STORED AS ORC;
insert overwrite table alltypesorc_part partition (ds=2011) select * from alltypesorc limit 100;
insert overwrite table alltypesorc_part partition (ds=2012) select * from alltypesorc limit 200;

explain select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;

The above query fails to vectorize because (select ds from alltypesorc_part) t1 returns a string column and the join equality on t2 is performed on an int column.
The correct output when vectorization is turned on should be:

STAGE DEPENDENCIES:
  Stage-5 is a root stage
  Stage-2 depends on stages: Stage-5
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1:alltypesorc_part
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1:alltypesorc_part
          TableScan
            alias: alltypesorc_part
            Statistics: Num rows: 300 Data size: 62328 Basic stats: COMPLETE Column stats: COMPLETE
            Select Operator
              expressions: ds (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 300 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
              HashTable Sink Operator
                condition expressions:
                  0 {_col0}
                  1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
                keys:
                  0 _col0 (type: int)
                  1 cint (type: int)

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 3536 Data size: 1131711 Basic stats: COMPLETE Column stats: NONE
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1 {ctinyint} {csmallint} {cint} {cbigint} {cfloat} {cdouble} {cstring1} {cstring2} {ctimestamp1} {ctimestamp2} {cboolean1} {cboolean2}
              keys:
                0 _col0 (type: int)
                1 cint (type: int)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
              Statistics: Num rows: 3889 Data size: 1244882 Basic stats: COMPLETE Column stats: NONE
              Filter Operator
                predicate: (_col0 = _col3) (type: boolean)
                Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: int), _col1 (type: tinyint), _col2 (type: smallint), _col3 (type: int), _col4 (type: bigint), _col5 (type: float), _col6 (type: double), _col7 (type: string), _col8 (type: string), _col9 (type: timestamp), _col10 (type: timestamp), _col11 (type: boolean), _col12 (type: boolean)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
                  Statistics: Num rows: 1944 Data size: 622280 Basic stats: COMPLETE Column stats: NONE
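
For anyone reproducing this, a minimal session sketch follows. It assumes the stock alltypesorc test table is already loaded, that alltypesorc_part was created and populated as in the description, and that vectorization is switched on per session through the hive.vectorized.execution.enabled property. How a vectorized stage is marked in the EXPLAIN output depends on the Hive build in use and is not stated in this issue.

```sql
-- Assumption: vectorized execution is enabled for the session with this property.
set hive.vectorized.execution.enabled = true;

-- Same query as in the issue description: the join key on the t1 side is the
-- int partition column ds, compared against the int column cint of t2.
explain
select *
from (select ds from alltypesorc_part) t1,
     alltypesorc t2
where t1.ds = t2.cint
order by t2.ctimestamp1
limit 100;
```

Comparing the plan printed with the property set to true against the plan printed with it set to false is one way to check whether the map join stage was vectorized.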
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: join32_lessSize.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: groupby_sort_6.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: groupby_map_ppr_multi_distinct.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: input_part7.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: join32.q.out)
[jira] [Updated] (HIVE-6642) Query fails to vectorize when a non string partition column is part of the query expression
[ https://issues.apache.org/jira/browse/HIVE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6642:
    Attachment: (was: bucketmapjoin_negative.q.out)