[jira] [Commented] (HIVE-5939) HCatalog hadoop-2 execution environment needs to be addressed.
[ https://issues.apache.org/jira/browse/HIVE-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838557#comment-13838557 ] Vikram Dixit K commented on HIVE-5939: -- Ah! yes. I missed HIVE-5897 thinking of HIVE-5894. Some of the required changes are there. Do you want to add some of these there or should I provide an incremental patch here considering that HIVE-5897 is ready to go in? I don't think HIVE-5897 is sufficient by itself. It needs the changes in this jira as well. Let me know. Thanks Vikram. HCatalog hadoop-2 execution environment needs to be addressed. -- Key: HIVE-5939 URL: https://issues.apache.org/jira/browse/HIVE-5939 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5939.1.patch Similar to HIVE-5755, we need to fix hcatalog's build to work with hadoop-2. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.3.patch Hi Brock, I have made the changes you suggested. The implication is that one needs to compile hive with either the -Phadoop-1 or the -Phadoop-2 profile. Although all of the shims do get compiled against their respective versions of hadoop, the profile is needed by some classes such as HiveConf that depend on a version of hadoop for the Configuration class. The itests and qtest targets require the profiles, as is desirable. Please take a look and let me know. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.4.patch Addressed Brock's comments. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.4.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-5755) Fix hadoop2 execution environment Milestone 1
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K reassigned HIVE-5755: Assignee: Vikram Dixit K (was: Brock Noland) Fix hadoop2 execution environment Milestone 1 - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Vikram Dixit K Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.4.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment Milestone 1
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833154#comment-13833154 ] Vikram Dixit K commented on HIVE-5755: -- When I set the patch available status, the build/ptest2 framework will need to change to use the -P flags. Shall I raise another jira for that? Thanks Vikram. Fix hadoop2 execution environment Milestone 1 - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Vikram Dixit K Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.4.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833194#comment-13833194 ] Vikram Dixit K commented on HIVE-5827: -- Bump [~brocknoland] [~navis] Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment Milestone 1
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833193#comment-13833193 ] Vikram Dixit K commented on HIVE-5755: -- Hmm.. I did not see anything with regard to adding the profile flags on that jira. Am I missing something? Thanks Vikram. Fix hadoop2 execution environment Milestone 1 - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Vikram Dixit K Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.4.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment Milestone 1
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833236#comment-13833236 ] Vikram Dixit K commented on HIVE-5755: -- Ah! I see. The mavenArgs would need to be updated with the -P flag for this patch to work. Thanks! Fix hadoop2 execution environment Milestone 1 - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Vikram Dixit K Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.3.patch, HIVE-5755.4.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5884) Mini tez cluster does not work after merging latest changes
Vikram Dixit K created HIVE-5884: Summary: Mini tez cluster does not work after merging latest changes Key: HIVE-5884 URL: https://issues.apache.org/jira/browse/HIVE-5884 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K After merging the maven changes from trunk to the tez branch, the mini tez tests do not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5884) Mini tez cluster does not work after merging latest changes
[ https://issues.apache.org/jira/browse/HIVE-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5884: - Status: Patch Available (was: Open) Mini tez cluster does not work after merging latest changes --- Key: HIVE-5884 URL: https://issues.apache.org/jira/browse/HIVE-5884 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5884.1.patch After merging the maven changes from trunk to the tez branch, the mini tez tests do not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5884) Mini tez cluster does not work after merging latest changes
[ https://issues.apache.org/jira/browse/HIVE-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5884: - Attachment: HIVE-5884.1.patch This patch requires the user to build hive using the -Phadoop-1 or -Phadoop-2 flags. The tez tests need to be run with the -Phadoop-2 flags as before. Mini tez cluster does not work after merging latest changes --- Key: HIVE-5884 URL: https://issues.apache.org/jira/browse/HIVE-5884 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5884.1.patch After merging the maven changes from trunk to the tez branch, the mini tez tests do not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Status: Patch Available (was: Open) Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Attachment: HIVE-5827.2.patch Re-upload. Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Status: Open (was: Patch Available) CI hasn't picked the previous patch. Will re-upload. Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch, HIVE-5827.2.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5862) While running some queries on large data using tez, we OOM.
Vikram Dixit K created HIVE-5862: Summary: While running some queries on large data using tez, we OOM. Key: HIVE-5862 URL: https://issues.apache.org/jira/browse/HIVE-5862 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Running out of memory while running map joins in tez on large data sets. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5862) While running some queries on large data using tez, we OOM.
[ https://issues.apache.org/jira/browse/HIVE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5862: - Attachment: HIVE-5862.1.patch Fixes a couple of the leaks found. While running some queries on large data using tez, we OOM. --- Key: HIVE-5862 URL: https://issues.apache.org/jira/browse/HIVE-5862 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5862.1.patch Running out of memory while running map joins in tez on large data sets. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.2.patch Hi [~brocknoland], I did some more tweaking of the maven flags and also developed a plan for how the dependencies should look. For the most part, things look right. Given that we package all the shims and choose one depending on the hadoop version that is available on the classpath, the dependencies within shims and the dependencies on the shims in other modules look right. The qtest profiles also include the right jars. However, the issue seems to be with the transitive dependencies being pulled in from the hive-it-util. Once I changed the hadoop and hbase dependencies in the hive-it-util target to optional, we get the behavior we expect. The profile flags seem to be taking effect in the right way now. Not sure what exactly changed but I did clear my .m2 cache a few times. Attaching a patch for reference. Please take a look and let me know what you think. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.1.patch Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5827) Incorrect location of logs for failed tests.
Vikram Dixit K created HIVE-5827: Summary: Incorrect location of logs for failed tests. Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5837.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Attachment: HIVE-5837.1.patch Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5837.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Status: Patch Available (was: Open) Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5837.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Attachment: HIVE-5827.1.patch Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5827: - Attachment: (was: HIVE-5837.1.patch) Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5827) Incorrect location of logs for failed tests.
[ https://issues.apache.org/jira/browse/HIVE-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822930#comment-13822930 ] Vikram Dixit K commented on HIVE-5827: -- https://reviews.apache.org/r/15543/ Incorrect location of logs for failed tests. - Key: HIVE-5827 URL: https://issues.apache.org/jira/browse/HIVE-5827 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5827.1.patch Extending HIVE-5790 to fix other tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822962#comment-13822962 ] Vikram Dixit K commented on HIVE-5755: -- Hi Brock, The issue here seems to be caused by the activeByDefault in the hadoop-1 maven profile, which includes the hadoop-1 jars and shim jars in the test classpath even when -Phadoop-2 is used. I read the documentation on maven and it feels like what is done here is right, but when I remove those and use -Phadoop-2 (along with some changes to the way shims are included), I am able to see hadoop-2 tests being run correctly. I think that might be a maven bug. Also, I looked into the hbase maven build as well, since they have hadoop-1 and hadoop-2 specific builds. Their approach is to use a -Dhadoop.profile on the maven command line to choose a particular profile. Their default is hadoop-1 as well, controlled by the absence of the -Dhadoop.profile flag. Let me know your thoughts on this approach. I can quickly provide a patch for this. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
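For reference, the HBase-style selection described in the comment above could look roughly like the POM fragment below. This is a sketch under assumptions, not taken from the Hive or HBase pom.xml: the profile ids, the `hadoop.profile` value `2`, and the elided profile bodies are illustrative only. It relies on Maven's property-based activation, where a `!` prefix on the property name activates a profile when that property is absent.

```xml
<profiles>
  <!-- Default profile: active only when -Dhadoop.profile is NOT set -->
  <profile>
    <id>hadoop-1</id>
    <activation>
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <!-- hadoop-1 dependencies would be declared here -->
  </profile>

  <!-- Selected with: mvn ... -Dhadoop.profile=2 (value is illustrative) -->
  <profile>
    <id>hadoop-2</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>2</value>
      </property>
    </activation>
    <!-- hadoop-2 dependencies would be declared here -->
  </profile>
</profiles>
```

One possible reason this behaves differently from activeByDefault: Maven deactivates an activeByDefault profile only when another profile in the same POM is activated, whereas a command-line property is visible to every module in the build. Whether that accounts for the behavior reported here is not confirmed in the thread.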
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822965#comment-13822965 ] Vikram Dixit K commented on HIVE-5755: -- HIVE-5749 seems like a duplicate of this. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823006#comment-13823006 ] Vikram Dixit K commented on HIVE-5755: -- Yeah. As I said, according to that it should have been deactivated but I think this might be a maven bug. I will upload a patch which works without a default profile. Maybe there is something else influencing things. But, I wasn't able to get the tests running correctly without removing the default profile flags. I can clean it up if you think this is the approach we should go with. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.try.patch Will need to build and run hive with -Phadoop-1/-Phadoop-2 to run the specific tests. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823022#comment-13823022 ] Vikram Dixit K commented on HIVE-5755: -- I am still learning about maven though. I was just trying some things out. Please let me know if you find something. It would greatly help improve my understanding as well. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823046#comment-13823046 ] Vikram Dixit K commented on HIVE-5755: -- So the effect is essentially that shims optionally depend on hadoop, but ql, which depends on shims, would not have hadoop in its classpath while compiling. Wouldn't this cause a compilation failure because, e.g., the Configuration class would no longer be available in the classpath? Sorry if I am missing something. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823070#comment-13823070 ] Vikram Dixit K commented on HIVE-5755: -- Ah! I see. I will take that up. Raised HIVE-5828 for the same. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5828) Make shims dependency on specific hadoop hive shims optional
Vikram Dixit K created HIVE-5828: Summary: Make shims dependency on specific hadoop hive shims optional Key: HIVE-5828 URL: https://issues.apache.org/jira/browse/HIVE-5828 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K The issue now is that hive-shims depends on hive-shims-0.20, hive-shims-0.20S, and hive-shims-0.23. ql depends on hive-shims. When ql brings in hive-shims, it brings in all of its transitive dependencies, which include three different versions of hadoop. hive-shims should not bring any dependencies with it, because we expect the end-user module to bring its own hadoop version. One way to do that is to mark all the hive-shims-* dependencies in hive-shims optional. -- This message was sent by Atlassian JIRA (v6.1#6144)
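The approach proposed in this issue can be sketched as two POM fragments. This is a minimal illustration, not the actual patch: the `org.apache.hive.shims` groupId, the `${hadoop2.version}` property, and the use of `hadoop-common` as the consumer's Hadoop artifact are assumptions. The first fragment marks one shim dependency as optional inside hive-shims; the second shows a consuming module such as ql declaring its own Hadoop dependency under a profile, which is how classes like Configuration would still reach its compile classpath.

```xml
<!-- In the hive-shims pom.xml: an optional dependency is available to
     hive-shims itself but is not passed on transitively to modules
     that depend on hive-shims. -->
<dependency>
  <groupId>org.apache.hive.shims</groupId> <!-- assumed groupId -->
  <artifactId>hive-shims-0.23</artifactId>
  <version>${project.version}</version>
  <optional>true</optional>
</dependency>

<!-- In a consuming module (for example ql): the module brings its own
     Hadoop version, selected by the active profile. -->
<profile>
  <id>hadoop-2</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop2.version}</version> <!-- hypothetical property -->
    </dependency>
  </dependencies>
</profile>
```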
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Patch Available (was: Open) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch, HIVE-5685.4.patch, HIVE-5685.5.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive> create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive> alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive> create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive> alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive> alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Attachment: HIVE-5685.5.patch Re-uploading patch for HiveQA again. partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch, HIVE-5685.4.patch, HIVE-5685.5.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive> create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive> alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive> create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive> alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive> alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Open (was: Patch Available) HiveQA has not picked up the previously uploaded patch. Will re-upload. partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch, HIVE-5685.4.patch, HIVE-5685.5.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive> create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive> alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive> create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive> alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive> alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Patch Available (was: Open) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch, HIVE-5685.4.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive> create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive> alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive> create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive> alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive> alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Attachment: HIVE-5685.4.patch Re-uploading patch for Hive QA. partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch, HIVE-5685.4.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive> create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive> alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive> create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive> alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive> alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816225#comment-13816225 ] Vikram Dixit K commented on HIVE-5685: -- I am able to successfully build hive with this patch. I am not sure if something is amiss in the HiveQA build environment. [~brocknoland] any input would be appreciated. Thanks Vikram. partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Patch Available (was: Open) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Attachment: HIVE-5685.3.patch partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Open (was: Patch Available) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch, HIVE-5685.3.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Patch Available (was: Open) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Attachment: HIVE-5685.2.patch Refreshed. partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch, HIVE-5685.2.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5703) While using tez, Qtest needs to close session before creating a new one
[ https://issues.apache.org/jira/browse/HIVE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5703: - Attachment: HIVE-5703.2.patch While using tez, Qtest needs to close session before creating a new one --- Key: HIVE-5703 URL: https://issues.apache.org/jira/browse/HIVE-5703 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5703.1.patch, HIVE-5703.2.patch While using the mini tez cluster, if we do not close the session, its containers are not freed, which locks up resources and causes hive to time out. We need to ensure that the existing session is cleaned up before a new one is launched in the Qtest framework. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5688) TestCliDriver compilation fails on tez branch.
[ https://issues.apache.org/jira/browse/HIVE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5688: - Attachment: HIVE-5688.2.patch Updated to address comments. valueOf cannot be overridden for enums, so I implemented a new method for the same purpose. TestCliDriver compilation fails on tez branch. -- Key: HIVE-5688 URL: https://issues.apache.org/jira/browse/HIVE-5688 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5688.1.patch, HIVE-5688.2.patch On the tez branch, the test cli driver tests fail to compile after HIVE-5543. -- This message was sent by Atlassian JIRA (v6.1#6144)
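As background to the comment above: in Java the static valueOf(String) of an enum is generated by the compiler, so an enum cannot declare its own method with that signature. A common workaround is a separately named lookup method. The sketch below shows that pattern only; the enum, its constants and the method name fromName are hypothetical and are not taken from the patch.

```java
import java.util.Locale;

// Hypothetical sketch of a custom lookup on an enum. The names here are
// illustrative and are not the ones used in HIVE-5688.
public class EnumLookupSketch {

    public enum ClusterType {
        MR, TEZ;

        // valueOf(String) is compiler-generated and cannot be redefined,
        // so a differently named static method does the lenient lookup.
        public static ClusterType fromName(String name) {
            if (name == null) {
                return MR; // assumed default for this sketch
            }
            String key = name.trim().toLowerCase(Locale.ROOT);
            if (key.equals("tez")) {
                return TEZ;
            }
            return MR;
        }
    }

    public static void main(String[] args) {
        System.out.println(ClusterType.fromName("tez"));
    }
}
```

Unlike valueOf, such a method can accept lower-case or unknown input and fall back to a default instead of throwing IllegalArgumentException.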
[jira] [Updated] (HIVE-5703) While using tez, Qtest needs to close session before creating a new one
[ https://issues.apache.org/jira/browse/HIVE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5703: - Attachment: HIVE-5703.3.patch While using tez, Qtest needs to close session before creating a new one --- Key: HIVE-5703 URL: https://issues.apache.org/jira/browse/HIVE-5703 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5703.1.patch, HIVE-5703.2.patch, HIVE-5703.3.patch While using the mini tez cluster, if we do not close the session, its containers are not freed, which locks up resources and causes hive to time out. We need to ensure that the existing session is cleaned up before a new one is launched in the Qtest framework. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5703) While using tez, Qtest needs to close session before creating a new one
Vikram Dixit K created HIVE-5703: Summary: While using tez, Qtest needs to close session before creating a new one Key: HIVE-5703 URL: https://issues.apache.org/jira/browse/HIVE-5703 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K While using the mini tez cluster, if we do not close the session, its containers are not freed, which locks up resources and causes hive to time out. We need to ensure that the existing session is cleaned up before a new one is launched in the Qtest framework. -- This message was sent by Atlassian JIRA (v6.1#6144)
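The clean-up rule described in this issue (close the current session before a new one is launched) can be sketched roughly as follows. This is a generic illustration and not the Qtest or Tez session code; the Session and SessionHolderSketch classes are hypothetical stand-ins.

```java
// Hypothetical sketch: a holder that never keeps two sessions open at once.
public class SessionHolderSketch {

    // Stand-in for a session that owns cluster resources (assumed type).
    public static class Session implements AutoCloseable {
        private boolean open = true;

        public boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false; // a real session would release its containers here
        }
    }

    private Session current;

    // Closes the previous session, if any, before creating the next one,
    // so resources held by the old session are released first.
    public Session startNew() {
        if (current != null) {
            current.close();
        }
        current = new Session();
        return current;
    }

    public static void main(String[] args) {
        SessionHolderSketch holder = new SessionHolderSketch();
        Session first = holder.startNew();
        Session second = holder.startNew();
        System.out.println(first.isOpen() + " " + second.isOpen());
    }
}
```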
[jira] [Updated] (HIVE-5703) While using tez, Qtest needs to close session before creating a new one
[ https://issues.apache.org/jira/browse/HIVE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5703: - Attachment: HIVE-5703.1.patch While using tez, Qtest needs to close session before creating a new one --- Key: HIVE-5703 URL: https://issues.apache.org/jira/browse/HIVE-5703 Project: Hive Issue Type: Bug Components: Testing Infrastructure, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5703.1.patch While using the mini tez cluster, if we do not close the session, its containers are not freed, which locks up resources and causes hive to time out. We need to ensure that the existing session is cleaned up before a new one is launched in the Qtest framework. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5543) Running the mini tez cluster for tez unit tests
[ https://issues.apache.org/jira/browse/HIVE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5543: - Attachment: HIVE-5543.3.patch Updated to address review comments. Running the mini tez cluster for tez unit tests --- Key: HIVE-5543 URL: https://issues.apache.org/jira/browse/HIVE-5543 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5543.1.patch, HIVE-5543.2.patch, HIVE-5543.3.patch In order to simulate the tez execution in hive tests, we need to work with MiniTezCluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808592#comment-13808592 ] Vikram Dixit K commented on HIVE-5685: -- https://reviews.apache.org/r/15069/ partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Attachment: HIVE-5685.1.patch partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5685) partition column type validation doesn't work in some cases
[ https://issues.apache.org/jira/browse/HIVE-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5685: - Status: Patch Available (was: Open) partition column type validation doesn't work in some cases --- Key: HIVE-5685 URL: https://issues.apache.org/jira/browse/HIVE-5685 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Vikram Dixit K Attachments: HIVE-5685.1.patch It seems like it works if there's more than one partition column, and doesn't work if there's just one. At least that's the case that I found. The situation for different types is the same. {noformat} hive create table zzz(c string) partitioned by (i int); OK Time taken: 0.41 seconds hive alter table zzz add partition (i='foo'); OK Time taken: 0.185 seconds hive create table (c string) partitioned by (i int,j int); OK Time taken: 0.085 seconds hive alter table add partition (i='foo',j=5); FAILED: SemanticException [Error 10248]: Cannot add partition column i of type string as it cannot be converted to type int hive alter table add partition (i=5,j='foo'); FAILED: SemanticException [Error 10248]: Cannot add partition column j of type string as it cannot be converted to type int {noformat} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5688) TestCliDriver compilation fails on tez branch.
Vikram Dixit K created HIVE-5688: Summary: TestCliDriver compilation fails on tez branch. Key: HIVE-5688 URL: https://issues.apache.org/jira/browse/HIVE-5688 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K On the tez branch, the test cli driver tests fail to compile after HIVE-5543. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5688) TestCliDriver compilation fails on tez branch.
[ https://issues.apache.org/jira/browse/HIVE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5688: - Attachment: HIVE-5688.1.patch TestCliDriver compilation fails on tez branch. -- Key: HIVE-5688 URL: https://issues.apache.org/jira/browse/HIVE-5688 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5688.1.patch On the tez branch, the test cli driver tests fail to compile after HIVE-5543. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5645) Cannot compile tests on tez branch
Vikram Dixit K created HIVE-5645: Summary: Cannot compile tests on tez branch Key: HIVE-5645 URL: https://issues.apache.org/jira/browse/HIVE-5645 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5646) Cannot compile tests on tez branch
Vikram Dixit K created HIVE-5646: Summary: Cannot compile tests on tez branch Key: HIVE-5646 URL: https://issues.apache.org/jira/browse/HIVE-5646 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Orc tests do not compile on the latest tez branch. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5646) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5646: - Assignee: (was: Vikram Dixit K) Cannot compile tests on tez branch -- Key: HIVE-5646 URL: https://issues.apache.org/jira/browse/HIVE-5646 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Orc tests do not compile on the latest tez branch. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5645) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5645: - Assignee: (was: Vikram Dixit K) Cannot compile tests on tez branch -- Key: HIVE-5645 URL: https://issues.apache.org/jira/browse/HIVE-5645 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5646) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-5646. -- Resolution: Duplicate Cannot compile tests on tez branch -- Key: HIVE-5646 URL: https://issues.apache.org/jira/browse/HIVE-5646 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Orc tests do not compile on the latest tez branch. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-5645) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K reassigned HIVE-5645: Assignee: Vikram Dixit K Cannot compile tests on tez branch -- Key: HIVE-5645 URL: https://issues.apache.org/jira/browse/HIVE-5645 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5645) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5645: - Attachment: HIVE-5645.1.patch Cannot compile tests on tez branch -- Key: HIVE-5645 URL: https://issues.apache.org/jira/browse/HIVE-5645 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5645.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5645) Cannot compile tests on tez branch
[ https://issues.apache.org/jira/browse/HIVE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5645: - Status: Patch Available (was: Open) Cannot compile tests on tez branch -- Key: HIVE-5645 URL: https://issues.apache.org/jira/browse/HIVE-5645 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5645.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5647) Fix failing mapreduce tests on the tez branch
Vikram Dixit K created HIVE-5647: Summary: Fix failing mapreduce tests on the tez branch Key: HIVE-5647 URL: https://issues.apache.org/jira/browse/HIVE-5647 Project: Hive Issue Type: Bug Components: Tests, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: tez-branch Quite a few mapreduce tests are failing on the tez branch. This bug addresses those. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5647) Fix failing mapreduce tests on the tez branch
[ https://issues.apache.org/jira/browse/HIVE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5647: - Attachment: HIVE-5647.1.patch This patch addresses several tests and the results look clean. Fix failing mapreduce tests on the tez branch - Key: HIVE-5647 URL: https://issues.apache.org/jira/browse/HIVE-5647 Project: Hive Issue Type: Bug Components: Tests, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-5647.1.patch Quite a few mapreduce tests are failing on the tez branch. This bug addresses those. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5647) Fix failing mapreduce tests on the tez branch
[ https://issues.apache.org/jira/browse/HIVE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5647: - Status: Patch Available (was: Open) Fix failing mapreduce tests on the tez branch - Key: HIVE-5647 URL: https://issues.apache.org/jira/browse/HIVE-5647 Project: Hive Issue Type: Bug Components: Tests, Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: tez-branch Attachments: HIVE-5647.1.patch Quite a few mapreduce tests are failing on the tez branch. This bug addresses those. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5625) Fix issue with metastore version revision test.
Vikram Dixit K created HIVE-5625: Summary: Fix issue with metastore version revision test. Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Based on Brock's comments, the change made in HIVE-5403 changes the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Summary: Fix issue with metastore version restriction test. (was: Fix issue with metastore version revision test.) Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Based on Brock's comments, the change made in HIVE-5403 changes the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Status: Patch Available (was: Open) Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5625.1.patch Based on Brock's comments, the change made in HIVE-5403 changes the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5625) Fix issue with metastore version restriction test.
[ https://issues.apache.org/jira/browse/HIVE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5625: - Attachment: HIVE-5625.1.patch Fix issue with metastore version restriction test. -- Key: HIVE-5625 URL: https://issues.apache.org/jira/browse/HIVE-5625 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5625.1.patch Based on Brock's comments, the change made in HIVE-5403 changes the nature of the test. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5633) Perflogger broken due to HIVE-5403
Vikram Dixit K created HIVE-5633: Summary: Perflogger broken due to HIVE-5403 Key: HIVE-5633 URL: https://issues.apache.org/jira/browse/HIVE-5633 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5633) Perflogger broken due to HIVE-5403
[ https://issues.apache.org/jira/browse/HIVE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5633: - Attachment: HIVE-5633.1.patch Perflogger broken due to HIVE-5403 -- Key: HIVE-5633 URL: https://issues.apache.org/jira/browse/HIVE-5633 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5633.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803515#comment-13803515 ] Vikram Dixit K commented on HIVE-5403: -- I raised HIVE-5633 for this and uploaded a simple fix to unblock. I think the issue is that the perf logger should not depend on the session state, at least in the backend. Some cleanup may be required in the perf logger. Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0 Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection and filesystem loading were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5633) Perflogger broken due to HIVE-5403
[ https://issues.apache.org/jira/browse/HIVE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5633: - Attachment: HIVE-5633.2.patch Wrong diff previously. Perflogger broken due to HIVE-5403 -- Key: HIVE-5633 URL: https://issues.apache.org/jira/browse/HIVE-5633 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5633.1.patch, HIVE-5633.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5633) Perflogger broken due to HIVE-5403
[ https://issues.apache.org/jira/browse/HIVE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5633: - Status: Patch Available (was: Open) Perflogger broken due to HIVE-5403 -- Key: HIVE-5633 URL: https://issues.apache.org/jira/browse/HIVE-5633 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5633.1.patch, HIVE-5633.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5633) Perflogger broken due to HIVE-5403
[ https://issues.apache.org/jira/browse/HIVE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803579#comment-13803579 ] Vikram Dixit K commented on HIVE-5633: -- Review board request: https://reviews.apache.org/r/14892/ Perflogger broken due to HIVE-5403 -- Key: HIVE-5633 URL: https://issues.apache.org/jira/browse/HIVE-5633 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5633.1.patch, HIVE-5633.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Status: Open (was: Patch Available) Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection and filesystem loading were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Attachment: HIVE-5403.4.patch Fixed failing testcase. The test was failing because the show tables issued by the test was expected to fail upon creation of the metastore client (connection to metastore). However, since the metastore client is created upon session start by this patch, the testcase failed earlier than expected. Fixed it accordingly. Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
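The test failure described in this comment comes down to where a connection error surfaces once the client is created eagerly. Below is a minimal sketch of that effect. All names (`connect`, `LazySession`, `EagerSession`, `where_it_fails`) are hypothetical illustrations and not Hive classes or APIs; the sketch only assumes that the patch moves client creation from first use to session start, as the comment states.

```python
class MetastoreUnavailable(Exception):
    """Raised by the hypothetical connect() when the metastore cannot be reached."""


def connect(available):
    # Stand-in for creating a metastore client (hypothetical, not a Hive call).
    if not available:
        raise MetastoreUnavailable("cannot reach metastore")
    return object()


class LazySession:
    """Creates the client on first use, as before the patch (assumed)."""

    def __init__(self, available):
        self._available = available
        self._client = None

    def run(self, query):
        if self._client is None:
            self._client = connect(self._available)
        return "ran " + query


class EagerSession:
    """Creates the client at session start, as the patch is described to do."""

    def __init__(self, available):
        self._client = connect(available)  # a failure now surfaces here

    def run(self, query):
        return "ran " + query


def where_it_fails(session_cls):
    """Report the stage at which an unreachable metastore is noticed."""
    try:
        session = session_cls(False)
    except MetastoreUnavailable:
        return "session start"
    try:
        session.run("show tables")
    except MetastoreUnavailable:
        return "first query"
    return "nowhere"
```

With the lazy variant the error appears at the first query, which is what the old test expected; with the eager variant it appears at session start, so a test written against the old behaviour fails earlier than it anticipates.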
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Status: Patch Available (was: Open) Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch, HIVE-5403.4.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Attachment: HIVE-5506.2.patch Fix for failing test case. Golden file updated. Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch

Hello all, I think I have outlined a bug in the Hive split function.

Summary: When calling split on a string of data, it will only return all array items if the last array item has a value. For example, if I have a string of text delimited by tab with 7 columns, and the first four are filled but the last three are blank, split will only return a 4-position array. If any number of middle columns are empty but the last item still has a value, then it will return the proper number of columns. This was tested in Hive 0.9 and Hive 0.11.

Data: (Note: \t represents a tab character, \x09. The line endings should be \n (UNIX style); not sure what email will do to them.) Basically my data is 7 lines of data with the first 7 letters separated by tabs. On some lines I've left out certain letters, but kept the number of tabs exactly the same.

input.txt:

    a\tb\tc\td\te\tf\tg
    a\tb\tc\td\te\t\tg
    a\tb\t\td\t\tf\tg
    \t\t\td\te\tf\tg
    a\tb\tc\td\t\t\t
    a\t\t\t\te\tf\tg
    a\t\t\td\t\t\tg

I then created a table with one column from that data:

    DROP TABLE tmp_jo_tab_test;
    CREATE TABLE tmp_jo_tab_test (message_line STRING) STORED AS TEXTFILE;
    LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE tmp_jo_tab_test;

To validate, I created a Python counting script:

    #!/usr/bin/python
    import sys
    for line in sys.stdin:
        line = line[0:-1]        # drop the trailing newline
        out = line.split("\t")   # split on tab, keeping empty fields
        print(len(out))

The output there is:

    $ cat input.txt | ./cnt_tabs.py
    7
    7
    7
    7
    7
    7
    7

Based on that information, split on tab should return 7 for each line as well:

    hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;"
    7
    7
    7
    7
    4
    7
    7

However, it does not. It would appear that the line where only the first four letters are filled in (and blanks are passed in for the last three) returns only 4 splits, where there should technically be 7: 4 for the letters included, and three blanks.

    a\tb\tc\td\t\t\t

-- This message was sent by Atlassian JIRA (v6.1#6144)
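The counts reported above (7 from the Python script, 4 from Hive on the fifth row) are what one would see if the Hive side discarded trailing empty fields. A minimal sketch of that difference follows. It assumes the cause is a splitter that drops trailing empties, which is the documented default of Java's `String.split(regex)`; whether Hive's split calls it that way is an assumption here and is not confirmed in this thread. The function names are hypothetical.

```python
# The seven rows from input.txt, with real tab characters.
SAMPLE_ROWS = [
    "a\tb\tc\td\te\tf\tg",
    "a\tb\tc\td\te\t\tg",
    "a\tb\t\td\t\tf\tg",
    "\t\t\td\te\tf\tg",
    "a\tb\tc\td\t\t\t",
    "a\t\t\t\te\tf\tg",
    "a\t\t\td\t\t\tg",
]


def count_fields(line, sep="\t"):
    """Count fields as the reporter's script does: every empty field is kept."""
    return len(line.rstrip("\n").split(sep))


def count_fields_dropping_trailing_empties(line, sep="\t"):
    """Count fields after discarding trailing empty ones.

    Assumption: this emulates, for rows like the sample, a splitter that
    removes trailing empty strings from its result.
    """
    fields = line.rstrip("\n").split(sep)
    while fields and fields[-1] == "":
        fields.pop()
    return len(fields)


if __name__ == "__main__":
    for row in SAMPLE_ROWS:
        print(count_fields(row), count_fields_dropping_trailing_empties(row))
```

Only the row ending in three tabs differs between the two counts, which matches the single 4 in the Hive output. Empty fields in the middle or at the start of a row are unaffected, consistent with the reporter's observation.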
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Attachment: (was: HIVE-5506.2.patch) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Status: Patch Available (was: Open) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.11.0, 0.10.0, 0.9.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Status: Open (was: Patch Available) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.11.0, 0.10.0, 0.9.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Status: Patch Available (was: Open) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.11.0, 0.10.0, 0.9.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch, HIVE-5506.2.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5543) Running the mini tez cluster for tez unit tests
[ https://issues.apache.org/jira/browse/HIVE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5543: - Attachment: HIVE-5543.1.patch Running the mini tez cluster for tez unit tests --- Key: HIVE-5543 URL: https://issues.apache.org/jira/browse/HIVE-5543 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5543.1.patch In order to simulate the tez execution in hive tests, we need to work with MiniTezCluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5543) Running the mini tez cluster for tez unit tests
[ https://issues.apache.org/jira/browse/HIVE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13795693#comment-13795693 ] Vikram Dixit K commented on HIVE-5543: -- https://reviews.apache.org/r/14651/ Running the mini tez cluster for tez unit tests --- Key: HIVE-5543 URL: https://issues.apache.org/jira/browse/HIVE-5543 Project: Hive Issue Type: Bug Components: Tez Affects Versions: tez-branch Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5543.1.patch In order to simulate the tez execution in hive tests, we need to work with MiniTezCluster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Attachment: HIVE-5403.3.patch Updated to latest trunk. Eclipse has been broken since the merge with the vectorization branch. I will raise another JIRA for the same but with this patch I am able to successfully run tests on the command line/terminal. Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Status: Patch Available (was: Open) Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5403) Move loading of filesystem, ugi, metastore client to hive session
[ https://issues.apache.org/jira/browse/HIVE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5403: - Status: Open (was: Patch Available) Move loading of filesystem, ugi, metastore client to hive session - Key: HIVE-5403 URL: https://issues.apache.org/jira/browse/HIVE-5403 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-5403.1.patch, HIVE-5403.2.patch, HIVE-5403.3.patch As part of HIVE-5184, the metastore connection, loading filesystem were done as part of the tez session so as to speed up query times while paying a cost at startup. We can do this more generally in hive to apply to both the mapreduce and tez side of things. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Assignee: Vikram Dixit K Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Status: Patch Available (was: Open) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.11.0, 0.10.0, 0.9.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Status: Open (was: Patch Available) Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.11.0, 0.10.0, 0.9.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5506: - Attachment: HIVE-5506.1.patch This should fix this issue. Hive SPLIT function does not return array correctly --- Key: HIVE-5506 URL: https://issues.apache.org/jira/browse/HIVE-5506 Project: Hive Issue Type: Bug Components: SQL, UDF Affects Versions: 0.9.0, 0.10.0, 0.11.0 Environment: Hive Reporter: John Omernik Assignee: Vikram Dixit K Attachments: HIVE-5506.1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-5506:

Attachment: (was: HIVE-5506.1.patch)

Hive SPLIT function does not return array correctly

Key: HIVE-5506
URL: https://issues.apache.org/jira/browse/HIVE-5506
Project: Hive
Issue Type: Bug
Components: SQL, UDF
Affects Versions: 0.9.0, 0.10.0, 0.11.0
Environment: Hive
Reporter: John Omernik
Assignee: Vikram Dixit K
Attachments: HIVE-5506.1.patch

Hello all, I think I have outlined a bug in the Hive split function:

Summary: When calling split on a string of data, it will only return all array items if the last array item has a value. For example, if I have a string of text delimited by tab with 7 columns, and the first four are filled but the last three are blank, split will only return a 4-position array. If any number of middle columns are empty, but the last item still has a value, then it will return the proper number of columns. This was tested in Hive 0.9 and Hive 0.11.

Data: (Note: \t represents a tab char, \x09. The line endings should be \n (UNIX style); not sure what email will do to them.) Basically my data is 7 lines of data with the first 7 letters separated by tab. On some lines I've left out certain letters, but kept the number of tabs exactly the same.

input.txt

a\tb\tc\td\te\tf\tg
a\tb\tc\td\te\t\tg
a\tb\t\td\t\tf\tg
\t\t\td\te\tf\tg
a\tb\tc\td\t\t\t
a\t\t\t\te\tf\tg
a\t\t\td\t\t\tg

I then created a table with one column from that data:

DROP TABLE tmp_jo_tab_test;
CREATE TABLE tmp_jo_tab_test (message_line STRING) STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/tmp/input.txt' OVERWRITE INTO TABLE tmp_jo_tab_test;

OK, just to validate, I created a Python counting script:

#!/usr/bin/python
import sys
for line in sys.stdin:
    line = line[0:-1]          # drop the trailing newline
    out = line.split("\t")     # split on a literal tab
    print(len(out))

The output there is:

$ cat input.txt | ./cnt_tabs.py
7
7
7
7
7
7
7

Based on that information, split on tab should return 7 for each line as well:

hive -e "select size(split(message_line, '\\t')) from tmp_jo_tab_test;"
7
7
7
7
4
7
7

However, it does not. The line where only the first four letters are filled in (and blanks are passed in for the last three) returns only 4 splits, where there should technically be 7: 4 for the letters included, and 3 blanks.

a\tb\tc\td\t\t\t
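The report does not say why the fifth line comes back with 4 elements. One explanation that fits the numbers is that Hive's split follows the default behavior of Java's String.split, which drops trailing empty strings from the result. The sketch below is only an illustration under that assumption: java_like_split is a hypothetical helper written for this note (it is not Hive or Java code), and it treats the separator as a literal string rather than a regular expression. It compares Python's own str.split with the trailing-empties-dropped variant on the seven rows from the report.

```python
def java_like_split(text, sep):
    """Split on a literal separator, then drop trailing empty strings.

    This mimics the documented default behavior of Java's String.split
    (limit 0). It is a hypothetical helper, not Hive's implementation.
    """
    if sep not in text:
        # No separator present: the whole input comes back as one element.
        return [text]
    parts = text.split(sep)
    # Remove empty strings from the end only; leading and middle ones stay.
    while parts and parts[-1] == "":
        parts.pop()
    return parts


# The seven rows from input.txt, with real tab characters.
ROWS = [
    "a\tb\tc\td\te\tf\tg",
    "a\tb\tc\td\te\t\tg",
    "a\tb\t\td\t\tf\tg",
    "\t\t\td\te\tf\tg",
    "a\tb\tc\td\t\t\t",
    "a\t\t\t\te\tf\tg",
    "a\t\t\td\t\t\tg",
]

# Python keeps trailing empty fields, as the reporter's script shows.
python_counts = [len(row.split("\t")) for row in ROWS]

# Dropping trailing empty fields changes the count only for the fifth row.
java_like_counts = [len(java_like_split(row, "\t")) for row in ROWS]

print(python_counts)
print(java_like_counts)
```

Under this assumption only the row that ends in empty fields is affected, which matches the pattern in the report: rows with empty middle or leading columns still count 7. In Java itself, passing a negative limit to String.split keeps trailing empty strings; whether the attached patch takes that route is not stated in this thread.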
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-5506:

Status: Patch Available (was: Open)

Hive SPLIT function does not return array correctly

Key: HIVE-5506
URL: https://issues.apache.org/jira/browse/HIVE-5506
Project: Hive
Issue Type: Bug
Components: SQL, UDF
Affects Versions: 0.11.0, 0.10.0, 0.9.0
Environment: Hive
Reporter: John Omernik
Assignee: Vikram Dixit K
Attachments: HIVE-5506.1.patch
[jira] [Updated] (HIVE-5506) Hive SPLIT function does not return array correctly
[ https://issues.apache.org/jira/browse/HIVE-5506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-5506:

Attachment: HIVE-5506.1.patch

Missed the input file.

Hive SPLIT function does not return array correctly

Key: HIVE-5506
URL: https://issues.apache.org/jira/browse/HIVE-5506
Project: Hive
Issue Type: Bug
Components: SQL, UDF
Affects Versions: 0.9.0, 0.10.0, 0.11.0
Environment: Hive
Reporter: John Omernik
Assignee: Vikram Dixit K
Attachments: HIVE-5506.1.patch
[jira] [Created] (HIVE-5560) Hive produces incorrect results on multi-distinct query
Vikram Dixit K created HIVE-5560:

Summary: Hive produces incorrect results on multi-distinct query
Key: HIVE-5560
URL: https://issues.apache.org/jira/browse/HIVE-5560
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K

{noformat}
select key, count(distinct key) + count(distinct value) from src tablesample (10 ROWS) group by key
POSTHOOK: type: QUERY
POSTHOOK: Input: default@src
#### A masked pattern was here ####
165 1
val_165 1
238 1
val_238 1
255 1
val_255 1
27 1
val_27 1
278 1
val_278 1
311 1
val_311 1
409 1
val_409 1
484 1
val_484 1
86 1
val_86 1
98 1
val_98 1
{noformat}
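The report shows the incorrect output but does not spell out the expected one. As a rough sketch of what the query asks for, the Python below evaluates count(distinct key) + count(distinct value) per key group in plain code. The sample rows are an assumption: they reuse the ten keys that appear in the output above, each paired with a value of the form val_<key>, and multi_distinct is a hypothetical helper, not Hive code.

```python
from collections import defaultdict

# Assumed sample: the ten keys seen in the reported output, each paired
# with a value of the form val_<key>. This pairing is an assumption.
KEYS = ["165", "238", "255", "27", "278", "311", "409", "484", "86", "98"]
ROWS = [(key, "val_" + key) for key in KEYS]


def multi_distinct(rows):
    """Return {key: count(distinct key) + count(distinct value)} per key group."""
    keys_seen = defaultdict(set)
    values_seen = defaultdict(set)
    for key, value in rows:
        keys_seen[key].add(key)
        values_seen[key].add(value)
    return {
        key: len(keys_seen[key]) + len(values_seen[key])
        for key in keys_seen
    }


expected = multi_distinct(ROWS)
for key in sorted(expected):
    print(key, expected[key])
```

Under these assumptions each group holds one distinct key and one distinct value, so every key would map to 2 and no val_* string would appear in the key column. The output in the report differs on both points, which is presumably what makes it incorrect.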
[jira] [Commented] (HIVE-5560) Hive produces incorrect results on multi-distinct query
[ https://issues.apache.org/jira/browse/HIVE-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796293#comment-13796293 ]

Vikram Dixit K commented on HIVE-5560:

Discussions on this jira brought up this bug.

Hive produces incorrect results on multi-distinct query

Key: HIVE-5560
URL: https://issues.apache.org/jira/browse/HIVE-5560
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
[jira] [Created] (HIVE-5543) Running the mini tez cluster for tez unit tests
Vikram Dixit K created HIVE-5543:

Summary: Running the mini tez cluster for tez unit tests
Key: HIVE-5543
URL: https://issues.apache.org/jira/browse/HIVE-5543
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

In order to simulate the tez execution in hive tests, we need to work with MiniTezCluster.
[jira] [Updated] (HIVE-5516) Update hive to use updated tez APIs
[ https://issues.apache.org/jira/browse/HIVE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-5516:

Status: Patch Available (was: Open)

Update hive to use updated tez APIs

Key: HIVE-5516
URL: https://issues.apache.org/jira/browse/HIVE-5516
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Fix For: tez-branch
Attachments: HIVE-5516.1.patch