[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4825:
    Attachment: HIVE-4825.2.testfiles.patch
                HIVE-4825.2.code.patch

Separate MapredWork into MapWork and ReduceWork

Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch

Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors, etc. For Tez and MRR it will be useful to break this into map- and reduce-specific pieces. The separation is natural and I think has value in itself: it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez, where you'll have a graph of these instead of just 1 M and 0-1 R.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708256#comment-13708256 ]

Gunther Hagleitner commented on HIVE-4825:

Ran tests on the 1 & 2 line. Came back clean. Split into code + test patches, because there are lots of whitespace-only diffs. I've also updated the review.

Separate MapredWork into MapWork and ReduceWork
Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4825:
    Status: Patch Available (was: Open)

Separate MapredWork into MapWork and ReduceWork
Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
[jira] [Commented] (HIVE-3576) Regression: ALTER TABLE DROP IF EXISTS PARTITION throws a SemanticException if Partition is not found
[ https://issues.apache.org/jira/browse/HIVE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708263#comment-13708263 ]

Kanwaljit Singh commented on HIVE-3576:

Any updates on this issue?

Regression: ALTER TABLE DROP IF EXISTS PARTITION throws a SemanticException if Partition is not found

Key: HIVE-3576
URL: https://issues.apache.org/jira/browse/HIVE-3576
Project: Hive
Issue Type: Bug
Components: Metastore, Query Processor
Affects Versions: 0.9.0
Reporter: Harsh J

Doing a simple {{ALTER TABLE testtable DROP IF EXISTS PARTITION(dt=NONEXISTENTPARTITION)}} fails with a SemanticException of the 10006 kind (INVALID_PARTITION). This does not respect the {{hive.exec.drop.ignorenonexistent}} setting either, since there are no if-check wraps around this area when fetching partitions from the store.
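The IF EXISTS behavior the reporter expects can be sketched as follows (a minimal illustration in Python; `drop_partition` and its arguments are hypothetical, not Hive's actual metastore API):

```python
# Hypothetical sketch of the expected semantics: with IF EXISTS, a
# missing partition is silently ignored instead of raising
# INVALID_PARTITION.
def drop_partition(partitions, spec, if_exists=False):
    if spec not in partitions:
        if if_exists:
            return  # IF EXISTS: nothing to do, no error
        raise ValueError("SemanticException 10006: INVALID_PARTITION")
    partitions.remove(spec)

table = [{"dt": "2013-01-01"}]
drop_partition(table, {"dt": "nonexistent"}, if_exists=True)  # no error
```

The bug is that the guard is missing, so the lookup raises before the IF EXISTS flag (or the ignorenonexistent setting) is ever consulted.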
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708274#comment-13708274 ]

Gunther Hagleitner commented on HIVE-4518:

.5 is rebased to trunk. Running tests.

Counter Strike: Operation Operator

Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch

Queries of the form:

from foo
insert overwrite table bar partition (p) select ...
insert overwrite table bar partition (p) select ...
insert overwrite table bar partition (p) select ...

generate a huge number of counters. The reason is that task.progress is turned on for dynamic-partitioning queries. The counters not only make queries slower than necessary (up to 50%); you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need the CREATED_FILES and FATAL counters to ensure dynamic-partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore; each operator can simply create and increment counters. That should simplify the code a lot.
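The "eventually run out" failure mode can be sketched like this (illustrative Python with hypothetical names, not Hive's actual classes): a fixed, pre-declared pool of enum-style slots such as C1..C1000 is consumed one slot per dynamically created counter, so enough counters exhausts the pool.

```python
# Sketch of an enum-style counter pool: every new counter name
# permanently claims one pre-declared slot, so the pool is finite.
class CounterPool:
    def __init__(self, size=1000):
        self.free = ["C%d" % (i + 1) for i in range(size)]
        self.assigned = {}

    def counter_for(self, name):
        if name not in self.assigned:
            if not self.free:
                raise RuntimeError("out of counter slots")
            self.assigned[name] = self.free.pop(0)
        return self.assigned[name]

pool = CounterPool(size=3)
print(pool.counter_for("CREATED_FILES"))  # C1
```

This also explains the unhelpful names: the slot label (C1, C2, ...) is what surfaces, not the counter's real name.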
[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4518:
    Attachment: HIVE-4518.5.patch

Counter Strike: Operation Operator
Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708275#comment-13708275 ]

Gunther Hagleitner commented on HIVE-4518:

Updated https://reviews.facebook.net/D10665 as well with the latest patch.

Counter Strike: Operation Operator
Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708276#comment-13708276 ]

Gunther Hagleitner commented on HIVE-4388:

That sounds like the best option. Are the 0.96 artifacts already available?

HBase tests fail against Hadoop 2

Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Brock Noland

Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23), builds fail because of HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with HBASE-6396.
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708504#comment-13708504 ]

Brock Noland commented on HIVE-4388:

We'd be building against 0.95 initially, until 0.96 is released.

HBase tests fail against Hadoop 2
Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
[jira] [Updated] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4675:
    Status: Open (was: Patch Available)

Create new parallel unit test environment

Key: HIVE-4675
URL: https://issues.apache.org/jira/browse/HIVE-4675
Project: Hive
Issue Type: Improvement
Components: Testing Infrastructure
Reporter: Brock Noland
Assignee: Brock Noland
Fix For: 0.12.0
Attachments: HIVE-4675.patch

The current ptest tool is great, but it has the following limitations:
- Requires an NFS filer
- Unless the NFS filer is dedicated, ptests can become IO bound easily
- Investigating failures is troublesome because the source directory for the failure is not saved
- Ignoring or isolating tests is not supported
- No unit tests for the ptest framework exist

It'd be great to have a ptest tool that addresses these limitations.
[jira] [Updated] (HIVE-4845) Correctness issue with MapJoins using the null safe operator
[ https://issues.apache.org/jira/browse/HIVE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4845:
    Attachment: HIVE-4845.patch

Updated patch based on review. Cleaned up auto-generated equals/hashCode.

Correctness issue with MapJoins using the null safe operator

Key: HIVE-4845
URL: https://issues.apache.org/jira/browse/HIVE-4845
Project: Hive
Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Critical
Attachments: HIVE-4845.patch, HIVE-4845.patch

I found a correctness issue while working on HIVE-4838. The following query from join_nullsafe.q gives different results depending on whether it's executed map-side or reduce-side:

{noformat}
SELECT /*+ MAPJOIN(a) */ * FROM smb_input1 a JOIN smb_input1 b ON a.key = b.key AND a.value = b.value ORDER BY a.key, a.value, b.key, b.value;
{noformat}

For that query, on the map side, rows which should be joined are not. For example, the reduce side outputs this row:

{noformat}
a.key  a.value  b.key  b.value
148    NULL     148    NULL
{noformat}

which makes sense since a.key is equal to b.key and a.value is equal to b.value, but the current map-side code omits this row. The reason is that MapJoinDoubleKey is used for the map-side join, which doesn't properly compare null values.
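A minimal sketch of null-safe equality (SQL's `<=>` semantics), which is what the map-side key comparison needs to honor: two NULLs compare equal, and NULL against anything else does not. This is illustrative Python, not Hive's MapJoinDoubleKey code.

```python
def null_safe_eq(a, b):
    # NULL <=> NULL is true; NULL <=> x is false; otherwise plain equality.
    if a is None or b is None:
        return a is None and b is None
    return a == b

# The row the reduce side emits: key=148, value=NULL on both sides.
# Under null-safe comparison the keys match, so the rows should join
# (in SQL, plain NULL = NULL would not be true).
left_key, right_key = (148, None), (148, None)
print(all(null_safe_eq(x, y) for x, y in zip(left_key, right_key)))  # True
```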
[jira] [Created] (HIVE-4856) Upgrade HCat to 2.0.5-alpha
Brock Noland created HIVE-4856:

Summary: Upgrade HCat to 2.0.5-alpha
Key: HIVE-4856
URL: https://issues.apache.org/jira/browse/HIVE-4856
Project: Hive
Issue Type: Task
Reporter: Brock Noland

In HIVE-4756 we upgraded Hive to 2.0.5-alpha. I see that HCat specifies its deps differently. We should probably keep them on the same version of Hadoop.
[jira] [Updated] (HIVE-4856) Upgrade HCat to 2.0.5-alpha
[ https://issues.apache.org/jira/browse/HIVE-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4856:
    Component/s: HCatalog

Upgrade HCat to 2.0.5-alpha
Key: HIVE-4856
URL: https://issues.apache.org/jira/browse/HIVE-4856
[jira] [Updated] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4721:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

Fix TestCliDriver.ptf_npath.q on 0.23

Key: HIVE-4721
URL: https://issues.apache.org/jira/browse/HIVE-4721
Project: Hive
Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4721.1.patch

In HIVE-4717 I tried changing the last line of ptf_npath.q from:

{noformat}
where fl_num = 1142;
{noformat}

to:

{noformat}
where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath;
{noformat}

in order to make the test deterministic. However, this results in different results (not just a different order) for 0.23 and 0.20S.
[jira] [Updated] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4853:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

junit timeout needs to be updated

Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4853.1.patch

All the ptf, join, etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to timeout.
[jira] [Updated] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4854:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2

Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4854.1.patch

The problem is with the mkdir command. It tries to create multiple directories at once without the right flag; that works only on the 1 line. A simple fix to the test should do the trick.
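The missing-flag problem can be illustrated with the POSIX analogue (Python's `os.mkdir` vs `os.makedirs`); the actual fix concerns the equivalent parent-creating flag on the test's mkdir invocation under Hadoop 2:

```python
import os
import tempfile

# A plain mkdir cannot create nested directories in one call, while the
# "mkdir -p"-style makedirs creates the missing parents as it goes.
tmp = tempfile.mkdtemp()
nested = os.path.join(tmp, "a", "b", "c")

try:
    os.mkdir(nested)              # parents don't exist, so this fails
except FileNotFoundError:
    print("plain mkdir failed")

os.makedirs(nested)               # like 'mkdir -p': creates parents too
print(os.path.isdir(nested))      # True
```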
[jira] [Updated] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4852:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

-Dbuild.profile=core fails

Key: HIVE-4852
URL: https://issues.apache.org/jira/browse/HIVE-4852
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4852.1.patch

The core profile fails because of a chmod added for some hcat files. Simple fix: check whether the modules list contains hcat before running the command.
[jira] [Assigned] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-4853:
    Assignee: Ashutosh Chauhan (was: Gunther Hagleitner)

junit timeout needs to be updated
Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
[jira] [Updated] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4853:
    Assignee: Gunther Hagleitner (was: Ashutosh Chauhan)

junit timeout needs to be updated
Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
[jira] [Updated] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4854:
    Assignee: Gunther Hagleitner (was: Ashutosh Chauhan)

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
[jira] [Assigned] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-4854:
    Assignee: Ashutosh Chauhan (was: Gunther Hagleitner)

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
[jira] [Created] (HIVE-4857) Hive tests are leaving slop artifacts all over the project
Edward Capriolo created HIVE-4857:

Summary: Hive tests are leaving slop artifacts all over the project
Key: HIVE-4857
URL: https://issues.apache.org/jira/browse/HIVE-4857
Project: Hive
Issue Type: Task
Reporter: Edward Capriolo

We used to have a project that would build temporary artifacts in temporary directories. Now test runs leave stuff all over the place, making it hard to work with the project.

{quote}
[edward@jackintosh hive-trunk2]$ svn stat | more
? common/src/gen
? contrib/TempStatsStore
? contrib/derby.log
? data/files/local_array_table_1
? data/files/local_array_table_2
? data/files/local_array_table_2_withfields
? data/files/local_array_table_3
? data/files/local_map_table_1
? data/files/local_map_table_2
? data/files/local_map_table_2_withfields
? data/files/local_map_table_3
? data/files/local_rctable
? data/files/local_rctable_out
? data/files/local_src_table_1
? data/files/local_src_table_2
? hbase-handler/TempStatsStore
? hbase-handler/build
? hbase-handler/derby.log
? hcatalog/build
? hcatalog/core/build
{quote}
[jira] [Assigned] (HIVE-3632) datanucleus breaks when using JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang reassigned HIVE-3632:
    Assignee: Xuefu Zhang

datanucleus breaks when using JDK7

Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.1, 0.10.0
Reporter: Chris Drome
Assignee: Xuefu Zhang
Priority: Critical

I found serious problems with datanucleus code when using JDK7, resulting in some sort of exception being thrown when datanucleus code is entered. I tried source=1.7, target=1.7 with JDK7, as well as source=1.6, target=1.6 with JDK7, and there was no visible difference: the same unit tests failed. I tried upgrading datanucleus to 3.0.1, as per HIVE-2084.patch, which did not fix the failing tests. I tried upgrading datanucleus to 3.1-release, as per the advice of http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-86, which suggests that using ASMv4 will allow datanucleus to work with JDK7. I was not successful with this either. I tried upgrading datanucleus to 3.1.2; I was not successful with this either. Regarding datanucleus support for JDK7+, there is the following JIRA, http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-81, which suggests that they don't plan to actively support JDK7+ bytecode any time soon. I also tested the JVM parameters found on http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/ with no success either. This will become a more serious problem as people move to newer JVMs. If there are others who have solved this issue, please post how it was done. Otherwise, it is a topic that I would like to raise for discussion.
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708639#comment-13708639 ]

Xuefu Zhang commented on HIVE-3632:

To compile and run Hive on JDK7, DataNucleus needs to be upgraded. The current plan is to upgrade using the following library versions: datanucleus-api-jdo-3.2.1.jar, datanucleus-rdbms-3.2.1.jar, datanucleus-core-3.2.2.jar. These versions work for both JDK6 and JDK7. After the upgrade, there are only a few test failures with JDK6. Besides the unit tests, more tests will be conducted. This is related to HIVE-2084, but the goal here is slightly different. Of course, the upgrade needs to address all issues that may arise.

Upgrade datanucleus to support JDK7
Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708637#comment-13708637 ]

Xuefu Zhang commented on HIVE-3632:

Since nobody is working on this, I will give it a shot.

Upgrade datanucleus to support JDK7
Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-4825: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708656#comment-13708656 ] Edward Capriolo commented on HIVE-4825: --- Also, do not commit this to trunk if it is only a Tez-supporting patch. There is a tez branch.
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708654#comment-13708654 ] Edward Capriolo commented on HIVE-4825: --- This issue is NOT a bug. Its priority is not major.
[jira] [Created] (HIVE-4858) Sort show grant result to improve usability and testability
Xuefu Zhang created HIVE-4858: - Summary: Sort show grant result to improve usability and testability Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Currently Hive outputs the result of the show grant command in a nondeterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Tests that depend on the order can therefore fail randomly, especially after a library upgrade (DN or JVM). Sorting the result avoids this randomness and makes the output deterministic, improving both the readability of the output and the robustness of the tests.
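The fix Xuefu describes is essentially a sort over the privilege rows before they are printed. A minimal, hypothetical sketch, in which the class name and the string-per-row record shape are illustrative rather than Hive's actual show grant code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of HIVE-4858's idea: sort the "show grant" result
// lines before emitting them, so output no longer depends on the order
// DataNucleus happens to return rows in.
public class SortedGrantOutput {
    public static List<String> sorted(List<String> privilegeRows) {
        List<String> copy = new ArrayList<String>(privilegeRows);
        Collections.sort(copy); // lexicographic order: deterministic across JVMs and DBs
        return copy;
    }

    public static void main(String[] args) {
        List<String> fromDb = new ArrayList<String>();
        fromDb.add("UPDATE\tuser_b"); // order here simulates whatever the DB returned
        fromDb.add("SELECT\tuser_a");
        System.out.println(sorted(fromDb)); // always SELECT before UPDATE
    }
}
```

With a stable sort key, golden-file tests stay valid even when the metastore backend or JVM changes the iteration order underneath.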
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708673#comment-13708673 ] Ashutosh Chauhan commented on HIVE-3632: I am all for updating DN. Huge +1. But we need to be wary of https://issues.apache.org/jira/browse/HIVE-2084?focusedCommentId=13014240page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13014240 Feels like we might need to provide upgrade scripts for folks to migrate because of this.
[jira] [Commented] (HIVE-4317) StackOverflowError when add jar concurrently
[ https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708675#comment-13708675 ] Brock Noland commented on HIVE-4317: Where is the SOE occurring? I.e. jdbc client, HS1, HS2, etc? Please add a review item for this patch; this is described under Review Process here: https://cwiki.apache.org/confluence/display/Hive/HowToContribute StackOverflowError when add jar concurrently - Key: HIVE-4317 URL: https://issues.apache.org/jira/browse/HIVE-4317 Project: Hive Issue Type: Bug Affects Versions: 0.9.0, 0.10.0 Reporter: wangwenli Attachments: hive-4317.1.patch Scenario: multiple threads add jars and run select operations over JDBC concurrently; sometimes, when the HiveServer serializes the MapredWork, it throws a StackOverflowError from XMLEncoder.
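For context on where such an error can surface: HiveServer serializes query plans with java.beans.XMLEncoder, which walks the plan object graph reflectively. A hedged sketch of one possible mitigation, serializing under a single lock on the assumption that concurrent mutation of the shared plan graph is a trigger; this is speculation about the failure mode, not the fix from hive-4317.1.patch:

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not Hive's actual fix: if concurrent "add jar"
// calls mutate shared state while XMLEncoder walks the plan object graph,
// serializing under one lock at least rules out concurrent modification
// as the trigger for the StackOverflowError.
public class PlanSerializer {
    private static final Object LOCK = new Object();

    public static byte[] serialize(Object plan) {
        synchronized (LOCK) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLEncoder encoder = new XMLEncoder(out);
            encoder.writeObject(plan);
            encoder.close(); // flushes and finishes the XML document
            return out.toByteArray();
        }
    }
}
```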
[jira] [Commented] (HIVE-4730) Join on more than 2^31 records on single reducer failed (wrong results)
[ https://issues.apache.org/jira/browse/HIVE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708691#comment-13708691 ] Phabricator commented on HIVE-4730: --- brock has commented on the revision HIVE-4730 [jira] Join on more than 2^31 records on single reducer failed (wrong results). Hi Navis, Thanks for the patch! I noted a few style nits. Just curious, how long did the query take to complete? My guess is far too long to have a q-file test for this. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:286 Is it possible to move this up near the rest of the member variable definitions? Ideally it'd be nice to change the LHS to be List but it's possible that something in the class requires ArrayList. REVISION DETAIL https://reviews.facebook.net/D11283 To: JIRA, navis Cc: brock Join on more than 2^31 records on single reducer failed (wrong results) --- Key: HIVE-4730 URL: https://issues.apache.org/jira/browse/HIVE-4730 Project: Hive Issue Type: Bug Affects Versions: 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0 Reporter: Gabi Kazav Assignee: Navis Priority: Blocker Attachments: HIVE-4730.D11283.1.patch A join on more than 2^31 rows leads to wrong results. For example: Create table small_table (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n'; Create table big_table (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n'; Loading 1 row into small_table (the value 1). Loading 2149580800 rows into big_table with the same value (1 in this case). create table output as select a.p1 from big_table a join small_table b on (a.p1=b.p1); select count(*) from output; will return only 1 row... the reducer syslog: ...
2013-06-13 17:20:59,254 INFO ExecReducer: ExecReducer: processing 214700 rows: used memory = 32925960 2013-06-13 17:21:00,745 INFO ExecReducer: ExecReducer: processing 214800 rows: used memory = 12815184 2013-06-13 17:21:02,205 INFO ExecReducer: ExecReducer: processing 214900 rows: used memory = 26684552 -- looks like wrong value.. ... 2013-06-13 17:21:04,062 INFO ExecReducer: ExecReducer: processed 2149580801 rows: used memory = 17715896 2013-06-13 17:21:04,062 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 finished. closing... 2013-06-13 17:21:04,062 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarded 1 rows 2013-06-13 17:21:05,791 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 finished. closing... 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarded 1 rows 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 finished. closing... 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 forwarded 0 rows 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:1 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 Close done 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 Close done -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
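The symptom in the syslog above (roughly 2.1 billion rows processed but only 1 row forwarded, with a counter misbehaving near the 2^31 mark) is consistent with a row count held in a 32-bit int wrapping around. A standalone demonstration of the wrap, using the row count from the report; this illustrates the arithmetic only, not the actual field in RowContainer:

```java
// Demonstrates why a 32-bit counter breaks at the row counts in this report:
// 2,149,580,800 exceeds Integer.MAX_VALUE (2,147,483,647), so narrowing to
// int silently wraps to a negative number and the count is corrupted.
public class RowCounterOverflow {
    public static void main(String[] args) {
        long actualRows = 2149580800L;   // row count from the bug report
        int counter = (int) actualRows;  // narrowing conversion wraps past 2^31 - 1
        System.out.println(counter);     // negative: any logic comparing it misfires
    }
}
```

Keeping such counters as long (or checking against Integer.MAX_VALUE before narrowing) avoids the wrap.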
[jira] [Commented] (HIVE-4831) QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests
[ https://issues.apache.org/jira/browse/HIVE-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708694#comment-13708694 ] Brock Noland commented on HIVE-4831: Hi, As opposed to choosing a random directory, I think we should have a little utility method that ensures we are creating a directory which does not already exist. Guava's Files.createTempDir() is a good example. Brock QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests - Key: HIVE-4831 URL: https://issues.apache.org/jira/browse/HIVE-4831 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4831.1.patch QTestUtil tests start a mini ZooKeeper cluster. If a test exits abnormally (e.g. on timeout), it fails to stop the mini ZooKeeper cluster. On Windows, files can't be deleted while the process is still running, so the new ZooKeeper cluster started by the next QTestUtil based test case fails to start.
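A sketch of the utility method Brock suggests, modeled loosely on Guava's Files.createTempDir(); the class and method names here are made up for illustration. The key point is that File.mkdir() returns true only if it created the directory, so looping on it guarantees a directory that did not previously exist:

```java
import java.io.File;
import java.io.IOException;

// Hypothetical utility in the spirit of Guava's Files.createTempDir():
// generate candidate names until mkdir() actually creates a fresh directory,
// instead of picking a random name and hoping it is unused.
public class TempDirs {
    public static File createFreshDir(File baseDir) throws IOException {
        String base = System.currentTimeMillis() + "-";
        for (int attempt = 0; attempt < 10000; attempt++) {
            File candidate = new File(baseDir, base + attempt);
            if (candidate.mkdir()) { // true only if the directory did not exist
                return candidate;
            }
        }
        throw new IOException("could not create a fresh directory under " + baseDir);
    }
}
```

For the Windows case above, a leftover directory held open by a stale ZooKeeper process simply makes mkdir() fail, and the loop moves on to the next candidate name.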
[jira] [Commented] (HIVE-4796) Increase coverage of package org.apache.hadoop.hive.common.metrics
[ https://issues.apache.org/jira/browse/HIVE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708731#comment-13708731 ] Hudson commented on HIVE-4796: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4796 : Increase coverage of package org.apache.hadoop.hive.common.metrics (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501052) * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/MetricsMBean.java * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/MetricsMBeanImpl.java * /hive/trunk/common/src/test/org/apache/hadoop/hive/common * /hive/trunk/common/src/test/org/apache/hadoop/hive/common/metrics * /hive/trunk/common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java Increase coverage of package org.apache.hadoop.hive.common.metrics -- Key: HIVE-4796 URL: https://issues.apache.org/jira/browse/HIVE-4796 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4796-trunk--N1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4814) Adjust WebHCat e2e tests until HIVE-4703 is addressed
[ https://issues.apache.org/jira/browse/HIVE-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708722#comment-13708722 ] Hudson commented on HIVE-4814: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4814 : Adjust WebHCat e2e tests until HIVE4703 is addressed (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500312) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/ddl.conf Adjust WebHCat e2e tests until HIVE-4703 is addressed - Key: HIVE-4814 URL: https://issues.apache.org/jira/browse/HIVE-4814 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4814.patch right now a number of e2e webhcat test cases fail due to HIVE-4703. This issue in that bug has been around for a long time and the fix is not quick. We need to adjust expected e2e results until HIVE-4703 is fixed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4807) Hive metastore hangs
[ https://issues.apache.org/jira/browse/HIVE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708710#comment-13708710 ] Hudson commented on HIVE-4807: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4807 : Hive metastore hangs (Sarvesh Sakalanaga via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501675) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ivy/libraries.properties * /hive/trunk/jdbc/build.xml * /hive/trunk/metastore/ivy.xml Hive metastore hangs Key: HIVE-4807 URL: https://issues.apache.org/jira/browse/HIVE-4807 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Fix For: 0.12.0 Attachments: Hive-4807.0.patch, Hive-4807.1.patch, Hive-4807.2.patch Hive metastore hangs (does not accept any new connections) due to a bug in DBCP. The root cause analysis is here https://issues.apache.org/jira/browse/DBCP-398. The fix is to change Hive connection pool to BoneCP which is natively supported by DataNucleus. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4805) Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors
[ https://issues.apache.org/jira/browse/HIVE-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708723#comment-13708723 ] Hudson commented on HIVE-4805: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4805 : Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500449) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/DataCorruptErrorHeuristic.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/ErrorAndSolution.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/errors * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/errors/TestTaskLogProcessor.java Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors - Key: HIVE-4805 URL: https://issues.apache.org/jira/browse/HIVE-4805 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4805-trunk--N2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4810) Refactor exec package
[ https://issues.apache.org/jira/browse/HIVE-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708719#comment-13708719 ] Hudson commented on HIVE-4810: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4810 [jira] Refactor exec package (Gunther Hagleitner via Ashutosh Chauhan) Summary: HIVE-4810 The exec package contains both operators and classes used to execute the job. Moving the latter into a sub package makes the package slightly more manageable and will make it easier to provide a tez-based implementation. Test Plan: Refactoring Reviewers: ashutoshc Reviewed By: ashutoshc Differential Revision: https://reviews.facebook.net/D11625 (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501476) * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/contrib/src/test/results/clientnegative/case_with_row_sequence.q.out * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapperContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHook.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobTrackerURLResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Throttle.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapperContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHook.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobTrackerURLResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputSplit.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveRecordReader.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveRecordReader.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SamplingOptimizer.java *
[jira] [Commented] (HIVE-4251) Indices can't be built on tables whose schema info comes from SerDe
[ https://issues.apache.org/jira/browse/HIVE-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708712#comment-13708712 ] Hudson commented on HIVE-4251: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4251 : Indices can't be built on tables whose schema info comes from SerDe (Mark Wagner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500452) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java * /hive/trunk/ql/src/test/queries/clientpositive/index_serde.q * /hive/trunk/ql/src/test/results/clientpositive/index_serde.q.out Indices can't be built on tables whose schema info comes from SerDe --- Key: HIVE-4251 URL: https://issues.apache.org/jira/browse/HIVE-4251 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.10.1, 0.11.0 Reporter: Mark Wagner Assignee: Mark Wagner Fix For: 0.12.0 Attachments: HIVE-4251.1.patch, HIVE-4251.2.patch Building indices on tables that get their schema information from the deserializer (e.g. Avro-backed tables) doesn't work, because the wrong API is used when checking that the index column exists.
{code}
hive> describe doctors;
OK
# col_name    data_type    comment
number        int          from deserializer
first_name    string       from deserializer
last_name     string       from deserializer
Time taken: 0.215 seconds, Fetched: 5 row(s)
hive> create index doctors_index on table doctors(number) as 'compact' with deferred rebuild;
FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
{code}
[jira] [Commented] (HIVE-4580) Change DDLTask to report errors using canonical error messages rather than http status codes
[ https://issues.apache.org/jira/browse/HIVE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708711#comment-13708711 ] Hudson commented on HIVE-4580: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4580 : Change DDLTask to report errors using canonical error messages rather than http status codes (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501053) * /hive/trunk/contrib/src/test/results/clientnegative/serde_regex.q.out * /hive/trunk/contrib/src/test/results/clientnegative/url_hook.q.out * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/ddl.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/AppConfig.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/BadParam.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/BusyException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/CallbackFailedException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/CatchallExceptionMapper.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/HcatDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/HcatException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/JsonBuilder.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Main.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/NotAuthorizedException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/QueueException.java * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskResult.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveException.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/InvalidTableException.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java * /hive/trunk/ql/src/test/results/clientnegative/add_partition_with_whitelist.q.out * /hive/trunk/ql/src/test/results/clientnegative/addpart1.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_partition_nodrop_table.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_partition_with_whitelist.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_table_wrong_regex.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_view_failure4.q.out * /hive/trunk/ql/src/test/results/clientnegative/altern1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive2.q.out * 
/hive/trunk/ql/src/test/results/clientnegative/archive_multi1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi2.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi3.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi4.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi5.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi6.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi7.q.out * /hive/trunk/ql/src/test/results/clientnegative/authorization_fail_1.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename1.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename2.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename4.q.out * /hive/trunk/ql/src/test/results/clientnegative/create_table_failure3.q.out * /hive/trunk/ql/src/test/results/clientnegative/create_table_failure4.q.out *
[jira] [Commented] (HIVE-3691) TestDynamicSerDe failed with IBM JDK
[ https://issues.apache.org/jira/browse/HIVE-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708730#comment-13708730 ] Hudson commented on HIVE-3691: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3691 : TestDynamicSerDe failed with IBM JDK (Bing Li and Renata Ghisloti via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501687) * /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/dynamic_type/TestDynamicSerDe.java TestDynamicSerDe failed with IBM JDK Key: HIVE-3691 URL: https://issues.apache.org/jira/browse/HIVE-3691 Project: Hive Issue Type: Bug Affects Versions: 0.7.1, 0.8.0, 0.9.0 Environment: ant-1.8.2, IBM JDK 1.6 Reporter: Bing Li Assignee: Bing Li Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3691.1.patch-trunk.txt, HIVE-3691.1.patch.txt The order of the output in the golden file differs between JDKs. The root cause is the HashMap implementation in the JDK.
[jira] [Commented] (HIVE-4290) Build profiles: Partial builds for quicker dev
[ https://issues.apache.org/jira/browse/HIVE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708728#comment-13708728 ] Hudson commented on HIVE-4290: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4290 : Build profiles: Partial builds for quicker dev (Gunther Hagleitner via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1502760) * /hive/trunk/build.properties * /hive/trunk/build.xml * /hive/trunk/ql/build.xml * /hive/trunk/ql/ivy.xml Build profiles: Partial builds for quicker dev -- Key: HIVE-4290 URL: https://issues.apache.org/jira/browse/HIVE-4290 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4290.2.patch, HIVE-4290.D11481.1.patch, HIVE-4290.patch Building is definitely taking longer with hcat, hs2 etc in the build. When you're working on one area of the system though, it would be easier to have an option to only build that. Not for pre-commit or build machines, but for dev this should help.
ant clean package -- build OR
ant -Dbuild.profile=full clean package test -- build everything
ant -Dbuild.profile=core clean package test -- build just enough to run the tests in ql
ant -Dbuild.profile=hcat clean package test -- build only hcatalog
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4830) Test clientnegative/nested_complex_neg.q got broken due to 4580
[ https://issues.apache.org/jira/browse/HIVE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708714#comment-13708714 ] Hudson commented on HIVE-4830: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4830 : Test clientnegative/nested_complex_neg.q got broken due to 4580 (Vikram Dixit via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501616) * /hive/trunk/ql/src/test/results/clientnegative/nested_complex_neg.q.out Test clientnegative/nested_complex_neg.q got broken due to 4580 --- Key: HIVE-4830 URL: https://issues.apache.org/jira/browse/HIVE-4830 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Fix For: 0.12.0 Attachments: HIVE-4830.patch Both HIVE-3253 and HIVE-4580 were racing to modify .q.out files for this test. Eventually, one patch lost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3475) INLINE UDTF doesn't convert types properly
[ https://issues.apache.org/jira/browse/HIVE-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708726#comment-13708726 ] Hudson commented on HIVE-3475: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3475 INLINE UDTF does not convert types properly (Igor Kabiljo and Navis Ryu via egc) Submitted by: Navis Ryu and Igor Kabiljo Reviewed by: Edward Capriolo (ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500531) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFInline.java * /hive/trunk/ql/src/test/queries/clientpositive/udf_inline.q * /hive/trunk/ql/src/test/results/clientpositive/udf_inline.q.out INLINE UDTF doesn't convert types properly -- Key: HIVE-3475 URL: https://issues.apache.org/jira/browse/HIVE-3475 Project: Hive Issue Type: Bug Components: UDF Reporter: Igor Kabiljo Assignee: Navis Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3475.D7461.1.patch I suppose the issue is in the line: this.forwardObj[i] = res.convertIfNecessary(rowList.get(i), f.getFieldObjectInspector()); there is never a reason for conversion; it should just be: this.forwardObj[i] = rowList.get(i) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to java.lang.Long at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:203) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:169) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:569) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112) at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44) at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81) at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:63) at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
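The root cause in the trace above is a Writable wrapper being cast to a boxed primitive. The failure class is easy to reproduce without Hive or Hadoop on the classpath — here AtomicLong stands in for LongWritable purely for illustration:

```java
public class CastDemo {
    public static void main(String[] args) {
        // AtomicLong holds a long but is not a java.lang.Long, so the cast
        // fails at runtime, just as LongWritable -> Long does in the trace.
        Object wrapped = new java.util.concurrent.atomic.AtomicLong(7);
        try {
            Long unboxed = (Long) wrapped;
            System.out.println("cast succeeded: " + unboxed);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the bug report");
        }
    }
}
```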
[jira] [Commented] (HIVE-4733) HiveLockObjectData is not compared properly
[ https://issues.apache.org/jira/browse/HIVE-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708725#comment-13708725 ] Hudson commented on HIVE-4733: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4733 : HiveLockObjectData is not compared properly (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500569) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java HiveLockObjectData is not compared properly --- Key: HIVE-4733 URL: https://issues.apache.org/jira/browse/HIVE-4733 Project: Hive Issue Type: Bug Components: Locking Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4733.D11277.1.patch, HIVE-4733.D11277.2.patch, HIVE-4733.D11277.3.patch {noformat} ret = ret && (clientIp == null) ? target.getClientIp() == null : clientIp.equals(target.getClientIp()); {noformat} seemed intended to be {noformat} ret = ret && (clientIp == null ? target.getClientIp() == null : clientIp.equals(target.getClientIp())); {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
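The precedence pitfall behind HIVE-4733 can be demonstrated standalone (the field names below are illustrative, not the actual HiveLockObjectData members): in Java, && binds tighter than ?:, so without the extra parentheses the whole && expression becomes the ternary's condition and a false `ret` no longer forces the result to false.

```java
public class TernaryPrecedence {
    // Mirrors the original, ungrouped form: parsed as
    // (ret && (clientIp == null)) ? (target == null) : clientIp.equals(target)
    static boolean buggy(boolean ret, String clientIp, String target) {
        return ret && (clientIp == null) ? target == null : clientIp.equals(target);
    }

    // Mirrors the intended form: the ternary is grouped, then AND-ed with ret.
    static boolean fixed(boolean ret, String clientIp, String target) {
        return ret && (clientIp == null ? target == null : clientIp.equals(target));
    }

    public static void main(String[] args) {
        // ret=false should always force false, but the buggy form falls into
        // the ternary's else-branch and compares the strings anyway:
        System.out.println(buggy(false, "10.0.0.1", "10.0.0.1")); // true  (wrong)
        System.out.println(fixed(false, "10.0.0.1", "10.0.0.1")); // false (correct)
    }
}
```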
[jira] [Commented] (HIVE-3810) HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile
[ https://issues.apache.org/jira/browse/HIVE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708729#comment-13708729 ] Hudson commented on HIVE-3810: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3810 : HiveHistory.log need to replace \r with space before writing Entry.value to historyfile (Mark Grover via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500991) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile - Key: HIVE-3810 URL: https://issues.apache.org/jira/browse/HIVE-3810 Project: Hive Issue Type: Bug Components: Logging Reporter: qiangwang Assignee: Mark Grover Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3810.1.patch, HIVE-3810.2.patch, HIVE-3810.3.patch, HIVE-3810.4.patch HiveHistory.log will replace '\n' with space before writing Entry.value to the history file: val = val.replace('\n', ' '); but HiveHistory.parseHiveHistory uses BufferedReader.readLine, which takes '\n', '\r', '\r\n' as line delimiters to parse the history file. If val contains '\r', there is a high possibility that HiveHistory.parseLine will fail, in which case usually RecordTypes.valueOf(recType) will throw 'java.lang.IllegalArgumentException'. HiveHistory.log needs to replace '\r' with space as well: val = val.replace('\n', ' '); changed to val = val.replaceAll("\r|\n", " "); or val = val.replace('\r', ' ').replace('\n', ' '); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
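A standalone sketch of the sanitization proposed above (not the actual HiveHistory code; the method name `sanitize` is made up for illustration) — both line terminators must be stripped before a value is written to the one-record-per-line history file:

```java
public class HistorySanitize {
    static String sanitize(String val) {
        // replaceAll takes a regex, so "\r|\n" matches either terminator;
        // val.replace('\r', ' ').replace('\n', ' ') is an equivalent
        // non-regex alternative, as noted in the report.
        return val.replaceAll("\r|\n", " ");
    }

    public static void main(String[] args) {
        // Each terminator becomes one space, so '\r\n' yields two spaces.
        System.out.println(sanitize("QUERY_STRING=select\r\n1"));
    }
}
```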
[jira] [Commented] (HIVE-4833) Fix eclipse template classpath to include the correct jdo lib
[ https://issues.apache.org/jira/browse/HIVE-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708727#comment-13708727 ] Hudson commented on HIVE-4833: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4833 : Fix eclipse template classpath to include the correct jdo lib (Yin Huai via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501618) * /hive/trunk/eclipse-templates/.classpath * /hive/trunk/eclipse-templates/.classpath._hbase Fix eclipse template classpath to include the correct jdo lib - Key: HIVE-4833 URL: https://issues.apache.org/jira/browse/HIVE-4833 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4833.patch.txt HIVE-4089 upgraded jdo to 3.0.1, but .classpath and classpath._hbase in the eclipse template have not been changed accordingly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4802) Fix url check for missing / or /db after hostname in jdb uri
[ https://issues.apache.org/jira/browse/HIVE-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708721#comment-13708721 ] Hudson commented on HIVE-4802: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4802 : Fix url check for missing / or /db after hostname in jdb uri (Thejas Nair via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500781) * /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/Utils.java * /hive/trunk/jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java Fix url check for missing / or /db after hostname in jdb uri - Key: HIVE-4802 URL: https://issues.apache.org/jira/browse/HIVE-4802 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4802.1.patch HIVE-4406 added a check for jdbc uri to prevent unintentional use of embedded mode. But that does not correctly check for uri like jdbc:hive2://localhost:1;principal=hive/hiveserver2h...@your-realm.com that can also result in embedded mode being used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4813) Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr
[ https://issues.apache.org/jira/browse/HIVE-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708718#comment-13708718 ] Hudson commented on HIVE-4813: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4813 : Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501099) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultRuleDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcCtx.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java * /hive/trunk/ql/src/test/queries/clientpositive/pcr.q * /hive/trunk/ql/src/test/results/clientpositive/pcr.q.out Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr Key: HIVE-4813 URL: https://issues.apache.org/jira/browse/HIVE-4813 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4813-trunk--N1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4658) Make KW_OUTER optional in outer joins
[ https://issues.apache.org/jira/browse/HIVE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708716#comment-13708716 ] Hudson commented on HIVE-4658: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4658 : Make KW_OUTER optional in outer joins (Edward Capriolo via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1502758) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g * /hive/trunk/ql/src/test/queries/clientpositive/optional_outer.q * /hive/trunk/ql/src/test/results/clientpositive/optional_outer.q.out Make KW_OUTER optional in outer joins - Key: HIVE-4658 URL: https://issues.apache.org/jira/browse/HIVE-4658 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Edward Capriolo Priority: Trivial Fix For: 0.12.0 Attachments: hive-4658.2.patch.txt, HIVE-4658.D11091.1.patch For really trivial migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4811) (Slightly) break up the SemanticAnalyzer monstrosity
[ https://issues.apache.org/jira/browse/HIVE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708720#comment-13708720 ] Hudson commented on HIVE-4811: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4811 : (Slightly) break up the SemanticAnalyzer monstrosity (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500375) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java (Slightly) break up the SemanticAnalyzer monstrosity Key: HIVE-4811 URL: https://issues.apache.org/jira/browse/HIVE-4811 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4811.1.patch 11000 lines and counting. Separating genMRTasks into its own unit will only make a small dent, but will definitely help in maintaining this beast. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4819) Comments in CommonJoinOperator for aliasTag is not valid
[ https://issues.apache.org/jira/browse/HIVE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708713#comment-13708713 ] Hudson commented on HIVE-4819: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4819 : Comments in CommonJoinOperator for aliasTag is not valid (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501129) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java Comments in CommonJoinOperator for aliasTag is not valid Key: HIVE-4819 URL: https://issues.apache.org/jira/browse/HIVE-4819 Project: Hive Issue Type: Task Components: Documentation Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4819.D11619.1.patch I've written that, but it does not make sense even to me. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4840) Fix eclipse template classpath to include the BoneCP lib
[ https://issues.apache.org/jira/browse/HIVE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708715#comment-13708715 ] Hudson commented on HIVE-4840: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4840 : Fix eclipse template classpath to include the BoneCP lib (Yin Huai via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1502678) * /hive/trunk/eclipse-templates/.classpath * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/Util.pm Fix eclipse template classpath to include the BoneCP lib Key: HIVE-4840 URL: https://issues.apache.org/jira/browse/HIVE-4840 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4840.patch.txt HIVE-4807 did not change the classpath in eclipse template accordingly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4812) Logical explain plan
[ https://issues.apache.org/jira/browse/HIVE-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708717#comment-13708717 ] Hudson commented on HIVE-4812: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4812 : Logical explain plan (Gunther Hagleitner via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501036) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java * /hive/trunk/ql/src/test/queries/clientpositive/explain_logical.q * /hive/trunk/ql/src/test/results/clientpositive/explain_logical.q.out Logical explain plan Key: HIVE-4812 URL: https://issues.apache.org/jira/browse/HIVE-4812 Project: Hive Issue Type: Bug Components: Diagnosability Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4812.1.patch, HIVE-4812.2.patch, HIVE-4812.3.patch In various situations it would have been useful to me to glance at the operator plan before we break it into tasks and apply join, total order sort, etc optimizations. I've added this as an option to explain. Explain logical QUERY will output the full operator tree (not the stage plans, tasks, AST etc). Again, I don't think this has to even be documented for users, but might be useful to developers. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4829) TestWebHCatE2e checkstyle violation causes all tests to fail
[ https://issues.apache.org/jira/browse/HIVE-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708724#comment-13708724 ] Hudson commented on HIVE-4829: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4829 : TestWebHCatE2e checkstyle violation causes all tests to fail (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501463) * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java TestWebHCatE2e checkstyle violation causes all tests to fail Key: HIVE-4829 URL: https://issues.apache.org/jira/browse/HIVE-4829 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Brock Noland Assignee: Eugene Koifman Priority: Critical Fix For: 0.12.0 Attachments: HIVE-4829.patch The following error caused all tests to fail and thus filled up the ptest systems drives. {noformat} [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 419 files [checkstyle] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java:31:8: Unused import - org.junit.BeforeClass. [for] hcatalog: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/build.xml:310: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/build.xml:123: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/build-support/ant/checkstyle.xml:32: Got 1 errors and 0 warnings. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708740#comment-13708740 ] Brock Noland commented on HIVE-2436: Good question...the site logs are available on people.apache.org at /x1/logarchive/aurora-2012/www/2013/ but other than that it sounds like there are [no pre-created statistics|http://www.apache.org/dev/project-site.html]. I think from a site perspective, we can post patches which touch the source, and then a committer can build and check in the built changes as well as the source changes. Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708758#comment-13708758 ] Edward Capriolo commented on HIVE-4518: --- I will check this out later tonight. Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... Generate a huge amount of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (up to 50%) you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore, each operator can simply create and increment counters. That should simplify the code a lot. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-4518: -- Issue Type: Improvement (was: Bug) Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... Generate a huge amount of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (up to 50%) you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore, each operator can simply create and increment counters. That should simplify the code a lot. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-2436: --- Attachment: HIVE-2436.patch Hi, Attached is a patch for this issue. Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-2436: --- Status: Patch Available (was: Open) Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2438) add trademark attributions to Hive homepage
[ https://issues.apache.org/jira/browse/HIVE-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708833#comment-13708833 ] Brock Noland commented on HIVE-2438: [~cwsteinbach] to complete this one we will have to change the skin which is actually externally referenced from the Hadoop SVN tree: {noformat} [brock@bigboy site]$ svn info author/src/documentation/skins Path: author/src/documentation/skins URL: http://svn.apache.org/repos/asf/hadoop/common/site/main/author/src/documentation/skins Repository Root: http://svn.apache.org/repos/asf ... {noformat} therefore I think we should remove this external dependency and copy the skin into our SVN tree. Then we can update the footer without conflicting with Hadoop. Thoughts? add trademark attributions to Hive homepage --- Key: HIVE-2438 URL: https://issues.apache.org/jira/browse/HIVE-2438 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Carl Steinbach http://www.apache.org/foundation/marks/pmcs.html#attributions -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708842#comment-13708842 ] Brock Noland commented on HIVE-2436: Along with this change we should {noformat} $ cd publish/images
$ curl -O https://issues.apache.org/jira/secure/attachment/12497381/hive_logo_medium.jpg {noformat} Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4815) Create cloud hosting option for ptest2
[ https://issues.apache.org/jira/browse/HIVE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-4815. Resolution: Duplicate Fix Version/s: (was: 0.12.0) Merging with HIVE-4675. Create cloud hosting option for ptest2 -- Key: HIVE-4815 URL: https://issues.apache.org/jira/browse/HIVE-4815 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch HIVE-4675 creates a parallel testing environment. To support HIVE-4739 we should allow this environment to run in a cloud environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708848#comment-13708848 ] Brock Noland commented on HIVE-4675: Other than Ashutosh, I haven't heard of any interest in reviewing this change, so I am assuming no one else has started reviewing it. I'll go ahead and consolidate the patches as previously discussed. Create new parallel unit test environment - Key: HIVE-4675 URL: https://issues.apache.org/jira/browse/HIVE-4675 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4675.patch The current ptest tool is great, but it has the following limitations: -Requires an NFS filer -Unless the NFS filer is dedicated, ptests can become IO bound easily -Investigating failures is troublesome because the source directory for the failure is not saved -Ignoring or isolating tests is not supported -No unit tests for the ptest framework exist It'd be great to have a ptest tool that addresses these limitations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4675: --- Attachment: HIVE-4675.patch Create new parallel unit test environment - Key: HIVE-4675 URL: https://issues.apache.org/jira/browse/HIVE-4675 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4675.patch, HIVE-4675.patch The current ptest tool is great, but it has the following limitations: -Requires an NFS filer -Unless the NFS filer is dedicated, ptests can become IO bound easily -Investigating failures is troublesome because the source directory for the failure is not saved -Ignoring or isolating tests is not supported -No unit tests for the ptest framework exist It'd be great to have a ptest tool that addresses these limitations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 11770: HIVE-4113: Optimize select count(1) with RCFile and Orc
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11770/ --- (Updated July 15, 2013, 7:47 p.m.) Review request for hive. Changes --- Rebased patch, no real changes. Bugs: HIVE-4113 https://issues.apache.org/jira/browse/HIVE-4113 Repository: hive-git Description --- Modifies ColumnProjectionUtils such that there are two flags: one for the column ids and one indicating whether all columns should be read. Additionally, the patch updates all locations which used the old method of an empty string indicating that all columns should be read. The automatic formatter generated by ant eclipse-files is fairly aggressive, so there is some unrelated import/whitespace cleanup. Diffs (updated) - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java da85501 hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java bc0e04c hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java ac3753f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java 02ec37f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java 4167afa hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java b5f22af hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java dd2ac10 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java e907c73 ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java f72ecfb ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 49145b7 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923 ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403 ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java cbdc2db ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 400abf3 ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java fb9fca1 ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java ae6a5ee ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 785f0b1 serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 23180cf serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 11f5f07 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java 1335446 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java e1270cc serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java b717278 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java 0317024 serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699 serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java 99420ca Diff: https://reviews.apache.org/r/11770/diff/ Testing --- All unit tests pass with the patch. ColumnProjectionUtils has new unit tests covering its functionality. Additionally, I verified manually that select count(1) from RCFile/Orc results in less IO after the change. Before: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS After: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS Thanks, Brock Noland
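The two-flag scheme described in the review request above can be sketched roughly as follows. This is an illustrative sketch only: the property names and helper methods are hypothetical stand-ins for Hive's actual ColumnProjectionUtils API, and a plain java.util.Properties object stands in for the Hadoop Configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Sketch of the two-flag column projection scheme: an explicit
// read-all-columns flag plus a separate list of column ids, so that
// "no ids selected" (e.g. select count(1)) is no longer conflated
// with "read everything" (select *).
public class ColumnProjectionSketch {
    // Hypothetical property names; the real Hive keys may differ.
    static final String READ_ALL_COLUMNS = "read.all.columns";
    static final String READ_COLUMN_IDS = "read.column.ids";

    static void setReadAllColumns(Properties conf) {
        conf.setProperty(READ_ALL_COLUMNS, "true");
    }

    static void appendReadColumns(Properties conf, List<Integer> ids) {
        StringBuilder sb = new StringBuilder(conf.getProperty(READ_COLUMN_IDS, ""));
        for (int id : ids) {
            if (sb.length() > 0) sb.append(',');
            sb.append(id);
        }
        conf.setProperty(READ_COLUMN_IDS, sb.toString());
        // Appending explicit ids turns off the read-all flag, even if the
        // id list is empty -- that is the count(1) case.
        conf.setProperty(READ_ALL_COLUMNS, "false");
    }

    static boolean isReadAllColumns(Properties conf) {
        // Default to read-all when nothing was configured.
        return Boolean.parseBoolean(conf.getProperty(READ_ALL_COLUMNS, "true"));
    }

    static List<Integer> getReadColumnIds(Properties conf) {
        String v = conf.getProperty(READ_COLUMN_IDS, "");
        List<Integer> ids = new ArrayList<>();
        for (String s : v.split(",")) {
            if (!s.isEmpty()) ids.add(Integer.parseInt(s));
        }
        return ids;
    }

    public static void main(String[] args) {
        Properties countStar = new Properties();            // select count(1): zero columns
        appendReadColumns(countStar, Collections.emptyList());
        System.out.println(isReadAllColumns(countStar));    // false -> skip every column

        Properties selectStar = new Properties();           // select *: all columns
        setReadAllColumns(selectStar);
        System.out.println(isReadAllColumns(selectStar));   // true
    }
}
```

Under the old single-value convention an empty id string meant "read all", so the two cases in main() were indistinguishable; the second flag is what lets a reader skip every column for count(1).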
[jira] [Updated] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4113: --- Attachment: HIVE-4113.patch Rebased patch, no real changes. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch select count(1) loads up every column of every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Whereas select count(ss_sold_date_sk) from store_sales_10_rc; reads far less: {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} That is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java: {code} } else { // TODO: if no column name is specified e.g., in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i < skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 11770: HIVE-4113: Optimize select count(1) with RCFile and Orc
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11770/ --- (Updated July 15, 2013, 7:51 p.m.) Review request for hive. Changes --- A test was missed; it is included now. Bugs: HIVE-4113 https://issues.apache.org/jira/browse/HIVE-4113 Repository: hive-git Description --- Modifies ColumnProjectionUtils such that there are two flags: one for the column ids and one indicating whether all columns should be read. Additionally, the patch updates all locations which used the old method of an empty string indicating that all columns should be read. The automatic formatter generated by ant eclipse-files is fairly aggressive, so there is some unrelated import/whitespace cleanup. Diffs (updated) - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java da85501 hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java bc0e04c hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java ac3753f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java 02ec37f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java 4167afa hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java b5f22af hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java dd2ac10 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java e907c73 ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java f72ecfb ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 49145b7 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923 ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403 ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java cbdc2db ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 400abf3 ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java fb9fca1 ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java ae6a5ee ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 785f0b1 serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 23180cf serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 11f5f07 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java 1335446 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java e1270cc serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java b717278 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java 0317024 serde/src/test/org/apache/hadoop/hive/serde2/TestColumnProjectionUtils.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699 serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java 99420ca Diff: https://reviews.apache.org/r/11770/diff/ Testing --- All unit tests pass with the patch. ColumnProjectionUtils has new unit tests covering its functionality. Additionally, I verified manually that select count(1) from RCFile/Orc results in less IO after the change. Before: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS After: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS Thanks, Brock Noland
[jira] [Updated] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4113: --- Attachment: HIVE-4113.patch A test was missed; I have included it here. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch, HIVE-4113.patch select count(1) loads up every column of every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Whereas select count(ss_sold_date_sk) from store_sales_10_rc; reads far less: {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} That is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java: {code} } else { // TODO: if no column name is specified e.g., in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i < skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4858) Sort show grant result to improve usability and testability
[ https://issues.apache.org/jira/browse/HIVE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4858: -- Attachment: HIVE-4858.patch Sort show grant result to improve usability and testability - Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4858.patch Currently Hive outputs the result of the show grant command in a non-deterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Randomness can arise and tests (depending on the order) can fail, especially in the event of a library upgrade (DN or JVM upgrade). Sorting the result will avoid the potential randomness and make the output more deterministic, thus not only improving the readability of the output but also making the tests more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
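The fix described above amounts to sorting the privilege rows before printing, so the output no longer depends on the DB's iteration order. A minimal sketch of the idea follows; GrantRow and its fields are hypothetical stand-ins, not Hive's actual data model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: sort grant rows by (principal, privilege) before printing so
// "show grant" output is deterministic regardless of what order
// DataNucleus returns the rows in.
public class SortedShowGrant {
    static class GrantRow {
        final String principal;
        final String privilege;
        GrantRow(String principal, String privilege) {
            this.principal = principal;
            this.privilege = privilege;
        }
    }

    // Returns a copy sorted on a stable, total key; the original list
    // (whatever order the DB produced) is left untouched.
    static List<GrantRow> sorted(List<GrantRow> rows) {
        List<GrantRow> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparing((GrantRow r) -> r.principal)
                            .thenComparing(r -> r.privilege));
        return copy;
    }

    public static void main(String[] args) {
        List<GrantRow> rows = Arrays.asList(
            new GrantRow("bob", "SELECT"),
            new GrantRow("alice", "UPDATE"),
            new GrantRow("alice", "SELECT"));
        for (GrantRow r : sorted(rows)) {
            System.out.println(r.principal + "\t" + r.privilege);
        }
    }
}
```

Because the sort key is total over the printed fields, any two runs (or any two JVM/DataNucleus versions) produce byte-identical output, which is what makes the .q.out test baselines stable.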
[jira] [Commented] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708929#comment-13708929 ] Hudson commented on HIVE-4721: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4721 : Fix TestCliDriver.ptf_npath.q on 0.23 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503297) * /hive/trunk/ql/src/test/queries/clientpositive/ptf_npath.q * /hive/trunk/ql/src/test/results/clientpositive/ptf_npath.q.out Fix TestCliDriver.ptf_npath.q on 0.23 - Key: HIVE-4721 URL: https://issues.apache.org/jira/browse/HIVE-4721 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4721.1.patch In HIVE-4717 I tried changing the last line of ptf_npath.q from: {noformat} where fl_num = 1142; {noformat} to: {noformat} where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath; {noformat} in order to make the test deterministic. However, this results not just in a different order, but in different results for 0.23 and 0.20S. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708928#comment-13708928 ] Hudson commented on HIVE-4854: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4854 : testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503304) * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 - Key: HIVE-4854 URL: https://issues.apache.org/jira/browse/HIVE-4854 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4854.1.patch The problem is with the mkdir command: it tries to create multiple directories at once without the right flag, which works only on the Hadoop 1 line. A simple fix to the test should do the trick. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
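The "right flag" mentioned above is presumably -p: Hadoop 2's fs shell, like POSIX mkdir, does not create missing parent directories implicitly. The local-filesystem sketch below demonstrates the behavior; the hadoop fs line in the comment is an assumed analogue for the .q test, with a made-up path.

```shell
# Without -p, creating nested directories in a single call fails when the
# parent does not exist; with -p, intermediate directories are created.
rm -rf /tmp/demo_a
mkdir /tmp/demo_a/demo_b 2>/dev/null || echo "fails without -p"
mkdir -p /tmp/demo_a/demo_b && echo "ok with -p"

# The analogous change for the test would be something like (hypothetical path):
#   hadoop fs -mkdir -p '/tmp/test dir/nested'
```

Running the block prints "fails without -p" followed by "ok with -p".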
[jira] [Commented] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708927#comment-13708927 ] Hudson commented on HIVE-4853: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4853 : junit timeout needs to be updated (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503301) * /hive/trunk/build.properties junit timeout needs to be updated - Key: HIVE-4853 URL: https://issues.apache.org/jira/browse/HIVE-4853 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4853.1.patch All the ptf, join etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to a timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708926#comment-13708926 ] Hudson commented on HIVE-4852: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4852 : -Dbuild.profile=core fails (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1503309) * /hive/trunk/build.xml -Dbuild.profile=core fails -- Key: HIVE-4852 URL: https://issues.apache.org/jira/browse/HIVE-4852 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4852.1.patch Core profile fails because of an added chmod to some hcat files. Simple fix: Check if modules contains hcat before running the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708948#comment-13708948 ] Brock Noland commented on HIVE-556: --- I'd like to break this up into sub-tasks, but before I do I'd like to solicit feedback... let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Brock Noland Right now, Hive only supports equi-joins. Sometimes that's not enough; we should consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} Any condition expression following 'ON' would be appropriate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
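Only equality predicates can be reduced to hash or sort-merge joins; an arbitrary ON predicate like the rlike example above effectively means evaluating the condition over the cross product. A rough sketch of that semantics follows, with plain java.util.regex standing in for Hive's rlike operator and the lists standing in for the two tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of theta-join semantics: emit a joined row whenever an arbitrary
// predicate holds, here "t.url rlike a.url_pattern". A real engine would
// try to avoid materializing the full cross product where possible.
public class ThetaJoinSketch {
    static List<String> thetaJoin(List<String> urls, List<String> patterns) {
        List<String> joined = new ArrayList<>();
        for (String url : urls) {                    // cross product ...
            for (String pat : patterns) {
                if (Pattern.matches(pat, url)) {     // ... filtered by the theta predicate
                    joined.add(url + " ~ " + pat);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // urls plays the role of tbl.url, patterns of aux_tbl.url_pattern.
        List<String> urls = Arrays.asList(
            "http://example.com/news/1",
            "http://example.com/sports/2");
        List<String> patterns = Arrays.asList(".*news.*", ".*video.*");
        System.out.println(thetaJoin(urls, patterns));
        // prints [http://example.com/news/1 ~ .*news.*]
    }
}
```

The nested loop is why a general theta join is so much more expensive than an equi-join, and why a design discussion (what to support, and with what execution strategy) is worth having before splitting the work into sub-tasks.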
[jira] [Created] (HIVE-4859) String column comparison classes should be renamed.
Jitendra Nath Pandey created HIVE-4859: -- Summary: String column comparison classes should be renamed. Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Skew Joins borked on Hive11 (Hadoop23)?
Hello, all. Has anyone noticed that skew-joins aren't working on Hive 0.11 / Hadoop 0.23? I've been running the TPC-H benchmarks against Hive 0.11, and I see that none of the queries run through if hive.optimize.skewjoin is set to true. I initially ran into problems like the following: <quote> Ended Job = job_1371646843240_1214 java.io.FileNotFoundException: File hdfs://fstaxxx.yyy.yahoo.com/tmp/hive_2013-07-12_03-22-31_737_6843191588894968654/-mr-10004/hive_skew_join_bigkeys_0 does not exist. </quote> Patching Hive 0.11 with HIVE-4646 resolved that problem. What I see now is that a couple of stages of the query run through successfully, after which I get the following message, and the remaining stages are skipped. <quote> 2013-07-12 23:21:02,164 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 15985.47 sec MapReduce Total cumulative CPU time: 0 days 4 hours 26 minutes 25 seconds 470 msec Ended Job = job_1371646843240_1295 Stage-10 is filtered out by condition resolver. MapReduce Jobs Launched: Job 0: Map: 380 Reduce: 118 Cumulative CPU: 15900.35 sec HDFS Read: 24574270287 HDFS Write: 4925478398 SUCCESS Total MapReduce CPU Time Spent: 0 days 4 hours 25 minutes 0 seconds 350 msec OK Time taken: 109.411 seconds FAILED: SemanticException [Error 10001]: Line 10:5 Table not found 'q16_tmp_cached' </quote> In this particular case, the query is q16_parts_supplier_relationship.hive, part of which looks like: <quote> create table q16_tmp_cached as select p_brand, p_type, p_size, ps_suppkey from partsupp ps join part p on p.p_partkey = ps.ps_partkey and p.p_brand <> 'Brand#45' and not p.p_type like 'MEDIUM POLISHED%' join supplier_tmp_cached s on ps.ps_suppkey = s.s_suppkey; </quote> If I can isolate the problem to a smaller test-case, I'll raise a JIRA. I was hoping one of you might have seen this already, or might have a better handle on how skew-joins work in Hive 11. Many thanks, Mithun
[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4859: --- Attachment: HIVE-4859.1.patch String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4859: --- Status: Patch Available (was: Open) String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708996#comment-13708996 ] Jitendra Nath Pandey commented on HIVE-4859: Patch uploaded. https://reviews.apache.org/r/12560/ String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-1511: --- Attachment: HIVE-1511.patch Plan serialization and deserialization is still way too slow because of java serialization. I played with the Kryo library ( http://code.google.com/p/kryo/ ), which is super-fast for java object graph serialization, and the initial results look promising. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Attachments: HIVE-1511.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
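For a sense of scale, standard java.io serialization writes full class metadata and per-object handles into the stream, which is much of the overhead a compact format like Kryo avoids. The self-contained JDK-only sketch below builds a predicate tree shaped like the repeated OR test case above and measures its serialized size; the Node class is a made-up stand-in for Hive's plan operators, not Hive's actual classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Illustration of why plan serialization cost matters: java.io
// serialization is verbose and slow for large object graphs, such as
// a query plan containing a predicate with 100+ repeated OR terms.
public class SerializationCost {
    static class Node implements Serializable {
        String expr;
        List<Node> children = new ArrayList<>();
        Node(String expr) { this.expr = expr; }
    }

    static byte[] javaSerialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build a tree like "key=0 OR key=0 OR ..." with 100 identical terms.
        Node root = new Node("OR");
        for (int i = 0; i < 100; i++) {
            root.children.add(new Node("key=0"));
        }
        byte[] bytes = javaSerialize(root);
        // Even identical children each cost bytes on the wire; Kryo's
        // registration-based format is far more compact and faster.
        System.out.println("serialized size: " + bytes.length + " bytes");
    }
}
```

The same experiment against Kryo would use its Kryo/Output API instead of ObjectOutputStream; the point here is only to show that the baseline cost grows with every node in the plan graph.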
Review Request 12562: HIVE-4858: Sort show grant result to improve usability and testability
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/ --- Review request for hive. Bugs: HIVE-4858 https://issues.apache.org/jira/browse/HIVE-4858 Repository: hive-git Description --- Patch includes code changes and query output re-generation for a couple of test cases. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 9883659 ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out 4262b7c ql/src/test/results/clientpositive/authorization_2.q.out c934a2a ql/src/test/results/clientpositive/authorization_6.q.out b8483ca Diff: https://reviews.apache.org/r/12562/diff/ Testing --- Performed the authorization test cases. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4858) Sort show grant result to improve usability and testability
[ https://issues.apache.org/jira/browse/HIVE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709069#comment-13709069 ] Xuefu Zhang commented on HIVE-4858: --- Review board: https://reviews.apache.org/r/12562/ Sort show grant result to improve usability and testability - Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4858.patch Currently Hive outputs the result of the show grant command in a non-deterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Randomness can arise and tests (depending on the order) can fail, especially in the event of a library upgrade (DN or JVM upgrade). Sorting the result will avoid the potential randomness and make the output more deterministic, thus not only improving the readability of the output but also making the tests more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12562: HIVE-4858: Sort show grant result to improve usability and testability
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/#review23181 --- Looks good! I do think we should align the style with the predominant hive style though. ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java https://reviews.apache.org/r/12562/#comment47041 I think the spaces on the left and right hand side are style differences. Most of the code in hive I've seen does not have the additional spaces. Also the code nearby does not have those spaces. ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java https://reviews.apache.org/r/12562/#comment47039 trailing whitespace - Brock Noland On July 15, 2013, 10:28 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/ --- (Updated July 15, 2013, 10:28 p.m.) Review request for hive. Bugs: HIVE-4858 https://issues.apache.org/jira/browse/HIVE-4858 Repository: hive-git Description --- Patch includes code changes and query output re-generation for a couple of test cases. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 9883659 ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out 4262b7c ql/src/test/results/clientpositive/authorization_2.q.out c934a2a ql/src/test/results/clientpositive/authorization_6.q.out b8483ca Diff: https://reviews.apache.org/r/12562/diff/ Testing --- Performed the authorization test cases. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709119#comment-13709119 ] Hudson commented on HIVE-4721: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4721 : Fix TestCliDriver.ptf_npath.q on 0.23 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503297) * /hive/trunk/ql/src/test/queries/clientpositive/ptf_npath.q * /hive/trunk/ql/src/test/results/clientpositive/ptf_npath.q.out Fix TestCliDriver.ptf_npath.q on 0.23 - Key: HIVE-4721 URL: https://issues.apache.org/jira/browse/HIVE-4721 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4721.1.patch In HIVE-4717 I tried changing the last line of ptf_npath.q from: {noformat} where fl_num = 1142; {noformat} to: {noformat} where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath; {noformat} in order to make the test deterministic. However, this results not just in a different order, but in different results for 0.23 and 0.20S. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709117#comment-13709117 ] Hudson commented on HIVE-4853: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4853 : junit timeout needs to be updated (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503301) * /hive/trunk/build.properties junit timeout needs to be updated - Key: HIVE-4853 URL: https://issues.apache.org/jira/browse/HIVE-4853 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4853.1.patch All the ptf, join etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to the timeout.
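The change itself is a one-line bump in build.properties. The property name and value below are illustrative assumptions; check the committed HIVE-4853.1.patch for the exact setting:

```properties
# Timeout for a single junit batch, in milliseconds.
# Name and value are illustrative; see HIVE-4853.1.patch for the real change.
test.junit.timeout=43200000
```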
[jira] [Commented] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709118#comment-13709118 ] Hudson commented on HIVE-4854: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4854 : testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503304) * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 - Key: HIVE-4854 URL: https://issues.apache.org/jira/browse/HIVE-4854 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4854.1.patch The problem is with the mkdir command. It tries to create multiple directories at once without the right flag, which works only on hadoop 1. A simple fix to the test should do the trick.
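The hadoop 2 behavior described above: hadoop 1's fs -mkdir created missing parent directories implicitly, while hadoop 2 requires the -p flag (the paths below are illustrative). Plain POSIX mkdir has the same semantics, shown runnable here:

```shell
# On hadoop 2 the parent directories must be requested explicitly:
#   hadoop fs -mkdir '/tmp/dir with space/child'      # fails if parents are missing
#   hadoop fs -mkdir -p '/tmp/dir with space/child'   # creates parents on both 1 and 2
# POSIX mkdir draws the same distinction with its own -p flag:
mkdir -p "demo dir/nested/child"    # -p creates the intermediate directories
test -d "demo dir/nested/child" && echo "created"   # prints "created"
```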
[jira] [Commented] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709116#comment-13709116 ] Hudson commented on HIVE-4852: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4852 : -Dbuild.profile=core fails (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503309) * /hive/trunk/build.xml -Dbuild.profile=core fails -- Key: HIVE-4852 URL: https://issues.apache.org/jira/browse/HIVE-4852 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4852.1.patch The core profile fails because of an added chmod on some hcat files. Simple fix: check whether modules contains hcat before running the command.
[jira] [Created] (HIVE-4860) add shortcut to gather column statistics on all columns
Greg Rahn created HIVE-4860: --- Summary: add shortcut to gather column statistics on all columns Key: HIVE-4860 URL: https://issues.apache.org/jira/browse/HIVE-4860 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.11.0 Reporter: Greg Rahn Currently analyze table ... compute statistics for columns requires a discrete list of columns. It would be nice to have a shortcut to gather stats on all columns w/o naming them. Possible options: analyze table ... compute statistics for ALL columns; -- ALL keyword analyze table ... compute statistics for columns; -- empty list defaults to all columns
[jira] [Created] (HIVE-4861) add support for dynamic partitioning when gathering column statistics
Greg Rahn created HIVE-4861: --- Summary: add support for dynamic partitioning when gathering column statistics Key: HIVE-4861 URL: https://issues.apache.org/jira/browse/HIVE-4861 Project: Hive Issue Type: Improvement Reporter: Greg Rahn Fix For: 0.11.1 Currently: hive analyze table fact_table partition(event_date) compute statistics for columns ...; FAILED: SemanticException [Error 30008]: Dynamic partitioning is not supported yet while gathering column statistics through ANALYZE statement Add functionality that removes this restriction and allows gathering column statistics on all partitions of a table using dynamic partitioning.
[jira] [Assigned] (HIVE-2482) Convenience UDFs for binary data type
[ https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan reassigned HIVE-2482: - Assignee: Mark Wagner Convenience UDFs for binary data type - Key: HIVE-2482 URL: https://issues.apache.org/jira/browse/HIVE-2482 Project: Hive Issue Type: New Feature Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Assignee: Mark Wagner HIVE-2380 introduced the binary data type in Hive. It would be good to have the following UDFs to make it more useful: * UDFs to convert to/from hex string * UDFs to convert to/from string using a specific encoding * UDFs to convert to/from base64 string * UDFs to convert to/from non-string types using a particular serde
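As a sketch of what the hex and base64 conversions above involve, here is a plain-Java helper. Class and method names are hypothetical illustrations, not the Hive UDF API; a real UDF would wrap logic like this in Hive's UDF classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper sketching the conversions the proposed UDFs would expose.
public class BinaryConversions {
    // binary -> hex string
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // binary -> base64 string
    public static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // string -> binary using a specific encoding (UTF-8 here)
    public static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, toHex(new byte[]{0x0f, (byte) 0xa0}) yields "0fa0", and toBase64(fromString("hi")) yields "aGk=".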
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
On July 12, 2013, 10:44 p.m., Jakob Homan wrote: Do you have after-optimization performance numbers? Can you add a test to verify that the reencoder cache is working correctly? Feed in a record with one uuid, then another with a different one, and verify that the cache has two elements. Adding a third record with the original UUID shouldn't increase the size of the cache. Also, that adding n records all with the same schema creates only one reencoder... Yes, we have the numbers after optimization. For example, each record used to take nearly 50 microseconds. After this patch, it takes nearly 31 microseconds. Added the test case as proposed. On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java, line 66 https://reviews.apache.org/r/12480/diff/1/?file=320688#file320688line66 verifiedRecordReaders - noReencodingNeeded ? Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java, line 155 https://reviews.apache.org/r/12480/diff/1/?file=320688#file320688line155 readability: pull out getRecordReaderID into its own var Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java, line 78 https://reviews.apache.org/r/12480/diff/1/?file=320689#file320689line78 Need to write out the uuid too Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java, line 92 https://reviews.apache.org/r/12480/diff/1/?file=320689#file320689line92 Need to read in the uuid too Done - Mohammad --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/#review23113 --- On July 11, 2013, 10:31 p.m., Mohammad Islam wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated July 11, 2013, 10:31 p.m.) 
Review request for hive, Ashutosh Chauhan and Jakob Homan. Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. First, we added a unique id for each record reader, which is then included in every AvroGenericRecordWritable. Then we introduced two new data structures (one hashset and one hashmap) to store intermediate data and avoid duplicate checks. The hashset contains the IDs of all record readers that don't need any re-encoding. The hashmap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows nearly a 40% reduction in Avro record reading time. Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java dbc999f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java c85ef15 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated July 15, 2013, 11:48 p.m.) Review request for hive, Ashutosh Chauhan and Jakob Homan. Changes --- Incorporated Jakob's comments. Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. First, we added a unique id for each record reader, which is then included in every AvroGenericRecordWritable. Then we introduced two new data structures (one hashset and one hashmap) to store intermediate data and avoid duplicate checks. The hashset contains the IDs of all record readers that don't need any re-encoding. The hashmap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows nearly a 40% reduction in Avro record reading time. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java dbc999f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java c85ef15 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 79c9646 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
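The two-structure cache described in the review can be sketched as follows. This is a simplified illustration with hypothetical names (the real logic lives in AvroDeserializer); a String stands in for the actual re-encoder object:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Sketch of the caching scheme: a set of record-reader IDs whose records need
// no re-encoding, and a map of already-built re-encoders keyed by reader ID,
// so the expensive schema.equals() check runs at most once per reader.
public class ReEncoderCache {
    private final Set<UUID> noReencodingNeeded = new HashSet<>();
    private final Map<UUID, String> reEncoders = new HashMap<>();

    // Returns null when the record can be used as-is, otherwise a (cached) re-encoder.
    public String lookup(UUID readerId, boolean schemasMatch) {
        if (noReencodingNeeded.contains(readerId)) {
            return null; // verdict already known, no equals() call needed
        }
        if (schemasMatch) {
            noReencodingNeeded.add(readerId); // remember the one-time verdict
            return null;
        }
        // Reuse an existing re-encoder, or build one on first sight of this reader.
        return reEncoders.computeIfAbsent(readerId, id -> "reencoder-" + id);
    }

    public int cacheSize() {
        return reEncoders.size();
    }
}
```

This mirrors the test Jakob asked for: two readers with mismatched schemas produce two cache entries, a repeat of the first reader adds nothing, and a reader whose schemas match never enters the re-encoder map.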
[jira] [Assigned] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog
[ https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-4266: Assignee: Eugene Koifman (was: Alan Gates) Refactor HCatalog code to org.apache.hive.hcatalog -- Key: HIVE-4266 URL: https://issues.apache.org/jira/browse/HIVE-4266 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Eugene Koifman Priority: Blocker Fix For: 0.12.0 Currently the HCatalog code is in the org.apache.hcatalog packages. It now needs to move to org.apache.hive.hcatalog. Shell classes/interfaces need to be created for public-facing classes so that users' code does not break.
[jira] [Assigned] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-4460: Assignee: Eugene Koifman Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Bug Components: HCatalog Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman HCatalog artifacts are only published for Hadoop 1.x versions. As more projects add HCatalog integration, HCatalog artifacts need to be published for the Hadoop versions those projects support, so that automated builds targeting different Hadoop releases can succeed. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds against both Hadoop 1.x and 2.x releases.
Re: Tez branch and tez based patches
On Jul 13, 2013, at 9:48 AM, Edward Capriolo wrote: I have started to see several refactoring patches around tez. https://issues.apache.org/jira/browse/HIVE-4843 This is the only mention on the hive list I can find with tez: Makes sense. I will create the branch soon. Thanks, Ashutosh On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: Hi, I am starting to work on integrating Tez into Hive (see HIVE-4660, design doc has already been uploaded - any feedback will be much appreciated). This will be a fair amount of work that will take time to stabilize/test. I'd like to propose creating a branch in order to be able to do this incrementally and collaboratively. In order to progress rapidly with this, I would also like to go commit-then-review. Thanks, Gunther. These refactorings are largely destructive to a number of bug fixes and language improvements in hive. The language improvements and bug fixes have been sitting in Jira for quite some time now, marked patch-available, waiting for review. There are a few things I want to point out: 1) Normally we create design docs in our wiki (which this is not) 2) Normally when the change is significantly complex we get multiple committers to comment on it (which we did not) On point 2, no one -1'd the branch, but this is really something that should have required a +1 from 3 committers. The Hive bylaws, https://cwiki.apache.org/confluence/display/Hive/Bylaws , lay out what votes are needed for what. I don't see anything there about needing 3 +1s for a branch. Branching would seem to fall under code change, which requires one vote and a minimum length of 1 day. I for one am not completely sold on Tez. http://incubator.apache.org/projects/tez.html. "directed-acyclic-graph of tasks for processing data" - this description sounds like many things which have never become popular. One that comes to mind is Oozie: "Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions." 
I am sure I can find a number of libraries/frameworks that make this same claim. In general I do not feel like we have done our homework and prerequisites to justify all this work. If we have done the homework, it has not been communicated to and accepted by hive developers at large. A request for better documentation on Tez and a project road map seems totally reasonable. If we have a branch, why are we also committing on trunk? Scanning through the tez doc, I keep finding language like "minimal changes to the planner" - yet there are ALREADY lots of large changes going on! Really, none of the above would bother me except for the fact that these "minimal" changes are causing many patch-available, ready-for-review bugs and core hive features to need to be rebased. I am sure I have mentioned this before, but I have to spend 12+ hours to test a single patch on my laptop. A few days ago I was testing a new core hive feature. After all the tests passed and before I was able to commit, someone unleashed a tez patch on trunk which caused the thing I was testing for 12 hours to need to be rebased. I'm not cool with this. Next time that happens to me I will seriously consider reverting the patch. Bug fixes and new hive features are more important to me than integrating with incubator projects. (With my Apache member hat on) Reverting patches that aren't breaking the build is considered very bad form in Apache. It does make sense to request that when people are going to commit a patch that will break many other patches they first give a few hours of notice, so people can say something if they're about to commit another patch and avoid your fate of needing to rerun the tests. The other thing is we need to get the automated build of patches working on Hive so committers aren't forced to run all of the tests themselves. We are working on it, but we're not there yet. Alan.
[jira] [Created] (HIVE-4862) Create automatic Patch Available testing
Brock Noland created HIVE-4862: -- Summary: Create automatic Patch Available testing Key: HIVE-4862 URL: https://issues.apache.org/jira/browse/HIVE-4862 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland It'd be awesome if we used the new ptest2 environment to automatically test patches.
[jira] [Commented] (HIVE-4862) Create automatic Patch Available testing
[ https://issues.apache.org/jira/browse/HIVE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709298#comment-13709298 ] Brock Noland commented on HIVE-4862: I think I can get this working quite soon. Create automatic Patch Available testing -- Key: HIVE-4862 URL: https://issues.apache.org/jira/browse/HIVE-4862 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland It'd be awesome if we used the new ptest2 environment to automatically test patches.
[jira] [Updated] (HIVE-4820) webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure
[ https://issues.apache.org/jira/browse/HIVE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4820: - Status: Open (was: Patch Available) webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure - Key: HIVE-4820 URL: https://issues.apache.org/jira/browse/HIVE-4820 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE4820.patch Currently they are expected to be set by the user, which makes development inconvenient. It makes writing unit tests for WebHCat more difficult as well.
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709312#comment-13709312 ] Edward Capriolo commented on HIVE-1511: --- Maybe protobuf, since we have it in trunk now. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Attachments: HIVE-1511.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM, but I gave up after the test case did not go anywhere for about 2 minutes.