[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4825:
    Attachment: HIVE-4825.2.testfiles.patch
                HIVE-4825.2.code.patch

Separate MapredWork into MapWork and ReduceWork

Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch

Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors, etc. For Tez and MRR it will be useful to break this into map- and reduce-specific pieces. The separation is natural and I think has value in itself: it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez, where you'll have a graph of these instead of just 1 M and 0-1 R.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708256#comment-13708256 ]

Gunther Hagleitner commented on HIVE-4825:

Ran tests on the 1 & 2 line. Came back clean. Split into code + test patches, because there are lots of whitespace-only diffs. I've also updated the review.

Separate MapredWork into MapWork and ReduceWork
Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4825:
    Status: Patch Available (was: Open)

Separate MapredWork into MapWork and ReduceWork
Key: HIVE-4825
URL: https://issues.apache.org/jira/browse/HIVE-4825
[jira] [Commented] (HIVE-3576) Regression: ALTER TABLE DROP IF EXISTS PARTITION throws a SemanticException if Partition is not found
[ https://issues.apache.org/jira/browse/HIVE-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708263#comment-13708263 ]

Kanwaljit Singh commented on HIVE-3576:

Any updates on this issue?

Regression: ALTER TABLE DROP IF EXISTS PARTITION throws a SemanticException if Partition is not found

Key: HIVE-3576
URL: https://issues.apache.org/jira/browse/HIVE-3576
Project: Hive
Issue Type: Bug
Components: Metastore, Query Processor
Affects Versions: 0.9.0
Reporter: Harsh J

Doing a simple {{ALTER TABLE testtable DROP IF EXISTS PARTITION(dt=NONEXISTENTPARTITION)}} fails with a SemanticException of the 10006 kind (INVALID_PARTITION). This does not respect the {{hive.exec.drop.ignorenonexistent}} setting either, since there are no if-check wraps around this area when fetching partitions from the store.
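The IF EXISTS behavior the reporter expects can be sketched as follows (a minimal illustration in Python; `drop_partition` and its arguments are hypothetical, not Hive's actual metastore API):

```python
# Hypothetical sketch of the expected semantics: with IF EXISTS, a
# missing partition is silently ignored instead of raising
# INVALID_PARTITION.
def drop_partition(partitions, spec, if_exists=False):
    if spec not in partitions:
        if if_exists:
            return  # IF EXISTS: nothing to do, no error
        raise ValueError("SemanticException 10006: INVALID_PARTITION")
    partitions.remove(spec)

table = [{"dt": "2013-01-01"}]
drop_partition(table, {"dt": "nonexistent"}, if_exists=True)  # no error
```

The bug is that the guard is missing, so the lookup raises before the IF EXISTS flag (or the ignorenonexistent setting) is ever consulted.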
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708274#comment-13708274 ]

Gunther Hagleitner commented on HIVE-4518:

.5 is rebased to trunk. Running tests.

Counter Strike: Operation Operator

Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch

Queries of the form:

from foo
insert overwrite table bar partition (p) select ...
insert overwrite table bar partition (p) select ...
insert overwrite table bar partition (p) select ...

generate a huge number of counters. The reason is that task.progress is turned on for dynamic-partitioning queries. The counters not only make queries slower than necessary (up to 50%); you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need the CREATED_FILES and FATAL counters to ensure dynamic-partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore; each operator can simply create and increment counters. That should simplify the code a lot.
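The "eventually run out" failure mode can be sketched like this (illustrative Python with hypothetical names, not Hive's actual classes): a fixed, pre-declared pool of enum-style slots such as C1..C1000 is consumed one slot per dynamically created counter, so enough counters exhausts the pool.

```python
# Sketch of an enum-style counter pool: every new counter name
# permanently claims one pre-declared slot, so the pool is finite.
class CounterPool:
    def __init__(self, size=1000):
        self.free = ["C%d" % (i + 1) for i in range(size)]
        self.assigned = {}

    def counter_for(self, name):
        if name not in self.assigned:
            if not self.free:
                raise RuntimeError("out of counter slots")
            self.assigned[name] = self.free.pop(0)
        return self.assigned[name]

pool = CounterPool(size=3)
print(pool.counter_for("CREATED_FILES"))  # C1
```

This also explains the unhelpful names: the slot label (C1, C2, ...) is what surfaces, not the counter's real name.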
[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4518:
    Attachment: HIVE-4518.5.patch

Counter Strike: Operation Operator
Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708275#comment-13708275 ]

Gunther Hagleitner commented on HIVE-4518:

Updated https://reviews.facebook.net/D10665 as well with the latest patch.

Counter Strike: Operation Operator
Key: HIVE-4518
URL: https://issues.apache.org/jira/browse/HIVE-4518
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708276#comment-13708276 ]

Gunther Hagleitner commented on HIVE-4388:

That sounds like the best option. Are the 0.96 artifacts already available?

HBase tests fail against Hadoop 2

Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Brock Noland

Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23), builds fail because of HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with HBASE-6396.
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708504#comment-13708504 ]

Brock Noland commented on HIVE-4388:

We'd be building against 0.95 initially, until 0.96 is released.

HBase tests fail against Hadoop 2
Key: HIVE-4388
URL: https://issues.apache.org/jira/browse/HIVE-4388
[jira] [Updated] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4675:
    Status: Open (was: Patch Available)

Create new parallel unit test environment

Key: HIVE-4675
URL: https://issues.apache.org/jira/browse/HIVE-4675
Project: Hive
Issue Type: Improvement
Components: Testing Infrastructure
Reporter: Brock Noland
Assignee: Brock Noland
Fix For: 0.12.0
Attachments: HIVE-4675.patch

The current ptest tool is great, but it has the following limitations:
- Requires an NFS filer
- Unless the NFS filer is dedicated, ptests can become IO bound easily
- Investigating failures is troublesome because the source directory for the failure is not saved
- Ignoring or isolating tests is not supported
- No unit tests for the ptest framework exist

It'd be great to have a ptest tool that addresses these limitations.
[jira] [Updated] (HIVE-4845) Correctness issue with MapJoins using the null safe operator
[ https://issues.apache.org/jira/browse/HIVE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4845:
    Attachment: HIVE-4845.patch

Updated patch based on review. Cleaned up auto-generated equals/hashCode.

Correctness issue with MapJoins using the null safe operator

Key: HIVE-4845
URL: https://issues.apache.org/jira/browse/HIVE-4845
Project: Hive
Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Critical
Attachments: HIVE-4845.patch, HIVE-4845.patch

I found a correctness issue while working on HIVE-4838. The following query from join_nullsafe.q gives different results depending on whether it's executed map-side or reduce-side:

{noformat}
SELECT /*+ MAPJOIN(a) */ * FROM smb_input1 a JOIN smb_input1 b ON a.key = b.key AND a.value = b.value ORDER BY a.key, a.value, b.key, b.value;
{noformat}

For that query, on the map side, rows which should be joined are not. For example, the reduce side outputs this row:

{noformat}
a.key  a.value  b.key  b.value
148    NULL     148    NULL
{noformat}

which makes sense since a.key is equal to b.key and a.value is equal to b.value, but the current map-side code omits this row. The reason is that MapJoinDoubleKey is used for the map-side join, which doesn't properly compare null values.
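A minimal sketch of null-safe equality (SQL's `<=>` semantics), which is what the map-side key comparison needs to honor: two NULLs compare equal, and NULL against anything else does not. This is illustrative Python, not Hive's MapJoinDoubleKey code.

```python
def null_safe_eq(a, b):
    # NULL <=> NULL is true; NULL <=> x is false; otherwise plain equality.
    if a is None or b is None:
        return a is None and b is None
    return a == b

# The row the reduce side emits: key=148, value=NULL on both sides.
# Under null-safe comparison the keys match, so the rows should join
# (in SQL, plain NULL = NULL would not be true).
left_key, right_key = (148, None), (148, None)
print(all(null_safe_eq(x, y) for x, y in zip(left_key, right_key)))  # True
```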
[jira] [Created] (HIVE-4856) Upgrade HCat to 2.0.5-alpha
Brock Noland created HIVE-4856:

Summary: Upgrade HCat to 2.0.5-alpha
Key: HIVE-4856
URL: https://issues.apache.org/jira/browse/HIVE-4856
Project: Hive
Issue Type: Task
Reporter: Brock Noland

In HIVE-4756 we upgraded Hive to 2.0.5-alpha. I see that HCat specifies its deps differently. We should probably keep them on the same version of Hadoop.
[jira] [Updated] (HIVE-4856) Upgrade HCat to 2.0.5-alpha
[ https://issues.apache.org/jira/browse/HIVE-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4856:
    Component/s: HCatalog

Upgrade HCat to 2.0.5-alpha
Key: HIVE-4856
URL: https://issues.apache.org/jira/browse/HIVE-4856
[jira] [Updated] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4721:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

Fix TestCliDriver.ptf_npath.q on 0.23

Key: HIVE-4721
URL: https://issues.apache.org/jira/browse/HIVE-4721
Project: Hive
Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4721.1.patch

In HIVE-4717 I tried changing the last line of ptf_npath.q from:

{noformat}
where fl_num = 1142;
{noformat}

to:

{noformat}
where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath;
{noformat}

in order to make the test deterministic. However, this results in different results (not just a different order) for 0.23 and 0.20S.
[jira] [Updated] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4853:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

junit timeout needs to be updated

Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4853.1.patch

All the ptf, join, etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to timeout.
[jira] [Updated] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4854:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2

Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4854.1.patch

The problem is with the mkdir command. It tries to create multiple directories at once without the right flag; that works only on the 1 line. A simple fix to the test should do the trick.
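The missing-flag problem can be illustrated with the POSIX analogue (Python's `os.mkdir` vs `os.makedirs`); the actual fix concerns the equivalent parent-creating flag on the test's mkdir invocation under Hadoop 2:

```python
import os
import tempfile

# A plain mkdir cannot create nested directories in one call, while the
# "mkdir -p"-style makedirs creates the missing parents as it goes.
tmp = tempfile.mkdtemp()
nested = os.path.join(tmp, "a", "b", "c")

try:
    os.mkdir(nested)              # parents don't exist, so this fails
except FileNotFoundError:
    print("plain mkdir failed")

os.makedirs(nested)               # like 'mkdir -p': creates parents too
print(os.path.isdir(nested))      # True
```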
[jira] [Updated] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4852:
    Resolution: Fixed
    Fix Version/s: 0.12.0
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

-Dbuild.profile=core fails

Key: HIVE-4852
URL: https://issues.apache.org/jira/browse/HIVE-4852
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4852.1.patch

The core profile fails because of a chmod added for some hcat files. Simple fix: check whether the modules list contains hcat before running the command.
[jira] [Assigned] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-4853:
    Assignee: Ashutosh Chauhan (was: Gunther Hagleitner)

junit timeout needs to be updated
Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
[jira] [Updated] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4853:
    Assignee: Gunther Hagleitner (was: Ashutosh Chauhan)

junit timeout needs to be updated
Key: HIVE-4853
URL: https://issues.apache.org/jira/browse/HIVE-4853
[jira] [Updated] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4854:
    Assignee: Gunther Hagleitner (was: Ashutosh Chauhan)

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
[jira] [Assigned] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-4854:
    Assignee: Ashutosh Chauhan (was: Gunther Hagleitner)

testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
Key: HIVE-4854
URL: https://issues.apache.org/jira/browse/HIVE-4854
[jira] [Created] (HIVE-4857) Hive tests are leaving slop artifacts all over the project
Edward Capriolo created HIVE-4857:

Summary: Hive tests are leaving slop artifacts all over the project
Key: HIVE-4857
URL: https://issues.apache.org/jira/browse/HIVE-4857
Project: Hive
Issue Type: Task
Reporter: Edward Capriolo

We used to have a project that would build temporary artifacts in temporary directories. Now test runs leave stuff all over the place, making it hard to work with the project.

{quote}
[edward@jackintosh hive-trunk2]$ svn stat | more
? common/src/gen
? contrib/TempStatsStore
? contrib/derby.log
? data/files/local_array_table_1
? data/files/local_array_table_2
? data/files/local_array_table_2_withfields
? data/files/local_array_table_3
? data/files/local_map_table_1
? data/files/local_map_table_2
? data/files/local_map_table_2_withfields
? data/files/local_map_table_3
? data/files/local_rctable
? data/files/local_rctable_out
? data/files/local_src_table_1
? data/files/local_src_table_2
? hbase-handler/TempStatsStore
? hbase-handler/build
? hbase-handler/derby.log
? hcatalog/build
? hcatalog/core/build
{quote}
[jira] [Assigned] (HIVE-3632) datanucleus breaks when using JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang reassigned HIVE-3632:
    Assignee: Xuefu Zhang

datanucleus breaks when using JDK7

Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.1, 0.10.0
Reporter: Chris Drome
Assignee: Xuefu Zhang
Priority: Critical

I found serious problems with datanucleus code when using JDK7, resulting in some sort of exception being thrown when datanucleus code is entered. I tried source=1.7, target=1.7 with JDK7, as well as source=1.6, target=1.6 with JDK7, and there was no visible difference: the same unit tests failed. I tried upgrading datanucleus to 3.0.1, as per HIVE-2084.patch, which did not fix the failing tests. I tried upgrading datanucleus to 3.1-release, as per the advice of http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-86, which suggests that using ASMv4 will allow datanucleus to work with JDK7. I was not successful with this either. I tried upgrading datanucleus to 3.1.2; I was not successful with this either. Regarding datanucleus support for JDK7+, there is the following JIRA, http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-81, which suggests that they don't plan to actively support JDK7+ bytecode any time soon. I also tested the JVM parameters found on http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/ with no success either. This will become a more serious problem as people move to newer JVMs. If there are others who have solved this issue, please post how it was done. Otherwise, it is a topic that I would like to raise for discussion.
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708639#comment-13708639 ]

Xuefu Zhang commented on HIVE-3632:

To compile and run Hive on JDK7, DataNucleus needs to be upgraded. The current plan is to upgrade using the following library versions: datanucleus-api-jdo-3.2.1.jar, datanucleus-rdbms-3.2.1.jar, datanucleus-core-3.2.2.jar. These versions work for both JDK6 and JDK7. After the upgrade, there are only a few test failures with JDK6. Besides the unit tests, more tests will be conducted. This is related to HIVE-2084, but the goal here is slightly different. Of course, the upgrade needs to address all issues that may arise.

Upgrade datanucleus to support JDK7
Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708637#comment-13708637 ]

Xuefu Zhang commented on HIVE-3632:

Since nobody is working on this, I will give it a shot.

Upgrade datanucleus to support JDK7
Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-4825: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708656#comment-13708656 ] Edward Capriolo commented on HIVE-4825: --- Also, do not commit this to trunk if it is only a Tez-supporting patch. There is a tez branch.
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708654#comment-13708654 ] Edward Capriolo commented on HIVE-4825: --- This issue is NOT a bug. Its priority is not major.
[jira] [Created] (HIVE-4858) Sort show grant result to improve usability and testability
Xuefu Zhang created HIVE-4858: - Summary: Sort show grant result to improve usability and testability Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.11.0, 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Currently Hive outputs the result of the show grant command in a nondeterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Tests that depend on the order can therefore fail randomly, especially after a library upgrade (DN or JVM). Sorting the result avoids this randomness and makes the output deterministic, improving both the readability of the output and the robustness of the tests.
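The fix Xuefu describes is essentially a sort over the privilege rows before they are printed. A minimal, hypothetical sketch, in which the class name and the string-per-row record shape are illustrative rather than Hive's actual show grant code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of HIVE-4858's idea: sort the "show grant" result
// lines before emitting them, so output no longer depends on the order
// DataNucleus happens to return rows in.
public class SortedGrantOutput {
    public static List<String> sorted(List<String> privilegeRows) {
        List<String> copy = new ArrayList<String>(privilegeRows);
        Collections.sort(copy); // lexicographic order: deterministic across JVMs and DBs
        return copy;
    }

    public static void main(String[] args) {
        List<String> fromDb = new ArrayList<String>();
        fromDb.add("UPDATE\tuser_b"); // order here simulates whatever the DB returned
        fromDb.add("SELECT\tuser_a");
        System.out.println(sorted(fromDb)); // always SELECT before UPDATE
    }
}
```

With a stable sort key, golden-file tests stay valid even when the metastore backend or JVM changes the iteration order underneath.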
[jira] [Commented] (HIVE-3632) Upgrade datanucleus to support JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708673#comment-13708673 ] Ashutosh Chauhan commented on HIVE-3632: I am all for updating DN. Huge +1. But we need to be wary of https://issues.apache.org/jira/browse/HIVE-2084?focusedCommentId=13014240page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13014240 Feels like we might need to provide upgrade scripts for folks to migrate because of this.
[jira] [Commented] (HIVE-4317) StackOverflowError when add jar concurrently
[ https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708675#comment-13708675 ] Brock Noland commented on HIVE-4317: Where is the SOE occurring? I.e. jdbc client, HS1, HS2, etc? Please add a review item for this patch; this is described under Review Process here: https://cwiki.apache.org/confluence/display/Hive/HowToContribute StackOverflowError when add jar concurrently - Key: HIVE-4317 URL: https://issues.apache.org/jira/browse/HIVE-4317 Project: Hive Issue Type: Bug Affects Versions: 0.9.0, 0.10.0 Reporter: wangwenli Attachments: hive-4317.1.patch Scenario: multiple threads add jars and run select operations over JDBC concurrently; sometimes, when the HiveServer serializes the MapredWork, it throws a StackOverflowError from XMLEncoder.
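For context on where such an error can surface: HiveServer serializes query plans with java.beans.XMLEncoder, which walks the plan object graph reflectively. A hedged sketch of one possible mitigation, serializing under a single lock on the assumption that concurrent mutation of the shared plan graph is a trigger; this is speculation about the failure mode, not the fix from hive-4317.1.patch:

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not Hive's actual fix: if concurrent "add jar"
// calls mutate shared state while XMLEncoder walks the plan object graph,
// serializing under one lock at least rules out concurrent modification
// as the trigger for the StackOverflowError.
public class PlanSerializer {
    private static final Object LOCK = new Object();

    public static byte[] serialize(Object plan) {
        synchronized (LOCK) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLEncoder encoder = new XMLEncoder(out);
            encoder.writeObject(plan);
            encoder.close(); // flushes and finishes the XML document
            return out.toByteArray();
        }
    }
}
```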
[jira] [Commented] (HIVE-4730) Join on more than 2^31 records on single reducer failed (wrong results)
[ https://issues.apache.org/jira/browse/HIVE-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708691#comment-13708691 ] Phabricator commented on HIVE-4730: --- brock has commented on the revision HIVE-4730 [jira] Join on more than 2^31 records on single reducer failed (wrong results). Hi Navis, Thanks for the patch! I noted a few style nits. Just curious, how long did the query take to complete? My guess is far too long to have a q-file test for this. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:286 Is it possible to move this up near the rest of the member variable definitions? Ideally it'd be nice to change the LHS to be List but it's possible that something in the class requires ArrayList. REVISION DETAIL https://reviews.facebook.net/D11283 To: JIRA, navis Cc: brock Join on more than 2^31 records on single reducer failed (wrong results) --- Key: HIVE-4730 URL: https://issues.apache.org/jira/browse/HIVE-4730 Project: Hive Issue Type: Bug Affects Versions: 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0 Reporter: Gabi Kazav Assignee: Navis Priority: Blocker Attachments: HIVE-4730.D11283.1.patch A join on more than 2^31 rows leads to wrong results. For example: Create table small_table (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n'; Create table big_table (p1 string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n'; Loading 1 row into small_table (the value 1). Loading 2149580800 rows into big_table with the same value (1 in this case). create table output as select a.p1 from big_table a join small_table b on (a.p1=b.p1); select count(*) from output; will return only 1 row... the reducer syslog: ...
2013-06-13 17:20:59,254 INFO ExecReducer: ExecReducer: processing 214700 rows: used memory = 32925960 2013-06-13 17:21:00,745 INFO ExecReducer: ExecReducer: processing 214800 rows: used memory = 12815184 2013-06-13 17:21:02,205 INFO ExecReducer: ExecReducer: processing 214900 rows: used memory = 26684552 -- looks like wrong value.. ... 2013-06-13 17:21:04,062 INFO ExecReducer: ExecReducer: processed 2149580801 rows: used memory = 17715896 2013-06-13 17:21:04,062 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 finished. closing... 2013-06-13 17:21:04,062 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarded 1 rows 2013-06-13 17:21:05,791 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: SKEWJOINFOLLOWUPJOBS:0 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 finished. closing... 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 forwarded 1 rows 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 finished. closing... 2013-06-13 17:21:05,792 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 6 forwarded 0 rows 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:1 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 5 Close done 2013-06-13 17:21:05,946 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 Close done -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
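The symptom in the syslog above (roughly 2.1 billion rows processed but only 1 row forwarded, with a counter misbehaving near the 2^31 mark) is consistent with a row count held in a 32-bit int wrapping around. A standalone demonstration of the wrap, using the row count from the report; this illustrates the arithmetic only, not the actual field in RowContainer:

```java
// Demonstrates why a 32-bit counter breaks at the row counts in this report:
// 2,149,580,800 exceeds Integer.MAX_VALUE (2,147,483,647), so narrowing to
// int silently wraps to a negative number and the count is corrupted.
public class RowCounterOverflow {
    public static void main(String[] args) {
        long actualRows = 2149580800L;   // row count from the bug report
        int counter = (int) actualRows;  // narrowing conversion wraps past 2^31 - 1
        System.out.println(counter);     // negative: any logic comparing it misfires
    }
}
```

Keeping such counters as long (or checking against Integer.MAX_VALUE before narrowing) avoids the wrap.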
[jira] [Commented] (HIVE-4831) QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests
[ https://issues.apache.org/jira/browse/HIVE-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708694#comment-13708694 ] Brock Noland commented on HIVE-4831: Hi, As opposed to choosing a random directory, I think we should have a little utility method that ensures we are creating a directory which does not already exist. Guava's Files.createTempDir() is a good example. Brock QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests - Key: HIVE-4831 URL: https://issues.apache.org/jira/browse/HIVE-4831 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4831.1.patch QTestUtil tests start a mini ZooKeeper cluster. If a test exits abnormally (e.g. on timeout), it fails to stop the mini ZooKeeper cluster. On Windows, files can't be deleted while the process is still running, so the new ZooKeeper cluster started by the next QTestUtil based test case fails to start.
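A sketch of the utility method Brock suggests, modeled loosely on Guava's Files.createTempDir(); the class and method names here are made up for illustration. The key point is that File.mkdir() returns true only if it created the directory, so looping on it guarantees a directory that did not previously exist:

```java
import java.io.File;
import java.io.IOException;

// Hypothetical utility in the spirit of Guava's Files.createTempDir():
// generate candidate names until mkdir() actually creates a fresh directory,
// instead of picking a random name and hoping it is unused.
public class TempDirs {
    public static File createFreshDir(File baseDir) throws IOException {
        String base = System.currentTimeMillis() + "-";
        for (int attempt = 0; attempt < 10000; attempt++) {
            File candidate = new File(baseDir, base + attempt);
            if (candidate.mkdir()) { // true only if the directory did not exist
                return candidate;
            }
        }
        throw new IOException("could not create a fresh directory under " + baseDir);
    }
}
```

For the Windows case above, a leftover directory held open by a stale ZooKeeper process simply makes mkdir() fail, and the loop moves on to the next candidate name.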
[jira] [Commented] (HIVE-4796) Increase coverage of package org.apache.hadoop.hive.common.metrics
[ https://issues.apache.org/jira/browse/HIVE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708731#comment-13708731 ] Hudson commented on HIVE-4796: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4796 : Increase coverage of package org.apache.hadoop.hive.common.metrics (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501052) * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/MetricsMBean.java * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/metrics/MetricsMBeanImpl.java * /hive/trunk/common/src/test/org/apache/hadoop/hive/common * /hive/trunk/common/src/test/org/apache/hadoop/hive/common/metrics * /hive/trunk/common/src/test/org/apache/hadoop/hive/common/metrics/TestMetrics.java Increase coverage of package org.apache.hadoop.hive.common.metrics -- Key: HIVE-4796 URL: https://issues.apache.org/jira/browse/HIVE-4796 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4796-trunk--N1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4814) Adjust WebHCat e2e tests until HIVE-4703 is addressed
[ https://issues.apache.org/jira/browse/HIVE-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708722#comment-13708722 ] Hudson commented on HIVE-4814: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4814 : Adjust WebHCat e2e tests until HIVE4703 is addressed (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500312) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/ddl.conf Adjust WebHCat e2e tests until HIVE-4703 is addressed - Key: HIVE-4814 URL: https://issues.apache.org/jira/browse/HIVE-4814 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4814.patch right now a number of e2e webhcat test cases fail due to HIVE-4703. This issue in that bug has been around for a long time and the fix is not quick. We need to adjust expected e2e results until HIVE-4703 is fixed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4807) Hive metastore hangs
[ https://issues.apache.org/jira/browse/HIVE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708710#comment-13708710 ] Hudson commented on HIVE-4807: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4807 : Hive metastore hangs (Sarvesh Sakalanaga via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501675) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ivy/libraries.properties * /hive/trunk/jdbc/build.xml * /hive/trunk/metastore/ivy.xml Hive metastore hangs Key: HIVE-4807 URL: https://issues.apache.org/jira/browse/HIVE-4807 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Fix For: 0.12.0 Attachments: Hive-4807.0.patch, Hive-4807.1.patch, Hive-4807.2.patch Hive metastore hangs (does not accept any new connections) due to a bug in DBCP. The root cause analysis is here https://issues.apache.org/jira/browse/DBCP-398. The fix is to change Hive connection pool to BoneCP which is natively supported by DataNucleus. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4805) Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors
[ https://issues.apache.org/jira/browse/HIVE-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708723#comment-13708723 ] Hudson commented on HIVE-4805: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4805 : Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500449) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/DataCorruptErrorHeuristic.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/ErrorAndSolution.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/errors * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/errors/TestTaskLogProcessor.java Enhance coverage of package org.apache.hadoop.hive.ql.exec.errors - Key: HIVE-4805 URL: https://issues.apache.org/jira/browse/HIVE-4805 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4805-trunk--N2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4810) Refactor exec package
[ https://issues.apache.org/jira/browse/HIVE-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708719#comment-13708719 ] Hudson commented on HIVE-4810: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4810 [jira] Refactor exec package (Gunther Hagleitner via Ashutosh Chauhan) Summary: HIVE-4810 The exec package contains both operators and classes used to execute the job. Moving the latter into a sub package makes the package slightly more manageable and will make it easier to provide a tez-based implementation. Test Plan: Refactoring Reviewers: ashutoshc Reviewed By: ashutoshc Differential Revision: https://reviews.facebook.net/D11625 (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501476) * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/contrib/src/test/results/clientnegative/case_with_row_sequence.q.out * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapperContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHook.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobTrackerURLResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Throttle.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapperContext.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHook.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobDebugger.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JobTrackerURLResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/Throttle.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputSplit.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveRecordReader.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveRecordReader.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/stats/PartialScanTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/truncate/ColumnTruncateTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SamplingOptimizer.java *
[jira] [Commented] (HIVE-4251) Indices can't be built on tables whose schema info comes from SerDe
[ https://issues.apache.org/jira/browse/HIVE-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708712#comment-13708712 ] Hudson commented on HIVE-4251: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4251 : Indices can't be built on tables whose schema info comes from SerDe (Mark Wagner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500452) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java * /hive/trunk/ql/src/test/queries/clientpositive/index_serde.q * /hive/trunk/ql/src/test/results/clientpositive/index_serde.q.out Indices can't be built on tables whose schema info comes from SerDe --- Key: HIVE-4251 URL: https://issues.apache.org/jira/browse/HIVE-4251 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.10.1, 0.11.0 Reporter: Mark Wagner Assignee: Mark Wagner Fix For: 0.12.0 Attachments: HIVE-4251.1.patch, HIVE-4251.2.patch Building indices on tables that get their schema information from the deserializer (e.g. Avro-backed tables) doesn't work, because the wrong API is used when checking that the index column exists.
{code}
hive> describe doctors;
OK
# col_name    data_type    comment
number        int          from deserializer
first_name    string       from deserializer
last_name     string       from deserializer
Time taken: 0.215 seconds, Fetched: 5 row(s)
hive> create index doctors_index on table doctors(number) as 'compact' with deferred rebuild;
FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
{code}
[jira] [Commented] (HIVE-4580) Change DDLTask to report errors using canonical error messages rather than http status codes
[ https://issues.apache.org/jira/browse/HIVE-4580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708711#comment-13708711 ] Hudson commented on HIVE-4580: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4580 : Change DDLTask to report errors using canonical error messages rather than http status codes (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501053) * /hive/trunk/contrib/src/test/results/clientnegative/serde_regex.q.out * /hive/trunk/contrib/src/test/results/clientnegative/url_hook.q.out * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/ddl.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/AppConfig.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/BadParam.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/BusyException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/CallbackFailedException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/CatchallExceptionMapper.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/HcatDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/HcatException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/JsonBuilder.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Main.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/NotAuthorizedException.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/QueueException.java * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskResult.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveException.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/InvalidTableException.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java * /hive/trunk/ql/src/test/results/clientnegative/add_partition_with_whitelist.q.out * /hive/trunk/ql/src/test/results/clientnegative/addpart1.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_partition_nodrop_table.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_partition_with_whitelist.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_table_wrong_regex.q.out * /hive/trunk/ql/src/test/results/clientnegative/alter_view_failure4.q.out * /hive/trunk/ql/src/test/results/clientnegative/altern1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive2.q.out * 
/hive/trunk/ql/src/test/results/clientnegative/archive_multi1.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi2.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi3.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi4.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi5.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi6.q.out * /hive/trunk/ql/src/test/results/clientnegative/archive_multi7.q.out * /hive/trunk/ql/src/test/results/clientnegative/authorization_fail_1.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename1.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename2.q.out * /hive/trunk/ql/src/test/results/clientnegative/column_rename4.q.out * /hive/trunk/ql/src/test/results/clientnegative/create_table_failure3.q.out * /hive/trunk/ql/src/test/results/clientnegative/create_table_failure4.q.out *
[jira] [Commented] (HIVE-3691) TestDynamicSerDe failed with IBM JDK
[ https://issues.apache.org/jira/browse/HIVE-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708730#comment-13708730 ] Hudson commented on HIVE-3691: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3691 : TestDynamicSerDe failed with IBM JDK (Bing Li and Renata Ghisloti via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501687) * /hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/dynamic_type/TestDynamicSerDe.java TestDynamicSerDe failed with IBM JDK Key: HIVE-3691 URL: https://issues.apache.org/jira/browse/HIVE-3691 Project: Hive Issue Type: Bug Affects Versions: 0.7.1, 0.8.0, 0.9.0 Environment: ant-1.8.2, IBM JDK 1.6 Reporter: Bing Li Assignee: Bing Li Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3691.1.patch-trunk.txt, HIVE-3691.1.patch.txt The order of the output in the golden file differs between JDKs. The root cause is the HashMap implementation in the JDK.
[jira] [Commented] (HIVE-4290) Build profiles: Partial builds for quicker dev
[ https://issues.apache.org/jira/browse/HIVE-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708728#comment-13708728 ] Hudson commented on HIVE-4290: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4290 : Build profiles: Partial builds for quicker dev (Gunther Hagleitner via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1502760) * /hive/trunk/build.properties * /hive/trunk/build.xml * /hive/trunk/ql/build.xml * /hive/trunk/ql/ivy.xml Build profiles: Partial builds for quicker dev -- Key: HIVE-4290 URL: https://issues.apache.org/jira/browse/HIVE-4290 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4290.2.patch, HIVE-4290.D11481.1.patch, HIVE-4290.patch Building is definitely taking longer with hcat, hs2 etc in the build. When you're working on one area of the system though, it would be easier to have an option to only build that. Not for pre-commit or build machines, but for dev this should help.
ant clean package -- build OR
ant -Dbuild.profile=full clean package test -- build everything
ant -Dbuild.profile=core clean package test -- build just enough to run the tests in ql
ant -Dbuild.profile=hcat clean package test -- build only hcatalog
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4830) Test clientnegative/nested_complex_neg.q got broken due to 4580
[ https://issues.apache.org/jira/browse/HIVE-4830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708714#comment-13708714 ] Hudson commented on HIVE-4830: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4830 : Test clientnegative/nested_complex_neg.q got broken due to 4580 (Vikram Dixit via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501616) * /hive/trunk/ql/src/test/results/clientnegative/nested_complex_neg.q.out Test clientnegative/nested_complex_neg.q got broken due to 4580 --- Key: HIVE-4830 URL: https://issues.apache.org/jira/browse/HIVE-4830 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Fix For: 0.12.0 Attachments: HIVE-4830.patch Both HIVE-3253 and HIVE-4580 were racing to modify .q.out files for this test. Eventually, one patch lost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3475) INLINE UDTF doesn't convert types properly
[ https://issues.apache.org/jira/browse/HIVE-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708726#comment-13708726 ] Hudson commented on HIVE-3475: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3475 INLINE UDTF does not convert types properly (Igor Kabiljo and Navis Ryu via egc) Submitted by: Navis Ryu and Igor Kabiljo Reviewed by: Edward Capriolo (ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500531) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFInline.java * /hive/trunk/ql/src/test/queries/clientpositive/udf_inline.q * /hive/trunk/ql/src/test/results/clientpositive/udf_inline.q.out INLINE UDTF doesn't convert types properly -- Key: HIVE-3475 URL: https://issues.apache.org/jira/browse/HIVE-3475 Project: Hive Issue Type: Bug Components: UDF Reporter: Igor Kabiljo Assignee: Navis Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3475.D7461.1.patch I suppose the issue is in the line: this.forwardObj[i] = res.convertIfNecessary(rowList.get(i), f.getFieldObjectInspector()); there is never a reason for conversion; it should just be: this.forwardObj[i] = rowList.get(i) Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to java.lang.Long at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:203) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:169) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:569) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112) at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44) at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81) at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:63) at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
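The root cause in the trace above is a Writable wrapper being cast to a boxed primitive. The failure class is easy to reproduce without Hive or Hadoop on the classpath — here AtomicLong stands in for LongWritable purely for illustration:

```java
public class CastDemo {
    public static void main(String[] args) {
        // AtomicLong holds a long but is not a java.lang.Long, so the cast
        // fails at runtime, just as LongWritable -> Long does in the trace.
        Object wrapped = new java.util.concurrent.atomic.AtomicLong(7);
        try {
            Long unboxed = (Long) wrapped;
            System.out.println("cast succeeded: " + unboxed);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the bug report");
        }
    }
}
```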
[jira] [Commented] (HIVE-4733) HiveLockObjectData is not compared properly
[ https://issues.apache.org/jira/browse/HIVE-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708725#comment-13708725 ] Hudson commented on HIVE-4733: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4733 : HiveLockObjectData is not compared properly (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500569) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java HiveLockObjectData is not compared properly --- Key: HIVE-4733 URL: https://issues.apache.org/jira/browse/HIVE-4733 Project: Hive Issue Type: Bug Components: Locking Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4733.D11277.1.patch, HIVE-4733.D11277.2.patch, HIVE-4733.D11277.3.patch {noformat} ret = ret && (clientIp == null) ? target.getClientIp() == null : clientIp.equals(target.getClientIp()); {noformat} seemed intended to be {noformat} ret = ret && (clientIp == null ? target.getClientIp() == null : clientIp.equals(target.getClientIp())); {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
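The precedence pitfall behind HIVE-4733 can be demonstrated standalone (the field names below are illustrative, not the actual HiveLockObjectData members): in Java, && binds tighter than ?:, so without the extra parentheses the whole && expression becomes the ternary's condition and a false `ret` no longer forces the result to false.

```java
public class TernaryPrecedence {
    // Mirrors the original, ungrouped form: parsed as
    // (ret && (clientIp == null)) ? (target == null) : clientIp.equals(target)
    static boolean buggy(boolean ret, String clientIp, String target) {
        return ret && (clientIp == null) ? target == null : clientIp.equals(target);
    }

    // Mirrors the intended form: the ternary is grouped, then AND-ed with ret.
    static boolean fixed(boolean ret, String clientIp, String target) {
        return ret && (clientIp == null ? target == null : clientIp.equals(target));
    }

    public static void main(String[] args) {
        // ret=false should always force false, but the buggy form falls into
        // the ternary's else-branch and compares the strings anyway:
        System.out.println(buggy(false, "10.0.0.1", "10.0.0.1")); // true  (wrong)
        System.out.println(fixed(false, "10.0.0.1", "10.0.0.1")); // false (correct)
    }
}
```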
[jira] [Commented] (HIVE-3810) HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile
[ https://issues.apache.org/jira/browse/HIVE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708729#comment-13708729 ] Hudson commented on HIVE-3810: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-3810 : HiveHistory.log need to replace \r with space before writing Entry.value to historyfile (Mark Grover via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500991) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java HiveHistory.log need to replace '\r' with space before writing Entry.value to historyfile - Key: HIVE-3810 URL: https://issues.apache.org/jira/browse/HIVE-3810 Project: Hive Issue Type: Bug Components: Logging Reporter: qiangwang Assignee: Mark Grover Priority: Minor Fix For: 0.12.0 Attachments: HIVE-3810.1.patch, HIVE-3810.2.patch, HIVE-3810.3.patch, HIVE-3810.4.patch HiveHistory.log will replace '\n' with space before writing Entry.value to the history file: val = val.replace('\n', ' '); but HiveHistory.parseHiveHistory uses BufferedReader.readLine, which takes '\n', '\r', '\r\n' as line delimiters to parse the history file. If val contains '\r', there is a high possibility that HiveHistory.parseLine will fail, in which case usually RecordTypes.valueOf(recType) will throw 'java.lang.IllegalArgumentException'. HiveHistory.log needs to replace '\r' with space as well: val = val.replace('\n', ' '); changed to val = val.replaceAll("\r|\n", " "); or val = val.replace('\r', ' ').replace('\n', ' '); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
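A standalone sketch of the sanitization proposed above (not the actual HiveHistory code; the method name `sanitize` is made up for illustration) — both line terminators must be stripped before a value is written to the one-record-per-line history file:

```java
public class HistorySanitize {
    static String sanitize(String val) {
        // replaceAll takes a regex, so "\r|\n" matches either terminator;
        // val.replace('\r', ' ').replace('\n', ' ') is an equivalent
        // non-regex alternative, as noted in the report.
        return val.replaceAll("\r|\n", " ");
    }

    public static void main(String[] args) {
        // Each terminator becomes one space, so '\r\n' yields two spaces.
        System.out.println(sanitize("QUERY_STRING=select\r\n1"));
    }
}
```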
[jira] [Commented] (HIVE-4833) Fix eclipse template classpath to include the correct jdo lib
[ https://issues.apache.org/jira/browse/HIVE-4833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708727#comment-13708727 ] Hudson commented on HIVE-4833: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4833 : Fix eclipse template classpath to include the correct jdo lib (Yin Huai via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501618) * /hive/trunk/eclipse-templates/.classpath * /hive/trunk/eclipse-templates/.classpath._hbase Fix eclipse template classpath to include the correct jdo lib - Key: HIVE-4833 URL: https://issues.apache.org/jira/browse/HIVE-4833 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4833.patch.txt HIVE-4089 upgraded jdo to 3.0.1, but .classpath and classpath._hbase in the eclipse template have not been changed accordingly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4802) Fix url check for missing / or /db after hostname in jdb uri
[ https://issues.apache.org/jira/browse/HIVE-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708721#comment-13708721 ] Hudson commented on HIVE-4802: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4802 : Fix url check for missing / or /db after hostname in jdb uri (Thejas Nair via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1500781) * /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/Utils.java * /hive/trunk/jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java Fix url check for missing / or /db after hostname in jdb uri - Key: HIVE-4802 URL: https://issues.apache.org/jira/browse/HIVE-4802 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4802.1.patch HIVE-4406 added a check for jdbc uri to prevent unintentional use of embedded mode. But that does not correctly check for uri like jdbc:hive2://localhost:1;principal=hive/hiveserver2h...@your-realm.com that can also result in embedded mode being used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4813) Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr
[ https://issues.apache.org/jira/browse/HIVE-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708718#comment-13708718 ] Hudson commented on HIVE-4813: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4813 : Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr (Ivan Veselovsky via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501099) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultRuleDispatcher.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcCtx.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java * /hive/trunk/ql/src/test/queries/clientpositive/pcr.q * /hive/trunk/ql/src/test/results/clientpositive/pcr.q.out Improve test coverage of package org.apache.hadoop.hive.ql.optimizer.pcr Key: HIVE-4813 URL: https://issues.apache.org/jira/browse/HIVE-4813 Project: Hive Issue Type: Test Affects Versions: 0.12.0 Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky Fix For: 0.12.0 Attachments: HIVE-4813-trunk--N1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4658) Make KW_OUTER optional in outer joins
[ https://issues.apache.org/jira/browse/HIVE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708716#comment-13708716 ] Hudson commented on HIVE-4658: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4658 : Make KW_OUTER optional in outer joins (Edward Capriolo via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1502758) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g * /hive/trunk/ql/src/test/queries/clientpositive/optional_outer.q * /hive/trunk/ql/src/test/results/clientpositive/optional_outer.q.out Make KW_OUTER optional in outer joins - Key: HIVE-4658 URL: https://issues.apache.org/jira/browse/HIVE-4658 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Edward Capriolo Priority: Trivial Fix For: 0.12.0 Attachments: hive-4658.2.patch.txt, HIVE-4658.D11091.1.patch For really trivial migration issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4811) (Slightly) break up the SemanticAnalyzer monstrosity
[ https://issues.apache.org/jira/browse/HIVE-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708720#comment-13708720 ] Hudson commented on HIVE-4811: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4811 : (Slightly) break up the SemanticAnalyzer monstrosity (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1500375) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java (Slightly) break up the SemanticAnalyzer monstrosity Key: HIVE-4811 URL: https://issues.apache.org/jira/browse/HIVE-4811 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4811.1.patch 11000 lines and counting. Separating genMRTasks into its own unit will only make a small dent, but will definitely help in maintaining this beast. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4819) Comments in CommonJoinOperator for aliasTag is not valid
[ https://issues.apache.org/jira/browse/HIVE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708713#comment-13708713 ] Hudson commented on HIVE-4819: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4819 : Comments in CommonJoinOperator for aliasTag is not valid (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501129) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java Comments in CommonJoinOperator for aliasTag is not valid Key: HIVE-4819 URL: https://issues.apache.org/jira/browse/HIVE-4819 Project: Hive Issue Type: Task Components: Documentation Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4819.D11619.1.patch I've written that, but it does not make sense even to me. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4840) Fix eclipse template classpath to include the BoneCP lib
[ https://issues.apache.org/jira/browse/HIVE-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708715#comment-13708715 ] Hudson commented on HIVE-4840: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4840 : Fix eclipse template classpath to include the BoneCP lib (Yin Huai via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1502678) * /hive/trunk/eclipse-templates/.classpath * /hive/trunk/hcatalog/src/test/e2e/hcatalog/drivers/Util.pm Fix eclipse template classpath to include the BoneCP lib Key: HIVE-4840 URL: https://issues.apache.org/jira/browse/HIVE-4840 Project: Hive Issue Type: Bug Reporter: Yin Huai Assignee: Yin Huai Priority: Trivial Fix For: 0.12.0 Attachments: HIVE-4840.patch.txt HIVE-4807 did not change the classpath in eclipse template accordingly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4812) Logical explain plan
[ https://issues.apache.org/jira/browse/HIVE-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708717#comment-13708717 ] Hudson commented on HIVE-4812: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4812 : Logical explain plan (Gunther Hagleitner via Navis) (navis: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1501036) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java * /hive/trunk/ql/src/test/queries/clientpositive/explain_logical.q * /hive/trunk/ql/src/test/results/clientpositive/explain_logical.q.out Logical explain plan Key: HIVE-4812 URL: https://issues.apache.org/jira/browse/HIVE-4812 Project: Hive Issue Type: Bug Components: Diagnosability Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4812.1.patch, HIVE-4812.2.patch, HIVE-4812.3.patch In various situations it would have been useful to me to glance at the operator plan before we break it into tasks and apply join, total order sort, etc optimizations. I've added this as an option to explain. Explain logical QUERY will output the full operator tree (not the stage plans, tasks, AST etc). Again, I don't think this has to even be documented for users, but might be useful to developers. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4829) TestWebHCatE2e checkstyle violation causes all tests to fail
[ https://issues.apache.org/jira/browse/HIVE-4829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708724#comment-13708724 ] Hudson commented on HIVE-4829: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #14 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/14/]) HIVE-4829 : TestWebHCatE2e checkstyle violation causes all tests to fail (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1501463) * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java TestWebHCatE2e checkstyle violation causes all tests to fail Key: HIVE-4829 URL: https://issues.apache.org/jira/browse/HIVE-4829 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Brock Noland Assignee: Eugene Koifman Priority: Critical Fix For: 0.12.0 Attachments: HIVE-4829.patch The following error caused all tests to fail and thus filled up the ptest systems drives. {noformat} [echo] hcatalog [checkstyle] Running Checkstyle 5.5 on 419 files [checkstyle] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/webhcat/svr/src/test/java/org/apache/hcatalog/templeton/TestWebHCatE2e.java:31:8: Unused import - org.junit.BeforeClass. [for] hcatalog: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/build.xml:310: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/build.xml:123: The following error occurred while executing this line: [for] /home/hiveptest/ip-10-234-1-228-hiveptest-1/apache-github-source/hcatalog/build-support/ant/checkstyle.xml:32: Got 1 errors and 0 warnings. {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708740#comment-13708740 ] Brock Noland commented on HIVE-2436: Good question...the site logs are available on people.apache.org at /x1/logarchive/aurora-2012/www/2013/ but other than that it sounds like there are [no pre-created statistics|http://www.apache.org/dev/project-site.html]. I think from a site perspective, we can post patches which touch the source, and then a committer can build and check in the built changes as well as the source changes. Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708758#comment-13708758 ] Edward Capriolo commented on HIVE-4518: --- I will check this out later tonight. Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... Generate a huge amount of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (up to 50%) you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore, each operator can simply create and increment counters. That should simplify the code a lot. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4518) Counter Strike: Operation Operator
[ https://issues.apache.org/jira/browse/HIVE-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-4518: -- Issue Type: Improvement (was: Bug) Counter Strike: Operation Operator -- Key: HIVE-4518 URL: https://issues.apache.org/jira/browse/HIVE-4518 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4518.1.patch, HIVE-4518.2.patch, HIVE-4518.3.patch, HIVE-4518.4.patch, HIVE-4518.5.patch Queries of the form: from foo insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... insert overwrite table bar partition (p) select ... Generate a huge amount of counters. The reason is that task.progress is turned on for dynamic partitioning queries. The counters not only make queries slower than necessary (up to 50%) you will also eventually run out. That's because we're wrapping them in enum values to comply with hadoop 0.17. The real reason we turn task.progress on is that we need CREATED_FILES and FATAL counters to ensure dynamic partitioning queries don't go haywire. The counters have counter-intuitive names like C1 through C1000 and don't seem really useful by themselves. With hadoop 20+ you don't need to wrap the counters anymore, each operator can simply create and increment counters. That should simplify the code a lot. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-2436: --- Attachment: HIVE-2436.patch Hi, Attached is a patch for this issue. Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-2436: --- Status: Patch Available (was: Open) Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2438) add trademark attributions to Hive homepage
[ https://issues.apache.org/jira/browse/HIVE-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708833#comment-13708833 ] Brock Noland commented on HIVE-2438: [~cwsteinbach] to complete this one we will have to change the skin which is actually externally referenced from the Hadoop SVN tree: {noformat} [brock@bigboy site]$ svn info author/src/documentation/skins Path: author/src/documentation/skins URL: http://svn.apache.org/repos/asf/hadoop/common/site/main/author/src/documentation/skins Repository Root: http://svn.apache.org/repos/asf ... {noformat} therefore I think we should remove this external dependency and copy the skin into our SVN tree. Then we can update the footer without conflicting with Hadoop. Thoughts? add trademark attributions to Hive homepage --- Key: HIVE-2438 URL: https://issues.apache.org/jira/browse/HIVE-2438 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Carl Steinbach http://www.apache.org/foundation/marks/pmcs.html#attributions -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2436) Update project naming and description in Hive website
[ https://issues.apache.org/jira/browse/HIVE-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708842#comment-13708842 ] Brock Noland commented on HIVE-2436: Along with this change we should {noformat} $ cd publish/images
$ curl -O https://issues.apache.org/jira/secure/attachment/12497381/hive_logo_medium.jpg {noformat} Update project naming and description in Hive website - Key: HIVE-2436 URL: https://issues.apache.org/jira/browse/HIVE-2436 Project: Hive Issue Type: Sub-task Reporter: John Sichi Assignee: Brock Noland Attachments: HIVE-2436.patch http://www.apache.org/foundation/marks/pmcs.html#naming -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4815) Create cloud hosting option for ptest2
[ https://issues.apache.org/jira/browse/HIVE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-4815. Resolution: Duplicate Fix Version/s: (was: 0.12.0) Merging with HIVE-4675. Create cloud hosting option for ptest2 -- Key: HIVE-4815 URL: https://issues.apache.org/jira/browse/HIVE-4815 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch, HIVE-4815.patch HIVE-4675 creates a parallel testing environment. To support HIVE-4739 we should allow this environment to run in a cloud environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708848#comment-13708848 ] Brock Noland commented on HIVE-4675: Other than Ashutosh, I haven't heard of any interest in reviewing this change, so I am assuming no one else has started reviewing it. I'll go ahead and consolidate the patches as previously discussed. Create new parallel unit test environment - Key: HIVE-4675 URL: https://issues.apache.org/jira/browse/HIVE-4675 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4675.patch The current ptest tool is great, but it has the following limitations: -Requires an NFS filer -Unless the NFS filer is dedicated, ptests can become IO bound easily -Investigating failures is troublesome because the source directory for the failure is not saved -Ignoring or isolating tests is not supported -No unit tests for the ptest framework exist It'd be great to have a ptest tool that addresses these limitations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4675) Create new parallel unit test environment
[ https://issues.apache.org/jira/browse/HIVE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4675: --- Attachment: HIVE-4675.patch Create new parallel unit test environment - Key: HIVE-4675 URL: https://issues.apache.org/jira/browse/HIVE-4675 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4675.patch, HIVE-4675.patch The current ptest tool is great, but it has the following limitations: -Requires an NFS filer -Unless the NFS filer is dedicated, ptests can become IO bound easily -Investigating failures is troublesome because the source directory for the failure is not saved -Ignoring or isolating tests is not supported -No unit tests for the ptest framework exist It'd be great to have a ptest tool that addresses these limitations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 11770: HIVE-4113: Optimize select count(1) with RCFile and Orc
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11770/ --- (Updated July 15, 2013, 7:47 p.m.) Review request for hive. Changes --- Rebased patch, no real changes. Bugs: HIVE-4113 https://issues.apache.org/jira/browse/HIVE-4113 Repository: hive-git Description --- Modifies ColumnProjectionUtils such that there are two flags: one for the column ids and one indicating whether all columns should be read. Additionally, the patch updates all locations which used the old method of an empty string indicating that all columns should be read. The automatic formatter generated by ant eclipse-files is fairly aggressive, so there is some unrelated import/whitespace cleanup. Diffs (updated) - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java da85501 hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java bc0e04c hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java ac3753f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java 02ec37f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java 4167afa hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java b5f22af hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java dd2ac10 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java e907c73 ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java f72ecfb ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 49145b7 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923 ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403 ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java cbdc2db ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 400abf3 ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java fb9fca1 ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java ae6a5ee ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 785f0b1 serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 23180cf serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 11f5f07 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java 1335446 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java e1270cc serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java b717278 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java 0317024 serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699 serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java 99420ca Diff: https://reviews.apache.org/r/11770/diff/ Testing --- All unit tests pass with the patch. ColumnProjectionUtils has new unit tests covering its functionality. Additionally, I verified manually that select count(1) from RCFile/Orc results in less IO after the change. Before: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS After: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS Thanks, Brock Noland
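The two-flag scheme described in the review request above can be sketched roughly as follows. This is an illustrative sketch only: the property names and helper methods are hypothetical stand-ins for Hive's actual ColumnProjectionUtils API, and a plain java.util.Properties object stands in for the Hadoop Configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Sketch of the two-flag column projection scheme: an explicit
// read-all-columns flag plus a separate list of column ids, so that
// "no ids selected" (e.g. select count(1)) is no longer conflated
// with "read everything" (select *).
public class ColumnProjectionSketch {
    // Hypothetical property names; the real Hive keys may differ.
    static final String READ_ALL_COLUMNS = "read.all.columns";
    static final String READ_COLUMN_IDS = "read.column.ids";

    static void setReadAllColumns(Properties conf) {
        conf.setProperty(READ_ALL_COLUMNS, "true");
    }

    static void appendReadColumns(Properties conf, List<Integer> ids) {
        StringBuilder sb = new StringBuilder(conf.getProperty(READ_COLUMN_IDS, ""));
        for (int id : ids) {
            if (sb.length() > 0) sb.append(',');
            sb.append(id);
        }
        conf.setProperty(READ_COLUMN_IDS, sb.toString());
        // Appending explicit ids turns off the read-all flag, even if the
        // id list is empty -- that is the count(1) case.
        conf.setProperty(READ_ALL_COLUMNS, "false");
    }

    static boolean isReadAllColumns(Properties conf) {
        // Default to read-all when nothing was configured.
        return Boolean.parseBoolean(conf.getProperty(READ_ALL_COLUMNS, "true"));
    }

    static List<Integer> getReadColumnIds(Properties conf) {
        String v = conf.getProperty(READ_COLUMN_IDS, "");
        List<Integer> ids = new ArrayList<>();
        for (String s : v.split(",")) {
            if (!s.isEmpty()) ids.add(Integer.parseInt(s));
        }
        return ids;
    }

    public static void main(String[] args) {
        Properties countStar = new Properties();            // select count(1): zero columns
        appendReadColumns(countStar, Collections.emptyList());
        System.out.println(isReadAllColumns(countStar));    // false -> skip every column

        Properties selectStar = new Properties();           // select *: all columns
        setReadAllColumns(selectStar);
        System.out.println(isReadAllColumns(selectStar));   // true
    }
}
```

Under the old single-value convention an empty id string meant "read all", so the two cases in main() were indistinguishable; the second flag is what lets a reader skip every column for count(1).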
[jira] [Updated] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4113: --- Attachment: HIVE-4113.patch Rebased patch, no real changes. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch select count(1) loads up every column of every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Whereas select count(ss_sold_date_sk) from store_sales_10_rc; reads far less: {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} That is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java: {code} } else { // TODO: if no column name is specified e.g., in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i < skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 11770: HIVE-4113: Optimize select count(1) with RCFile and Orc
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11770/ --- (Updated July 15, 2013, 7:51 p.m.) Review request for hive. Changes --- A test was missed; it is included now. Bugs: HIVE-4113 https://issues.apache.org/jira/browse/HIVE-4113 Repository: hive-git Description --- Modifies ColumnProjectionUtils such that there are two flags: one for the column ids and one indicating whether all columns should be read. Additionally, the patch updates all locations which used the old method of an empty string indicating that all columns should be read. The automatic formatter generated by ant eclipse-files is fairly aggressive, so there is some unrelated import/whitespace cleanup. Diffs (updated) - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java da85501 hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java bc0e04c hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java ac3753f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InitializeInput.java 02ec37f hcatalog/core/src/main/java/org/apache/hcatalog/mapreduce/InternalUtil.java 4167afa hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatMultiOutputFormat.java b5f22af hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitioned.java dd2ac10 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestHCatLoader.java e907c73 ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a784b2 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java f72ecfb ql/src/java/org/apache/hadoop/hive/ql/io/BucketizedHiveInputFormat.java 49145b7 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java adf4923 ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java d18d403 ql/src/java/org/apache/hadoop/hive/ql/io/RCFileRecordReader.java 9521060 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 96ac584 
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java cbdc2db ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 400abf3 ql/src/test/org/apache/hadoop/hive/ql/io/PerformTestRCFileAndSeqFile.java fb9fca1 ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java ae6a5ee ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 785f0b1 serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java 23180cf serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 11f5f07 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStruct.java 1335446 serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarStructBase.java e1270cc serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java b717278 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarStruct.java 0317024 serde/src/test/org/apache/hadoop/hive/serde2/TestColumnProjectionUtils.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 3ba2699 serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java 99420ca Diff: https://reviews.apache.org/r/11770/diff/ Testing --- All unit tests pass with the patch. ColumnProjectionUtils has new unit tests covering its functionality. Additionally, I verified manually that select count(1) from RCFile/Orc results in less IO after the change. Before: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 17.75 sec HDFS Read: 28782851 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 23.72 sec HDFS Read: 825865962 HDFS Write: 9 SUCCESS After: hive> select count(1) from users_orc; Job 0: Map: 1 Reduce: 1 Cumulative CPU: 9.9 sec HDFS Read: 67325 HDFS Write: 9 SUCCESS hive> select count(1) from users_rc; Job 0: Map: 3 Reduce: 1 Cumulative CPU: 16.96 sec HDFS Read: 96045618 HDFS Write: 9 SUCCESS Thanks, Brock Noland
[jira] [Updated] (HIVE-4113) Optimize select count(1) with RCFile and Orc
[ https://issues.apache.org/jira/browse/HIVE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4113: --- Attachment: HIVE-4113.patch A test was missed; I have included it here. Optimize select count(1) with RCFile and Orc Key: HIVE-4113 URL: https://issues.apache.org/jira/browse/HIVE-4113 Project: Hive Issue Type: Bug Components: File Formats Reporter: Gopal V Assignee: Brock Noland Fix For: 0.12.0 Attachments: HIVE-4113-0.patch, HIVE-4113.patch, HIVE-4113.patch select count(1) loads up every column of every row when used with RCFile. select count(1) from store_sales_10_rc gives {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 31.73 sec HDFS Read: 234914410 HDFS Write: 8 SUCCESS {code} Whereas select count(ss_sold_date_sk) from store_sales_10_rc; reads far less: {code} Job 0: Map: 5 Reduce: 1 Cumulative CPU: 29.75 sec HDFS Read: 28145994 HDFS Write: 8 SUCCESS {code} That is 11% of the data size read by the COUNT(1). This was tracked down to the following code in RCFile.java: {code} } else { // TODO: if no column name is specified e.g., in select count(1) from tt; // skip all columns, this should be distinguished from the case: // select * from tt; for (int i = 0; i < skippedColIDs.length; i++) { skippedColIDs[i] = false; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4858) Sort show grant result to improve usability and testability
[ https://issues.apache.org/jira/browse/HIVE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-4858: -- Attachment: HIVE-4858.patch Sort show grant result to improve usability and testability - Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4858.patch Currently Hive outputs the result of the show grant command in a non-deterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Randomness can arise and tests (depending on the order) can fail, especially in the event of a library upgrade (DN or JVM upgrade). Sorting the result will avoid the potential randomness and make the output more deterministic, thus not only improving the readability of the output but also making the tests more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
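The fix described above amounts to sorting the privilege rows before printing, so the output no longer depends on the DB's iteration order. A minimal sketch of the idea follows; GrantRow and its fields are hypothetical stand-ins, not Hive's actual data model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: sort grant rows by (principal, privilege) before printing so
// "show grant" output is deterministic regardless of what order
// DataNucleus returns the rows in.
public class SortedShowGrant {
    static class GrantRow {
        final String principal;
        final String privilege;
        GrantRow(String principal, String privilege) {
            this.principal = principal;
            this.privilege = privilege;
        }
    }

    // Returns a copy sorted on a stable, total key; the original list
    // (whatever order the DB produced) is left untouched.
    static List<GrantRow> sorted(List<GrantRow> rows) {
        List<GrantRow> copy = new ArrayList<>(rows);
        copy.sort(Comparator.comparing((GrantRow r) -> r.principal)
                            .thenComparing(r -> r.privilege));
        return copy;
    }

    public static void main(String[] args) {
        List<GrantRow> rows = Arrays.asList(
            new GrantRow("bob", "SELECT"),
            new GrantRow("alice", "UPDATE"),
            new GrantRow("alice", "SELECT"));
        for (GrantRow r : sorted(rows)) {
            System.out.println(r.principal + "\t" + r.privilege);
        }
    }
}
```

Because the sort key is total over the printed fields, any two runs (or any two JVM/DataNucleus versions) produce byte-identical output, which is what makes the .q.out test baselines stable.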
[jira] [Commented] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708929#comment-13708929 ] Hudson commented on HIVE-4721: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4721 : Fix TestCliDriver.ptf_npath.q on 0.23 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503297) * /hive/trunk/ql/src/test/queries/clientpositive/ptf_npath.q * /hive/trunk/ql/src/test/results/clientpositive/ptf_npath.q.out Fix TestCliDriver.ptf_npath.q on 0.23 - Key: HIVE-4721 URL: https://issues.apache.org/jira/browse/HIVE-4721 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4721.1.patch In HIVE-4717 I tried changing the last line of ptf_npath.q from: {noformat} where fl_num = 1142; {noformat} to: {noformat} where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath; {noformat} in order to make the test deterministic. However, this results not just in a different order, but in different results for 0.23 and 0.20S. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708928#comment-13708928 ] Hudson commented on HIVE-4854: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4854 : testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503304) * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 - Key: HIVE-4854 URL: https://issues.apache.org/jira/browse/HIVE-4854 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4854.1.patch The problem is with the mkdir command: it tries to create multiple directories at once without the right flag, which works only on the Hadoop 1 line. A simple fix to the test should do the trick. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
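The "right flag" mentioned above is presumably -p: Hadoop 2's fs shell, like POSIX mkdir, does not create missing parent directories implicitly. The local-filesystem sketch below demonstrates the behavior; the hadoop fs line in the comment is an assumed analogue for the .q test, with a made-up path.

```shell
# Without -p, creating nested directories in a single call fails when the
# parent does not exist; with -p, intermediate directories are created.
rm -rf /tmp/demo_a
mkdir /tmp/demo_a/demo_b 2>/dev/null || echo "fails without -p"
mkdir -p /tmp/demo_a/demo_b && echo "ok with -p"

# The analogous change for the test would be something like (hypothetical path):
#   hadoop fs -mkdir -p '/tmp/test dir/nested'
```

Running the block prints "fails without -p" followed by "ok with -p".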
[jira] [Commented] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708927#comment-13708927 ] Hudson commented on HIVE-4853: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4853 : junit timeout needs to be updated (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503301) * /hive/trunk/build.properties junit timeout needs to be updated - Key: HIVE-4853 URL: https://issues.apache.org/jira/browse/HIVE-4853 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4853.1.patch All the ptf, join etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to a timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708926#comment-13708926 ] Hudson commented on HIVE-4852: -- SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #84 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/84/]) HIVE-4852 : -Dbuild.profile=core fails (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1503309) * /hive/trunk/build.xml -Dbuild.profile=core fails -- Key: HIVE-4852 URL: https://issues.apache.org/jira/browse/HIVE-4852 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4852.1.patch Core profile fails because of an added chmod to some hcat files. Simple fix: Check if modules contains hcat before running the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708948#comment-13708948 ] Brock Noland commented on HIVE-556: --- I'd like to break this up into sub-tasks, but before I do I'd like to solicit feedback... let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Brock Noland Right now, Hive only supports equi-joins. Sometimes that's not enough; we should consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} Any condition expression following 'ON' would be appropriate. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
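Only equality predicates can be reduced to hash or sort-merge joins; an arbitrary ON predicate like the rlike example above effectively means evaluating the condition over the cross product. A rough sketch of that semantics follows, with plain java.util.regex standing in for Hive's rlike operator and the lists standing in for the two tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of theta-join semantics: emit a joined row whenever an arbitrary
// predicate holds, here "t.url rlike a.url_pattern". A real engine would
// try to avoid materializing the full cross product where possible.
public class ThetaJoinSketch {
    static List<String> thetaJoin(List<String> urls, List<String> patterns) {
        List<String> joined = new ArrayList<>();
        for (String url : urls) {                    // cross product ...
            for (String pat : patterns) {
                if (Pattern.matches(pat, url)) {     // ... filtered by the theta predicate
                    joined.add(url + " ~ " + pat);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // urls plays the role of tbl.url, patterns of aux_tbl.url_pattern.
        List<String> urls = Arrays.asList(
            "http://example.com/news/1",
            "http://example.com/sports/2");
        List<String> patterns = Arrays.asList(".*news.*", ".*video.*");
        System.out.println(thetaJoin(urls, patterns));
        // prints [http://example.com/news/1 ~ .*news.*]
    }
}
```

The nested loop is why a general theta join is so much more expensive than an equi-join, and why a design discussion (what to support, and with what execution strategy) is worth having before splitting the work into sub-tasks.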
[jira] [Created] (HIVE-4859) String column comparison classes should be renamed.
Jitendra Nath Pandey created HIVE-4859: -- Summary: String column comparison classes should be renamed. Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Skew Joins borked on Hive11 (Hadoop23)?
Hello, all. Has anyone noticed that skew-joins aren't working on Hive 0.11 / Hadoop 0.23? I've been running the TPC-H benchmarks against Hive 0.11, and I see that none of the queries run through if hive.optimize.skewjoin is set to true. I initially ran into problems like the following: <quote> Ended Job = job_1371646843240_1214 java.io.FileNotFoundException: File hdfs://fstaxxx.yyy.yahoo.com/tmp/hive_2013-07-12_03-22-31_737_6843191588894968654/-mr-10004/hive_skew_join_bigkeys_0 does not exist. </quote> Patching Hive 0.11 with HIVE-4646 resolved that problem. What I see now is that a couple of stages of the query run through successfully, after which I get the following message, and the remaining stages are skipped. <quote> 2013-07-12 23:21:02,164 Stage-3 map = 100%, reduce = 100%, Cumulative CPU 15985.47 sec MapReduce Total cumulative CPU time: 0 days 4 hours 26 minutes 25 seconds 470 msec Ended Job = job_1371646843240_1295 Stage-10 is filtered out by condition resolver. MapReduce Jobs Launched: Job 0: Map: 380 Reduce: 118 Cumulative CPU: 15900.35 sec HDFS Read: 24574270287 HDFS Write: 4925478398 SUCCESS Total MapReduce CPU Time Spent: 0 days 4 hours 25 minutes 0 seconds 350 msec OK Time taken: 109.411 seconds FAILED: SemanticException [Error 10001]: Line 10:5 Table not found 'q16_tmp_cached' </quote> In this particular case, the query is q16_parts_supplier_relationship.hive, part of which looks like: <quote> create table q16_tmp_cached as select p_brand, p_type, p_size, ps_suppkey from partsupp ps join part p on p.p_partkey = ps.ps_partkey and p.p_brand <> 'Brand#45' and not p.p_type like 'MEDIUM POLISHED%' join supplier_tmp_cached s on ps.ps_suppkey = s.s_suppkey; </quote> If I can isolate the problem to a smaller test-case, I'll raise a JIRA. I was hoping one of you might have seen this already, or might have a better handle on how skew-joins work in Hive 11. Many thanks, Mithun
[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4859: --- Attachment: HIVE-4859.1.patch String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-4859: --- Status: Patch Available (was: Open) String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4859) String column comparison classes should be renamed.
[ https://issues.apache.org/jira/browse/HIVE-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708996#comment-13708996 ] Jitendra Nath Pandey commented on HIVE-4859: Patch uploaded. https://reviews.apache.org/r/12560/ String column comparison classes should be renamed. --- Key: HIVE-4859 URL: https://issues.apache.org/jira/browse/HIVE-4859 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-4859.1.patch FilterStringColEqualStringCol should be renamed to FilterStringColEqualStringColumn. Similarly, all string comparison classes should be renamed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-1511: --- Attachment: HIVE-1511.patch Plan serialization and deserialization is still way too slow because of java serialization. I played with the Kryo library ( http://code.google.com/p/kryo/ ), which is super-fast for java object graph serialization, and the initial results look promising. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Attachments: HIVE-1511.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM but I gave up after the test case did not go anywhere for about 2 minutes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
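For a sense of scale, standard java.io serialization writes full class metadata and per-object handles into the stream, which is much of the overhead a compact format like Kryo avoids. The self-contained JDK-only sketch below builds a predicate tree shaped like the repeated OR test case above and measures its serialized size; the Node class is a made-up stand-in for Hive's plan operators, not Hive's actual classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Illustration of why plan serialization cost matters: java.io
// serialization is verbose and slow for large object graphs, such as
// a query plan containing a predicate with 100+ repeated OR terms.
public class SerializationCost {
    static class Node implements Serializable {
        String expr;
        List<Node> children = new ArrayList<>();
        Node(String expr) { this.expr = expr; }
    }

    static byte[] javaSerialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build a tree like "key=0 OR key=0 OR ..." with 100 identical terms.
        Node root = new Node("OR");
        for (int i = 0; i < 100; i++) {
            root.children.add(new Node("key=0"));
        }
        byte[] bytes = javaSerialize(root);
        // Even identical children each cost bytes on the wire; Kryo's
        // registration-based format is far more compact and faster.
        System.out.println("serialized size: " + bytes.length + " bytes");
    }
}
```

The same experiment against Kryo would use its Kryo/Output API instead of ObjectOutputStream; the point here is only to show that the baseline cost grows with every node in the plan graph.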
Review Request 12562: HIVE-4858: Sort show grant result to improve usability and testability
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/ --- Review request for hive. Bugs: HIVE-4858 https://issues.apache.org/jira/browse/HIVE-4858 Repository: hive-git Description --- Patch includes code changes and query output re-generation for a couple of test cases. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 9883659 ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out 4262b7c ql/src/test/results/clientpositive/authorization_2.q.out c934a2a ql/src/test/results/clientpositive/authorization_6.q.out b8483ca Diff: https://reviews.apache.org/r/12562/diff/ Testing --- Performed the authorization test cases. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4858) Sort show grant result to improve usability and testability
[ https://issues.apache.org/jira/browse/HIVE-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709069#comment-13709069 ] Xuefu Zhang commented on HIVE-4858: --- Review board: https://reviews.apache.org/r/12562/ Sort show grant result to improve usability and testability - Key: HIVE-4858 URL: https://issues.apache.org/jira/browse/HIVE-4858 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.11.1 Attachments: HIVE-4858.patch Currently Hive outputs the result of the show grant command in a non-deterministic order: it outputs the set of each privilege type in whatever order is returned from the DB (DataNucleus). Randomness can arise and tests (depending on the order) can fail, especially in the event of a library upgrade (DN or JVM upgrade). Sorting the result will avoid the potential randomness and make the output more deterministic, thus not only improving the readability of the output but also making the tests more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12562: HIVE-4858: Sort show grant result to improve usability and testability
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/#review23181 --- Looks good! I do think we should align the style with the predominant hive style though. ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java https://reviews.apache.org/r/12562/#comment47041 I think the spaces on the left and right hand side are style differences. Most of the code in hive I've seen does not have the additional spaces. Also the code nearby does not have those spaces. ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java https://reviews.apache.org/r/12562/#comment47039 trailing whitespace - Brock Noland On July 15, 2013, 10:28 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12562/ --- (Updated July 15, 2013, 10:28 p.m.) Review request for hive. Bugs: HIVE-4858 https://issues.apache.org/jira/browse/HIVE-4858 Repository: hive-git Description --- Patch includes code changes and query output re-generation for a couple of test cases. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 9883659 ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out 4262b7c ql/src/test/results/clientpositive/authorization_2.q.out c934a2a ql/src/test/results/clientpositive/authorization_6.q.out b8483ca Diff: https://reviews.apache.org/r/12562/diff/ Testing --- Performed the authorization test cases. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4721) Fix TestCliDriver.ptf_npath.q on 0.23
[ https://issues.apache.org/jira/browse/HIVE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709119#comment-13709119 ] Hudson commented on HIVE-4721: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4721 : Fix TestCliDriver.ptf_npath.q on 0.23 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503297) * /hive/trunk/ql/src/test/queries/clientpositive/ptf_npath.q * /hive/trunk/ql/src/test/results/clientpositive/ptf_npath.q.out Fix TestCliDriver.ptf_npath.q on 0.23 - Key: HIVE-4721 URL: https://issues.apache.org/jira/browse/HIVE-4721 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4721.1.patch In HIVE-4717 I tried changing the last line of ptf_npath.q from: {noformat} where fl_num = 1142; {noformat} to: {noformat} where fl_num = 1142 order by origin_city_name, fl_num, year, month, day_of_month, sz, tpath; {noformat} in order to make the test deterministic. However, this results not just in a different order, but in different results for 0.23 and 0.20S. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4853) junit timeout needs to be updated
[ https://issues.apache.org/jira/browse/HIVE-4853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709117#comment-13709117 ] Hudson commented on HIVE-4853: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4853 : junit timeout needs to be updated (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503301) * /hive/trunk/build.properties junit timeout needs to be updated - Key: HIVE-4853 URL: https://issues.apache.org/jira/browse/HIVE-4853 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4853.1.patch All the ptf, join etc. tests we've added recently have pushed the junit time for TestCliDriver past the timeout value on most machines (if run serially). The build machine uses its own value, so you don't see it there. But when running locally it can be a real downer to find out that the tests were aborted due to the timeout.
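The change itself is a one-line bump in build.properties. The property name and value below are illustrative assumptions; check the committed HIVE-4853.1.patch for the exact setting:

```properties
# Timeout for a single junit batch, in milliseconds.
# Name and value are illustrative; see HIVE-4853.1.patch for the real change.
test.junit.timeout=43200000
```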
[jira] [Commented] (HIVE-4854) testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709118#comment-13709118 ] Hudson commented on HIVE-4854: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4854 : testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503304) * /hive/trunk/ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q testCliDriver_load_hdfs_file_with_space_in_the_name fails on hadoop 2 - Key: HIVE-4854 URL: https://issues.apache.org/jira/browse/HIVE-4854 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4854.1.patch The problem is with the mkdir command. It tries to create multiple directories at once without the right flag, which works only on hadoop 1. A simple fix to the test should do the trick.
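The hadoop 2 behavior described above: hadoop 1's fs -mkdir created missing parent directories implicitly, while hadoop 2 requires the -p flag (the paths below are illustrative). Plain POSIX mkdir has the same semantics, shown runnable here:

```shell
# On hadoop 2 the parent directories must be requested explicitly:
#   hadoop fs -mkdir '/tmp/dir with space/child'      # fails if parents are missing
#   hadoop fs -mkdir -p '/tmp/dir with space/child'   # creates parents on both 1 and 2
# POSIX mkdir draws the same distinction with its own -p flag:
mkdir -p "demo dir/nested/child"    # -p creates the intermediate directories
test -d "demo dir/nested/child" && echo "created"   # prints "created"
```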
[jira] [Commented] (HIVE-4852) -Dbuild.profile=core fails
[ https://issues.apache.org/jira/browse/HIVE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709116#comment-13709116 ] Hudson commented on HIVE-4852: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #15 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/15/]) HIVE-4852 : -Dbuild.profile=core fails (Gunther Hagleitner via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1503309) * /hive/trunk/build.xml -Dbuild.profile=core fails -- Key: HIVE-4852 URL: https://issues.apache.org/jira/browse/HIVE-4852 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4852.1.patch The core profile fails because of an added chmod on some hcat files. Simple fix: check whether modules contains hcat before running the command.
[jira] [Created] (HIVE-4860) add shortcut to gather column statistics on all columns
Greg Rahn created HIVE-4860: --- Summary: add shortcut to gather column statistics on all columns Key: HIVE-4860 URL: https://issues.apache.org/jira/browse/HIVE-4860 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.11.0 Reporter: Greg Rahn Currently analyze table ... compute statistics for columns requires a discrete list of columns. It would be nice to have a shortcut to gather stats on all columns w/o naming them. Possible options: analyze table ... compute statistics for ALL columns; -- ALL keyword analyze table ... compute statistics for columns; -- empty list defaults to all columns
[jira] [Created] (HIVE-4861) add support for dynamic partitioning when gathering column statistics
Greg Rahn created HIVE-4861: --- Summary: add support for dynamic partitioning when gathering column statistics Key: HIVE-4861 URL: https://issues.apache.org/jira/browse/HIVE-4861 Project: Hive Issue Type: Improvement Reporter: Greg Rahn Fix For: 0.11.1 Currently: hive analyze table fact_table partition(event_date) compute statistics for columns ...; FAILED: SemanticException [Error 30008]: Dynamic partitioning is not supported yet while gathering column statistics through ANALYZE statement Add functionality that removes this restriction and allows gathering column statistics on all partitions of a table using dynamic partitioning.
[jira] [Assigned] (HIVE-2482) Convenience UDFs for binary data type
[ https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan reassigned HIVE-2482: - Assignee: Mark Wagner Convenience UDFs for binary data type - Key: HIVE-2482 URL: https://issues.apache.org/jira/browse/HIVE-2482 Project: Hive Issue Type: New Feature Affects Versions: 0.9.0 Reporter: Ashutosh Chauhan Assignee: Mark Wagner HIVE-2380 introduced the binary data type in Hive. It would be good to have the following UDFs to make it more useful: * UDFs to convert to/from hex string * UDFs to convert to/from string using a specific encoding * UDFs to convert to/from base64 string * UDFs to convert to/from non-string types using a particular serde
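As a sketch of what the hex and base64 conversions above involve, here is a plain-Java helper. Class and method names are hypothetical illustrations, not the Hive UDF API; a real UDF would wrap logic like this in Hive's UDF classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper sketching the conversions the proposed UDFs would expose.
public class BinaryConversions {
    // binary -> hex string
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // binary -> base64 string
    public static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // string -> binary using a specific encoding (UTF-8 here)
    public static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

For example, toHex(new byte[]{0x0f, (byte) 0xa0}) yields "0fa0", and toBase64(fromString("hi")) yields "aGk=".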
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
On July 12, 2013, 10:44 p.m., Jakob Homan wrote: Do you have after-optimization performance numbers? Can you add a test to verify that the reencoder cache is working correctly? Feed in a record with one uuid, then another with a different one, and verify that the cache has two elements. Adding a third record with the original UUID shouldn't increase the size of the cache. Also, that adding n records all with the same schema creates only one reencoder... Yes, we have the numbers after optimization. For example, each record used to take nearly 50 microseconds. After this patch, it takes nearly 31 microseconds. Added the test case as proposed. On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java, line 66 https://reviews.apache.org/r/12480/diff/1/?file=320688#file320688line66 verifiedRecordReaders - noReencodingNeeded ? Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java, line 155 https://reviews.apache.org/r/12480/diff/1/?file=320688#file320688line155 readability: pull out getRecordReaderID into its own var Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java, line 78 https://reviews.apache.org/r/12480/diff/1/?file=320689#file320689line78 Need to write out the uuid too Done On July 12, 2013, 10:44 p.m., Jakob Homan wrote: serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java, line 92 https://reviews.apache.org/r/12480/diff/1/?file=320689#file320689line92 Need to read in the uuid too Done - Mohammad --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/#review23113 --- On July 11, 2013, 10:31 p.m., Mohammad Islam wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated July 11, 2013, 10:31 p.m.) 
Review request for hive, Ashutosh Chauhan and Jakob Homan. Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. First, we added a unique id for each record reader, which is then included in every AvroGenericRecordWritable. Then we introduced two new data structures (one hashset and one hashmap) to store intermediate data and avoid duplicate checks. The hashset contains the IDs of all record readers that don't need any re-encoding. The hashmap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows nearly a 40% reduction in Avro record reading time. Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java dbc999f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java c85ef15 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
Re: Review Request 12480: HIVE-4732 Reduce or eliminate the expensive Schema equals() check for AvroSerde
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12480/ --- (Updated July 15, 2013, 11:48 p.m.) Review request for hive, Ashutosh Chauhan and Jakob Homan. Changes --- Incorporated Jakob's comments. Bugs: HIVE-4732 https://issues.apache.org/jira/browse/HIVE-4732 Repository: hive-git Description --- From our performance analysis, we found that AvroSerde's schema.equals() call consumed a substantial amount (nearly 40%) of time. This patch intends to minimize the number of schema.equals() calls by pushing the check as late as possible and performing it as few times as possible. First, we added a unique id for each record reader, which is then included in every AvroGenericRecordWritable. Then we introduced two new data structures (one hashset and one hashmap) to store intermediate data and avoid duplicate checks. The hashset contains the IDs of all record readers that don't need any re-encoding. The hashmap contains the already-used re-encoders; it works as a cache and allows re-encoder reuse. With this change, our test shows nearly a 40% reduction in Avro record reading time. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/io/avro/AvroGenericRecordReader.java dbc999f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java c85ef15 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroGenericRecordWritable.java 66f0348 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 79c9646 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestSchemaReEncoder.java 9af751b serde/src/test/org/apache/hadoop/hive/serde2/avro/Utils.java 2b948eb Diff: https://reviews.apache.org/r/12480/diff/ Testing --- Thanks, Mohammad Islam
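The two-structure cache described in the review can be sketched as follows. This is a simplified illustration with hypothetical names (the real logic lives in AvroDeserializer); a String stands in for the actual re-encoder object:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Sketch of the caching scheme: a set of record-reader IDs whose records need
// no re-encoding, and a map of already-built re-encoders keyed by reader ID,
// so the expensive schema.equals() check runs at most once per reader.
public class ReEncoderCache {
    private final Set<UUID> noReencodingNeeded = new HashSet<>();
    private final Map<UUID, String> reEncoders = new HashMap<>();

    // Returns null when the record can be used as-is, otherwise a (cached) re-encoder.
    public String lookup(UUID readerId, boolean schemasMatch) {
        if (noReencodingNeeded.contains(readerId)) {
            return null; // verdict already known, no equals() call needed
        }
        if (schemasMatch) {
            noReencodingNeeded.add(readerId); // remember the one-time verdict
            return null;
        }
        // Reuse an existing re-encoder, or build one on first sight of this reader.
        return reEncoders.computeIfAbsent(readerId, id -> "reencoder-" + id);
    }

    public int cacheSize() {
        return reEncoders.size();
    }
}
```

This mirrors the test Jakob asked for: two readers with mismatched schemas produce two cache entries, a repeat of the first reader adds nothing, and a reader whose schemas match never enters the re-encoder map.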
[jira] [Assigned] (HIVE-4266) Refactor HCatalog code to org.apache.hive.hcatalog
[ https://issues.apache.org/jira/browse/HIVE-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-4266: Assignee: Eugene Koifman (was: Alan Gates) Refactor HCatalog code to org.apache.hive.hcatalog -- Key: HIVE-4266 URL: https://issues.apache.org/jira/browse/HIVE-4266 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Eugene Koifman Priority: Blocker Fix For: 0.12.0 Currently the HCatalog code is in the org.apache.hcatalog packages. It now needs to move to org.apache.hive.hcatalog. Shell classes/interfaces need to be created for public-facing classes so that users' code does not break.
[jira] [Assigned] (HIVE-4460) Publish HCatalog artifacts for Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-4460: Assignee: Eugene Koifman Publish HCatalog artifacts for Hadoop 2.x - Key: HIVE-4460 URL: https://issues.apache.org/jira/browse/HIVE-4460 Project: Hive Issue Type: Bug Components: HCatalog Environment: Hadoop 2.x Reporter: Venkat Ranganathan Assignee: Eugene Koifman HCatalog artifacts are only published for Hadoop 1.x versions. As more projects add HCatalog integration, HCatalog artifacts need to be published for the Hadoop versions those projects support, so that automated builds targeting different Hadoop releases can succeed. For example, SQOOP-931 introduces Sqoop/HCatalog integration, and Sqoop builds against both Hadoop 1.x and 2.x releases.
Re: Tez branch and tez based patches
On Jul 13, 2013, at 9:48 AM, Edward Capriolo wrote: I have started to see several refactoring patches around tez. https://issues.apache.org/jira/browse/HIVE-4843 This is the only mention on the hive list I can find with tez: Makes sense. I will create the branch soon. Thanks, Ashutosh On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: Hi, I am starting to work on integrating Tez into Hive (see HIVE-4660, design doc has already been uploaded - any feedback will be much appreciated). This will be a fair amount of work that will take time to stabilize/test. I'd like to propose creating a branch in order to be able to do this incrementally and collaboratively. In order to progress rapidly with this, I would also like to go commit-then-review. Thanks, Gunther. These refactorings are largely destructive to a number of bug fixes and language improvements in hive. The language improvements and bug fixes have been sitting in Jira for quite some time now, marked patch-available, waiting for review. There are a few things I want to point out: 1) Normally we create design docs in our wiki (which this is not) 2) Normally when the change is significantly complex we get multiple committers to comment on it (which we did not) On point 2, no one -1'd the branch, but this is really something that should have required a +1 from 3 committers. The Hive bylaws, https://cwiki.apache.org/confluence/display/Hive/Bylaws , lay out what votes are needed for what. I don't see anything there about needing 3 +1s for a branch. Branching would seem to fall under code change, which requires one vote and a minimum length of 1 day. I for one am not completely sold on Tez. http://incubator.apache.org/projects/tez.html. "directed-acyclic-graph of tasks for processing data" - this description sounds like many things which have never become popular. One that comes to mind is Oozie: "Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions." 
I am sure I can find a number of libraries/frameworks that make this same claim. In general I do not feel like we have done our homework and prerequisites to justify all this work. If we have done the homework, it has not been communicated to and accepted by hive developers at large. A request for better documentation on Tez and a project road map seems totally reasonable. If we have a branch, why are we also committing on trunk? Scanning through the tez doc, I keep finding language like "minimal changes to the planner" - yet there are ALREADY lots of large changes going on! Really, none of the above would bother me except for the fact that these "minimal" changes are causing many patch-available, ready-for-review bugs and core hive features to need to be rebased. I am sure I have mentioned this before, but I have to spend 12+ hours to test a single patch on my laptop. A few days ago I was testing a new core hive feature. After all the tests passed and before I was able to commit, someone unleashed a tez patch on trunk which caused the thing I was testing for 12 hours to need to be rebased. I'm not cool with this. Next time that happens to me I will seriously consider reverting the patch. Bug fixes and new hive features are more important to me than integrating with incubator projects. (With my Apache member hat on) Reverting patches that aren't breaking the build is considered very bad form in Apache. It does make sense to request that when people are going to commit a patch that will break many other patches they first give a few hours of notice, so people can say something if they're about to commit another patch and avoid your fate of needing to rerun the tests. The other thing is we need to get the automated build of patches working on Hive so committers aren't forced to run all of the tests themselves. We are working on it, but we're not there yet. Alan.
[jira] [Created] (HIVE-4862) Create automatic Patch Available testing
Brock Noland created HIVE-4862: -- Summary: Create automatic Patch Available testing Key: HIVE-4862 URL: https://issues.apache.org/jira/browse/HIVE-4862 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland It'd be awesome if we used the new ptest2 environment to automatically test patches.
[jira] [Commented] (HIVE-4862) Create automatic Patch Available testing
[ https://issues.apache.org/jira/browse/HIVE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709298#comment-13709298 ] Brock Noland commented on HIVE-4862: I think I can get this working quite soon. Create automatic Patch Available testing -- Key: HIVE-4862 URL: https://issues.apache.org/jira/browse/HIVE-4862 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland It'd be awesome if we used the new ptest2 environment to automatically test patches.
[jira] [Updated] (HIVE-4820) webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure
[ https://issues.apache.org/jira/browse/HIVE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4820: - Status: Open (was: Patch Available) webhcat_config.sh should set default values for HIVE_HOME and HCAT_PREFIX that work with default build tree structure - Key: HIVE-4820 URL: https://issues.apache.org/jira/browse/HIVE-4820 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE4820.patch Currently they are expected to be set by the user, which makes development inconvenient. It makes writing unit tests for WebHCat more difficult as well.
[jira] [Commented] (HIVE-1511) Hive plan serialization is slow
[ https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709312#comment-13709312 ] Edward Capriolo commented on HIVE-1511: --- Maybe protobuf, since we have it in trunk now. Hive plan serialization is slow --- Key: HIVE-1511 URL: https://issues.apache.org/jira/browse/HIVE-1511 Project: Hive Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Ning Zhang Attachments: HIVE-1511.patch As reported by Edward Capriolo: For reference I did this as a test case SELECT * FROM src where key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR key=0 OR ...(100 more of these) No OOM, but I gave up after the test case did not go anywhere for about 2 minutes.