[jira] [Updated] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4222: - Attachment: HIVE-4222.D9681.2.patch Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch For example,
{noformat}
ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100));
String serialized = Utilities.serializeExpression(constant);
ExprNodeConstantDesc deserialized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration());
{noformat}
logs the error message
{noformat}
java.lang.InstantiationException: java.sql.Timestamp
Continuing ...
java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
Continuing ...
{noformat}
and finally results in an NPE. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716137#comment-13716137 ] Jason Dere commented on HIVE-4222: -- Updated patch with home-rolled PersistenceDelegate
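For context: Utilities.serializeExpression is built on java.beans.XMLEncoder, which by default cannot instantiate java.sql.Timestamp because the class has no no-argument constructor. The following is a minimal, self-contained sketch of the "home-rolled PersistenceDelegate" idea (illustrative only, not the attached patch; the class and method names here are made up):

```java
import java.beans.Encoder;
import java.beans.Expression;
import java.beans.PersistenceDelegate;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.sql.Timestamp;

public class TimestampXmlDemo {
    public static Timestamp roundTrip(Timestamp ts) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        XMLEncoder enc = new XMLEncoder(bos);
        // java.sql.Timestamp has no no-arg constructor, so the default bean
        // persistence fails; supply a delegate that reconstructs the value
        // through the Timestamp(long) constructor instead.
        enc.setPersistenceDelegate(Timestamp.class, new PersistenceDelegate() {
            @Override
            protected Expression instantiate(Object oldInstance, Encoder out) {
                return new Expression(oldInstance, Timestamp.class, "new",
                        new Object[] { ((Timestamp) oldInstance).getTime() });
            }
        });
        enc.writeObject(ts);
        enc.close();
        XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(bos.toByteArray()));
        Timestamp decoded = (Timestamp) dec.readObject();
        dec.close();
        return decoded;
    }

    public static void main(String[] args) {
        Timestamp original = new Timestamp(100);
        Timestamp copy = roundTrip(original);
        System.out.println(copy.getTime()); // 100
    }
}
```

The real patch also has to handle the nanosecond component; the sketch only shows the delegate mechanism.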
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716145#comment-13716145 ] Chris Drome commented on HIVE-4911: --- [~brocknoland], I marked this patch as superseding HIVE-4225. HIVE-4225 only addresses the fact that HS2 was ignoring the hadoop.rpc.protection setting. The major limitation of HIVE-4225 is that it applies the QOP setting to both external and internal connections. HIVE-4911 improves on this by allowing separate configuration of external and internal connections. An example of where this is important is when the HS2 client connection must be encrypted, but the connection between HS2 and the JT/NN does not require encryption. Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: HIVE-4911-trunk-0.patch The QOP for Hive Server 2 should be configurable to enable encryption. A new configuration option, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the Hive Server 2 service.
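For reference, hadoop.rpc.protection-style levels correspond to the standard SASL QOP strings (the values passed under javax.security.sasl.Sasl.QOP). A hedged sketch of that mapping follows; the class and method names are illustrative, not Hive's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: translating a hadoop.rpc.protection-style setting into the
// standard SASL QOP string. Names are illustrative only.
public class SaslQopMapping {
    public static String toSaslQop(String protection) {
        Map<String, String> qop = new HashMap<String, String>();
        qop.put("authentication", "auth");  // authentication only
        qop.put("integrity", "auth-int");   // adds integrity checking
        qop.put("privacy", "auth-conf");    // adds encryption
        String value = qop.get(protection);
        if (value == null) {
            throw new IllegalArgumentException("Unknown protection level: " + protection);
        }
        return value;
    }

    public static void main(String[] args) {
        // The "encrypted client connection" case from the comment above:
        System.out.println(toSaslQop("privacy")); // auth-conf
    }
}
```

Configuring external and internal connections separately then amounts to resolving two such settings to two (possibly different) QOP values.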
[jira] [Commented] (HIVE-2905) Desc table can't show non-ascii comments
[ https://issues.apache.org/jira/browse/HIVE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716174#comment-13716174 ] Navis commented on HIVE-2905: - added to the FAQ (https://cwiki.apache.org/confluence/display/Hive/User+FAQ#UserFAQ-DoesHivesupportUnicode%3F) Desc table can't show non-ascii comments Key: HIVE-2905 URL: https://issues.apache.org/jira/browse/HIVE-2905 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.7.0, 0.10.0 Environment: hive 0.7.0, mysql 5.1.45 hive 0.10.0, mysql 5.5.30 Reporter: Sheng Zhou Labels: patch Attachments: HIVE-2905.D11487.1.patch, utf8-desc-comment.patch When describing a table via the command line or Hive JDBC, the table's comment cannot be read. 1. I have updated the javax.jdo.option.ConnectionURL parameter in the hive-site.xml file: jdbc:mysql://*.*.*.*:3306/hive?characterEncoding=UTF-8 2. In the MySQL database, the comment field of the COLUMNS table can be read normally.
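A sketch of the configuration from step 1, building the javax.jdo.option.ConnectionURL value in code (the helper name is made up; useUnicode=true is a commonly paired MySQL Connector/J parameter, added here as an assumption, not taken from the report):

```java
// Sketch: building the metastore JDBC URL with an explicit character
// encoding so non-ASCII table comments survive the round trip through
// MySQL. Helper name is illustrative; useUnicode=true is an assumption.
public class MetastoreUrlDemo {
    public static String metastoreUrl(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db
             + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(metastoreUrl("127.0.0.1", 3306, "hive"));
    }
}
```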
[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4864: Affects Version/s: 0.9.0 Status: Patch Available (was: Open) Code Comments seems confused between GenericUDFCase GenericUDFWhen Key: HIVE-4864 URL: https://issues.apache.org/jira/browse/HIVE-4864 Project: Hive Issue Type: Task Affects Versions: 0.9.0 Reporter: Cheng Hao Priority: Trivial Attachments: 1.patch Code comment in GenericUDFCase:
/**
 * GenericUDF Class for SQL construct
 * CASE WHEN a THEN b WHEN c THEN d [ELSE f] END.
 *
 * NOTES: 1. a and c should be boolean, or an exception will be thrown. 2. b, d
 * and f should have the same TypeInfo, or an exception will be thrown.
 */
And the code comment in GenericUDFWhen:
/**
 * GenericUDF Class for SQL construct CASE a WHEN b THEN c [ELSE f] END.
 *
 * NOTES: 1. a and b should have the same TypeInfo, or an exception will be
 * thrown. 2. c and f should have the same TypeInfo, or an exception will be
 * thrown.
 */
From the code itself, it seems the comments should be swapped. We'd better amend them to avoid confusion.
[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HIVE-4864: Attachment: 1.patch Please check the attachment.
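To make the mix-up concrete: the searched form (CASE WHEN a THEN b ... END) tests boolean conditions, while the simple form (CASE a WHEN b THEN c ... END) compares one expression for equality against each WHEN operand. A toy sketch in plain Java, with invented names standing in for Hive's evaluators:

```java
public class CaseFormsDemo {
    // Searched form: CASE WHEN a THEN b WHEN c THEN d [ELSE f] END
    // -- the WHEN operands (a, c) are boolean conditions.
    static String searchedCase(int x) {
        if (x < 0) return "negative";
        if (x > 0) return "positive";
        return "zero";
    }

    // Simple form: CASE a WHEN b THEN c [ELSE f] END
    // -- a is compared for equality against each WHEN operand (b).
    static String simpleCase(String code) {
        if ("r".equals(code)) return "red";
        if ("g".equals(code)) return "green";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(searchedCase(-5)); // negative
        System.out.println(simpleCase("g"));  // green
    }
}
```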
[jira] [Updated] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4825: - Attachment: HIVE-4825.4.patch .4 is rebased to trunk Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors, etc. For Tez and MRR it will be useful to break this into map- and reduce-specific pieces. The separation is natural and I think has value in itself; it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez, where you'll have a graph of these instead of just 1M and 0-1R.
[jira] [Created] (HIVE-4916) Add TezWork
Gunther Hagleitner created HIVE-4916: Summary: Add TezWork Key: HIVE-4916 URL: https://issues.apache.org/jira/browse/HIVE-4916 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner TezWork is the class that encapsulates all the info needed to execute a single Tez job (i.e.: a dag of map or reduce work).
[jira] [Updated] (HIVE-4916) Add TezWork
[ https://issues.apache.org/jira/browse/HIVE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4916: - Fix Version/s: tez-branch
[jira] [Updated] (HIVE-4916) Add TezWork
[ https://issues.apache.org/jira/browse/HIVE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4916: - Attachment: HIVE-4916.1.patch.branch
[jira] [Created] (HIVE-4917) Tez Job Monitoring
Gunther Hagleitner created HIVE-4917: Summary: Tez Job Monitoring Key: HIVE-4917 URL: https://issues.apache.org/jira/browse/HIVE-4917 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch TezJobMonitor handles monitoring the execution of a Tez dag
[jira] [Updated] (HIVE-4917) Tez Job Monitoring
[ https://issues.apache.org/jira/browse/HIVE-4917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4917: - Attachment: HIVE-4917.1.patch.branch
[jira] [Created] (HIVE-4918) Tez job submission
Gunther Hagleitner created HIVE-4918: Summary: Tez job submission Key: HIVE-4918 URL: https://issues.apache.org/jira/browse/HIVE-4918 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: tez-branch This patch creates the infrastructure to submit a Tez dag (i.e.: TezTask + utils to convert work into a Tez dag).
[jira] [Updated] (HIVE-4918) Tez job submission
[ https://issues.apache.org/jira/browse/HIVE-4918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4918: - Attachment: HIVE-4918.1.patch.branch
[jira] [Updated] (HIVE-4660) Let there be Tez
[ https://issues.apache.org/jira/browse/HIVE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4660: - Description: Tez is a new application framework built on Hadoop Yarn that can execute complex directed acyclic graphs of general data processing tasks. Here's the project's page: http://incubator.apache.org/projects/tez.html The interesting thing about Tez from Hive's perspective is that it will over time allow us to overcome inefficiencies in query processing due to having to express every algorithm in the map-reduce paradigm. The barrier to entry is pretty low as well: Tez can actually run unmodified MR jobs; But as a first step we can without much trouble start using more of Tez' features by taking advantage of the MRR pattern. MRR simply means that there can be any number of reduce stages following a single map stage - without having to write intermediate results to HDFS and re-read them in a new job. This is common when queries require multiple shuffles on keys without correlation (e.g.: join - grp by - window function - order by) For more details see the design doc here: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez Let there be Tez Key: HIVE-4660 URL: https://issues.apache.org/jira/browse/HIVE-4660 Project: Hive Issue Type: New Feature Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner
[jira] [Updated] (HIVE-4660) Let there be Tez
[ https://issues.apache.org/jira/browse/HIVE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4660: - Description: Tez is a new application framework built on Hadoop Yarn that can execute complex directed acyclic graphs of general data processing tasks. Here's the project's page: http://incubator.apache.org/projects/tez.html The interesting thing about Tez from Hive's perspective is that it will over time allow us to overcome inefficiencies in query processing due to having to express every algorithm in the map-reduce paradigm. The barrier to entry is pretty low as well: Tez can actually run unmodified MR jobs; But as a first step we can without much trouble start using more of Tez' features by taking advantage of the MRR pattern. MRR simply means that there can be any number of reduce stages following a single map stage - without having to write intermediate results to HDFS and re-read them in a new job. This is common when queries require multiple shuffles on keys without correlation (e.g.: join - grp by - window function - order by) For more details see the design doc here:
Re: Review Request 11925: Hive-3159 Update AvroSerde to determine schema of new tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11925/ --- (Updated July 23, 2013, 9:51 a.m.) Review request for hive, Ashutosh Chauhan and Jakob Homan. Changes --- Updated with Jakob's feedback. Bugs: HIVE-3159 https://issues.apache.org/jira/browse/HIVE-3159 Repository: hive-git Description --- Problem: Hive doesn't support creating an Avro-based table using the HQL CREATE TABLE command. It currently requires specifying an Avro schema literal or schema file name, which is very inconvenient for users in many cases. Some of the unsupported use cases: 1. Create table ... Avro-SERDE etc. as SELECT ... from NON-AVRO FILE 2. Create table ... Avro-SERDE etc. as SELECT from AVRO TABLE 3. Create table without specifying Avro schema. Diffs (updated) -
ql/src/test/queries/clientpositive/avro_create_as_select.q PRE-CREATION
ql/src/test/queries/clientpositive/avro_create_as_select2.q PRE-CREATION
ql/src/test/queries/clientpositive/avro_no_schema_test.q PRE-CREATION
ql/src/test/queries/clientpositive/avro_without_schema.q PRE-CREATION
ql/src/test/results/clientpositive/avro_create_as_select.q.out PRE-CREATION
ql/src/test/results/clientpositive/avro_create_as_select2.q.out PRE-CREATION
ql/src/test/results/clientpositive/avro_no_schema_test.q.out PRE-CREATION
ql/src/test/results/clientpositive/avro_without_schema.q.out PRE-CREATION
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 13848b6
serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 010f614
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION
Diff: https://reviews.apache.org/r/11925/diff/ Testing --- Wrote a new Java test class for a new Java class. Added a new test case to an existing Java test class. In addition, there are 4 .q files for testing multiple use cases. Thanks, Mohammad Islam
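As a rough illustration of what a TypeInfoToSchema conversion involves, here is a hedged sketch of the primitive-type portion only. The class name and the table below are invented for illustration; the real converter must also handle complex types (structs, maps, lists, unions) and nullability:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: Hive primitive type names mapped to Avro schema
// type strings. Hypothetical helper, not the class under review.
public class HiveToAvroPrimitives {
    private static final Map<String, String> MAPPING = new HashMap<String, String>();
    static {
        MAPPING.put("string", "string");
        MAPPING.put("int", "int");
        MAPPING.put("bigint", "long");
        MAPPING.put("smallint", "int");  // Avro has no 16-bit integer type
        MAPPING.put("tinyint", "int");   // nor an 8-bit one
        MAPPING.put("boolean", "boolean");
        MAPPING.put("float", "float");
        MAPPING.put("double", "double");
        MAPPING.put("binary", "bytes");
    }

    public static String avroType(String hiveType) {
        String avro = MAPPING.get(hiveType);
        if (avro == null) {
            throw new IllegalArgumentException("Unmapped Hive type: " + hiveType);
        }
        return avro;
    }

    public static void main(String[] args) {
        System.out.println(avroType("bigint")); // long
    }
}
```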
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716282#comment-13716282 ] Hive QA commented on HIVE-4825: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593669/HIVE-4825.4.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/149/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/149/console Messages:
{noformat}
Executing org.apache.hive.ptest.execution.CleanupPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests failed with: IllegalStateException: Too many bad hosts: 1.0% (10 / 10) is greater than threshold of 50%
{noformat}
This message is automatically generated.
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716430#comment-13716430 ] Brock Noland commented on HIVE-4825: There was a large price spike in spot instances overnight. I kicked this off again. Also that error message needs to be cleaned up.
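The odd "1.0% (10 / 10)" in the QA output looks like a fraction formatted as if it were already a percentage. A hedged sketch of the kind of cleanup meant here (the helper is hypothetical, not ptest2 code):

```java
import java.util.Locale;

// Sketch: format the bad-host ratio as a real percentage rather than
// printing the raw fraction with a percent sign. Hypothetical helper.
public class BadHostMessage {
    public static String format(int bad, int total) {
        double pct = 100.0 * bad / total; // scale the fraction to percent
        return String.format(Locale.ROOT,
                "Too many bad hosts: %.1f%% (%d / %d)", pct, bad, total);
    }

    public static void main(String[] args) {
        System.out.println(format(10, 10)); // Too many bad hosts: 100.0% (10 / 10)
    }
}
```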
[jira] [Commented] (HIVE-4907) Allow additional tests cases to be specified with -Dtestcase
[ https://issues.apache.org/jira/browse/HIVE-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716438#comment-13716438 ] Brock Noland commented on HIVE-4907: Yes, this is exactly what I was looking for. Allow additional tests cases to be specified with -Dtestcase Key: HIVE-4907 URL: https://issues.apache.org/jira/browse/HIVE-4907 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4907.patch Currently we only allow a single test case to be specified with -Dtestcase. It'd be ideal if we could add additional test cases, as this would allow us to batch the unit tests in ptest2.
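The idea can be sketched as splitting the -Dtestcase property value on commas so it can name several tests (illustrative only, not the attached patch):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: accept several test case names in one -Dtestcase value,
// e.g. -Dtestcase=TestCliDriver,TestParse. Names are illustrative.
public class TestCaseArg {
    public static List<String> parse(String testcaseProperty) {
        // Split on commas, tolerating stray whitespace around them.
        return Arrays.asList(testcaseProperty.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        System.out.println(parse("TestCliDriver, TestParse"));
    }
}
```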
[jira] [Updated] (HIVE-4907) Allow additional tests cases to be specified with -Dtestcase
[ https://issues.apache.org/jira/browse/HIVE-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4907: --- Status: Open (was: Patch Available)
[jira] [Commented] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716449#comment-13716449 ] Ashutosh Chauhan commented on HIVE-4864: [~chenghao] Can you put these comments in the Description annotation (see other UDFs for an example)? That way they will show up to users when they run describe on these UDFs.
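For illustration, a Description annotation on a UDF class looks roughly like the sketch below. The annotation is re-declared locally so the sketch compiles on its own; Hive's real annotation is org.apache.hadoop.hive.ql.exec.Description, and the UDF class here is a made-up stand-in:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for Hive's Description annotation so the sketch is
// self-contained. Hive reads these fields to answer DESCRIBE FUNCTION.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Description {
    String name();
    String value();
    String extended() default "";
}

// Hypothetical UDF class carrying the searched-CASE documentation.
@Description(name = "when",
    value = "CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - "
          + "returns b when a is true, d when c is true, else e")
class GenericUDFWhenSketch { }

public class DescriptionDemo {
    public static void main(String[] args) {
        Description d = GenericUDFWhenSketch.class.getAnnotation(Description.class);
        System.out.println(d.name()); // when
    }
}
```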
[jira] [Updated] (HIVE-4864) Code Comments seems confused between GenericUDFCase GenericUDFWhen
[ https://issues.apache.org/jira/browse/HIVE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4864: --- Assignee: Cheng Hao Affects Version/s: 0.10.0 0.11.0 Status: Open (was: Patch Available)
Does HiveServer2 support delegation token?
Hi all, HiveMetastore supports delegation tokens. Does HiveServer2 support them as well? If not, do we have a plan for this? Besides, the hive wiki says: hive.server2.authentication - Authentication mode, default NONE. Options are NONE, KERBEROS, LDAP and CUSTOM. Will HiveServer2 support PAM, which could be configured to use multiple authentication methods such as OS accounts or LDAP, as well? Thanks, - Bing
[jira] [Commented] (HIVE-4670) Authentication module should pass the instance part of the Kerberos principal
[ https://issues.apache.org/jira/browse/HIVE-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716468#comment-13716468 ] Ashutosh Chauhan commented on HIVE-4670: The primary use case for the remote user variable is audit logging. Isn't it useful to have the realm in there as well? Authentication module should pass the instance part of the Kerberos principal - Key: HIVE-4670 URL: https://issues.apache.org/jira/browse/HIVE-4670 Project: Hive Issue Type: Bug Components: Authentication, HiveServer2 Affects Versions: 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4670.2.patch, HIVE-4670.3.patch When Kerberos authentication is enabled for HiveServer2, the thrift SASL layer passes instance@realm from the principal. It should instead strip the realm and pass just the instance part of the principal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
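The fix HIVE-4670 describes amounts to a string transformation: the SASL layer yields "instance@REALM", and only the part before the '@' should be reported. This is a sketch of that idea under assumed names (KerberosNameUtil and stripRealm are illustrative, not the actual patch):

```java
// Sketch of the behavior described in HIVE-4670: given the remote user string
// the thrift SASL layer yields (e.g. "instance@REALM"), strip the realm so
// only the instance part is passed on. Class and method names are
// illustrative, not taken from the patch.
class KerberosNameUtil {
  // "instance@REALM" -> "instance"; strings without a realm pass through.
  static String stripRealm(String principal) {
    int at = principal.indexOf('@');
    return at < 0 ? principal : principal.substring(0, at);
  }

  public static void main(String[] args) {
    System.out.println(stripRealm("node1.example.com@EXAMPLE.COM"));
  }
}
```

Ashutosh's follow-up question is exactly about this trade-off: dropping the realm simplifies the user name, but an audit log may want the full principal.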
[jira] [Commented] (HIVE-4225) HiveServer2 does not support SASL QOP
[ https://issues.apache.org/jira/browse/HIVE-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716482#comment-13716482 ] Joey Echeverria commented on HIVE-4225: --- I haven't had a chance to review HIVE-4911 but so long as it works for both HS2 and the Metastore Server, I'm good with having an independent configuration. HiveServer2 does not support SASL QOP - Key: HIVE-4225 URL: https://issues.apache.org/jira/browse/HIVE-4225 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.11.0 Reporter: Chris Drome Assignee: Chris Drome Attachments: HIVE-4225-1.patch, HIVE-4225.D10959.1.patch, HIVE-4225.patch HiveServer2 implements Kerberos authentication through SASL framework, but does not support setting QOP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
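For context on what "setting QOP" means at the SASL layer: a server passes javax.security.sasl.Sasl.QOP in the properties map when creating its SaslServer, choosing "auth" (authentication only), "auth-int" (adds integrity), or "auth-conf" (adds confidentiality, i.e. encryption). The helper below is a minimal sketch of building that map; how Hive wires it through its shims is the subject of the patches on this issue and is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Minimal sketch: the SASL properties a server would pass when creating a
// SaslServer, with the QOP made configurable. forQop() is an illustrative
// helper, not a Hive API.
class SaslQopProps {
  static Map<String, String> forQop(String qop) {
    Map<String, String> props = new HashMap<>();
    props.put(Sasl.QOP, qop);            // "auth", "auth-int", or "auth-conf"
    props.put(Sasl.SERVER_AUTH, "true"); // require mutual authentication
    return props;
  }

  public static void main(String[] args) {
    System.out.println(forQop("auth-conf").get(Sasl.QOP));
  }
}
```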
[jira] [Commented] (HIVE-4871) Apache builds fail with Target make-pom does not exist in the project hcatalog.
[ https://issues.apache.org/jira/browse/HIVE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716488#comment-13716488 ] Ashutosh Chauhan commented on HIVE-4871: This is still publishing hcatalog artifacts in separate namespace? Apache builds fail with Target make-pom does not exist in the project hcatalog. --- Key: HIVE-4871 URL: https://issues.apache.org/jira/browse/HIVE-4871 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4871.patch Original Estimate: 168h Remaining Estimate: 168h For example, https://builds.apache.org/job/Hive-trunk-h0.21/2192/console. All unit tests pass, but deployment of build artifacts fails. HIVE-4387 provided a bandaid for 0.11. Need to figure out long term fix for this for 0.12. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4871) Apache builds fail with Target make-pom does not exist in the project hcatalog.
[ https://issues.apache.org/jira/browse/HIVE-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716508#comment-13716508 ] Eugene Koifman commented on HIVE-4871: -- yes. I'll change the maven coordinates when I change the package name (later this week) Apache builds fail with Target make-pom does not exist in the project hcatalog. --- Key: HIVE-4871 URL: https://issues.apache.org/jira/browse/HIVE-4871 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4871.patch Original Estimate: 168h Remaining Estimate: 168h For example, https://builds.apache.org/job/Hive-trunk-h0.21/2192/console. All unit tests pass, but deployment of build artifacts fails. HIVE-4387 provided a bandaid for 0.11. Need to figure out long term fix for this for 0.12. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4
[ https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716546#comment-13716546 ] Ashutosh Chauhan commented on HIVE-4547: +1 A complex create view statement fails with new Antlr 3.4 Key: HIVE-4547 URL: https://issues.apache.org/jira/browse/HIVE-4547 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar A complex create view statement with CAST in join condition fails with IllegalArgumentException error. This is exposed by the Antlr 3.4 upgrade (HIVE-2439). The same statement works fine with Hive 0.9 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4892) PTest2 cleanup after merge
[ https://issues.apache.org/jira/browse/HIVE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716550#comment-13716550 ] Ashutosh Chauhan commented on HIVE-4892: There are a few new files in there which look like test logs. Are those needed? PTest2 cleanup after merge -- Key: HIVE-4892 URL: https://issues.apache.org/jira/browse/HIVE-4892 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4892.patch, HIVE-4892.patch HIVE-4675 was merged but there are still a few minor issues we need to clean up:
* README is out of date
* Need to limit the number of failed source directories we copy back from the slaves
* when looking for TEST-*.xml files we look at both the log directory (good) and the failed source directories (bad), therefore duplicating failures in the jenkins report
* We need to process bad hosts in the finally block of PTest.run (HIVE-4882)
* Need a mechanism to clean the ivy and maven cache (HIVE-4882)
* PTest2 fails to publish a comment to a JIRA sometimes (HIVE-4889)
* Now that PTest2 is committed to the source tree it's copying in our TEST-SomeTest*.xml files
Test Properties: NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716555#comment-13716555 ] Ashutosh Chauhan commented on HIVE-4611: [~vikram.dixit] Can you create phabricator or RB entry for this? SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4892) PTest2 cleanup after merge
[ https://issues.apache.org/jira/browse/HIVE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716564#comment-13716564 ] Brock Noland commented on HIVE-4892: Hi, PTest2 parses the TEST-\*.xml logs and the patch does include sample TEST outputs for unit testing purposes. Unfortunately in the current version they are named TEST..xml in the source tree, which is causing ptest2 issues when it finds these outputs. This patch renames them to remove the TEST prefix. There are also a few other very small outputs such as hive.log for the same purpose. These already exist in the source tree but are being renamed. When we get this committed I can submit a performance improvement patch that should increase throughput of the pre-commit tests by 2x. PTest2 cleanup after merge -- Key: HIVE-4892 URL: https://issues.apache.org/jira/browse/HIVE-4892 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4892.patch, HIVE-4892.patch HIVE-4675 was merged but there are still a few minor issues we need to clean up:
* README is out of date
* Need to limit the number of failed source directories we copy back from the slaves
* when looking for TEST-*.xml files we look at both the log directory (good) and the failed source directories (bad), therefore duplicating failures in the jenkins report
* We need to process bad hosts in the finally block of PTest.run (HIVE-4882)
* Need a mechanism to clean the ivy and maven cache (HIVE-4882)
* PTest2 fails to publish a comment to a JIRA sometimes (HIVE-4889)
* Now that PTest2 is committed to the source tree it's copying in our TEST-SomeTest*.xml files
Test Properties: NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716569#comment-13716569 ] Vikram Dixit K commented on HIVE-4611: -- Review board request: https://reviews.apache.org/r/12827/ SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716573#comment-13716573 ] Eric Hanson commented on HIVE-4642: --- Hi Teddy, how's it going with this? When do you think you can finish this up? Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716612#comment-13716612 ] Brock Noland commented on HIVE-4911: Arup, Does this work for both [HS2 and HMS|https://issues.apache.org/jira/browse/HIVE-4225?focusedCommentId=13716482page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13716482]? Also, in regards to SaslQOP, is there a reason you don't use valueOf() as opposed to implementing fromString()? Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: HIVE-4911-trunk-0.patch The QoP for HiveServer2 should be configurable to enable encryption. A new configuration, hive.server2.thrift.rpc.protection, should be exposed. This would give greater control when configuring the HiveServer2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
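On Brock's valueOf() question: one plausible reason for a hand-rolled fromString() is that the wire values ("auth", "auth-int", "auth-conf") are lowercase and contain '-', which is not a legal Java identifier, so Enum.valueOf() cannot map a config value to a constant directly. The enum below is a sketch of that pattern, not the class from the patch:

```java
// Sketch of a SaslQOP enum where the constant name and the SASL wire value
// differ. Enum.valueOf("auth-conf") would throw IllegalArgumentException
// (no constant is named "auth-conf"), so a fromString() that matches the
// wire value case-insensitively is needed. Illustrative only.
enum SaslQOP {
  AUTH("auth"), AUTH_INT("auth-int"), AUTH_CONF("auth-conf");

  final String saslQop;  // the value actually passed to the SASL layer
  SaslQOP(String saslQop) { this.saslQop = saslQop; }

  static SaslQOP fromString(String s) {
    for (SaslQOP q : values()) {
      if (q.saslQop.equalsIgnoreCase(s)) return q;
    }
    throw new IllegalArgumentException("Unknown SASL QOP: " + s);
  }

  public static void main(String[] args) {
    System.out.println(fromString("AUTH-CONF"));  // case-insensitive match
  }
}
```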
[jira] [Commented] (HIVE-4892) PTest2 cleanup after merge
[ https://issues.apache.org/jira/browse/HIVE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716629#comment-13716629 ] Ashutosh Chauhan commented on HIVE-4892: +1 PTest2 cleanup after merge -- Key: HIVE-4892 URL: https://issues.apache.org/jira/browse/HIVE-4892 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4892.patch, HIVE-4892.patch HIVE-4675 was merged but there are still a few minor issues we need to cleanup:
* README is out of date
* Need to limit the number of failed source directories we copy back from the slaves
* when looking for TEST-*.xml files we look at both the log directory (good) and the failed source directories (bad) therefore duplicating failures in jenkins report
* We need to process bad hosts in the finally block of PTest.run (HIVE-4882)
* Need a mechanism to clean the ivy and maven cache (HIVE-4882)
* PTest2 fails to publish a comment to a JIRA sometimes (HIVE-4889)
* Now that PTest2 is committed to the source tree it's copying in our TEST-SomeTest*.xml files
Test Properties: NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-4222: - Attachment: HIVE-4222.D9681.3.patch Update patch to remove date test, since Date type does not yet exist on trunk. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
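The InstantiationException in this issue happens because java.sql.Timestamp has no no-arg constructor, so XMLEncoder's default persistence delegate cannot re-create it on JDK 1.6 and earlier (JDK 7 ships a built-in delegate for it). The "home-rolled PersistenceDelegate" approach mentioned for patch 2 can be sketched as below; this is a self-contained illustration, not the patch itself, and for brevity it restores only the millisecond value, whereas a complete delegate would restore the nanos field too.

```java
import java.beans.Encoder;
import java.beans.Expression;
import java.beans.PersistenceDelegate;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.sql.Timestamp;

// Illustration of a custom PersistenceDelegate for java.sql.Timestamp:
// tell XMLEncoder to reconstruct via new Timestamp(long) instead of the
// (nonexistent) no-arg constructor. Simplified: the nanos field is ignored.
class TimestampXmlDemo {
  static byte[] encode(Timestamp ts) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    XMLEncoder enc = new XMLEncoder(out);
    enc.setPersistenceDelegate(Timestamp.class, new PersistenceDelegate() {
      @Override
      protected Expression instantiate(Object o, Encoder e) {
        // Serialize as a call to new Timestamp(getTime()).
        return new Expression(o, o.getClass(), "new",
            new Object[] { ((Timestamp) o).getTime() });
      }
    });
    enc.writeObject(ts);
    enc.close();
    return out.toByteArray();
  }

  static Timestamp decode(byte[] xml) {
    XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(xml));
    Timestamp ts = (Timestamp) dec.readObject();
    dec.close();
    return ts;
  }

  public static void main(String[] args) {
    Timestamp round = decode(encode(new Timestamp(100)));
    System.out.println(round.getTime());
  }
}
```

Without the setPersistenceDelegate call, the decode of a Timestamp fails on JDK 1.6 with exactly the InstantiationException quoted in the issue description.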
[jira] [Commented] (HIVE-2702) listPartitionsByFilter only supports string partitions for equals
[ https://issues.apache.org/jira/browse/HIVE-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716638#comment-13716638 ] Phabricator commented on HIVE-2702: --- ashutoshc has accepted the revision HIVE-2702 [jira] listPartitionsByFilter only supports string partitions. +1. Sergey, How did the tests go? REVISION DETAIL https://reviews.facebook.net/D11715 BRANCH HIVE-2702 ARCANIST PROJECT hive To: JIRA, ashutoshc, sershe listPartitionsByFilter only supports string partitions for equals - Key: HIVE-2702 URL: https://issues.apache.org/jira/browse/HIVE-2702 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Aniket Mokashi Assignee: Sergey Shelukhin Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2702.D2043.1.patch, HIVE-2702.1.patch, HIVE-2702.D11715.1.patch, HIVE-2702.D11715.2.patch, HIVE-2702.D11715.3.patch, HIVE-2702.patch, HIVE-2702-v0.patch listPartitionsByFilter supports only string partitions. This is because it's explicitly specified in generateJDOFilterOverPartitions in ExpressionTree.java:
//Can only support partitions whose types are string
if( ! table.getPartitionKeys().get(partitionColumnIndex).
    getType().equals(org.apache.hadoop.hive.serde.Constants.STRING_TYPE_NAME) ) {
  throw new MetaException("Filtering is supported only on partition keys of type string");
}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4611: - Status: Patch Available (was: Open) SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4611: - Status: Open (was: Patch Available) SMB joins fail based on bigtable selection policy. -- Key: HIVE-4611 URL: https://issues.apache.org/jira/browse/HIVE-4611 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.11.1 Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, HIVE-4611.patch The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716667#comment-13716667 ] Hive QA commented on HIVE-4825: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593669/HIVE-4825.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2647 tests executed *Failed tests:* {noformat} org.apache.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/150/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/150/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
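The HIVE-4825 description above can be pictured as a small class split. The sketch below is heavily simplified and all field names are illustrative, not the actual classes from the patch: the monolithic plan object becomes a map-side piece plus a reduce-side piece, so a classic MR job is "one MapWork and zero or one ReduceWork", while Tez/MRR can hold a whole graph of such pieces.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Map-side half of the plan: input paths, table aliases, map operator tree, etc.
class MapWork {
  Map<String, List<String>> pathToAliases = new LinkedHashMap<>();  // input dir -> aliases
}

// Reduce-side half: reducer count and the tagging info used to route join inputs.
class ReduceWork {
  int numReduceTasks;  // 0 or 1 reducer in classic MR; a Tez DAG can chain many
  Map<Integer, String> tagToAlias = new LinkedHashMap<>();
}

// MapredWork becomes a thin pair of the two, per the issue description.
class MapredWork {
  MapWork mapWork = new MapWork();
  ReduceWork reduceWork;  // null for map-only jobs

  boolean isMapOnly() { return reduceWork == null; }

  public static void main(String[] args) {
    MapredWork w = new MapredWork();
    System.out.println(w.isMapOnly());
  }
}
```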
HIVE-4266 - Refactor HCatalog code to org.apache.hive.hcatalog
I'm planning to change the package name of all hcatalog classes sometime this week (as was promised for 0.12). This is likely to affect any outstanding hcatalog patches on trunk. Please try to have them checked in as soon as possible. Thanks, Eugene
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716901#comment-13716901 ] Hive QA commented on HIVE-4222: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593736/HIVE-4222.D9681.3.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/151/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/151/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 1.0% (10 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716904#comment-13716904 ] Brock Noland commented on HIVE-4222: I kicked this off again. Interesting that we are seeing this twice today. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4684) Query with filter constant on left of = and column expression on right does not vectorize
[ https://issues.apache.org/jira/browse/HIVE-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4684: --- Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) Committed to branch. Thanks, Jitendra! Query with filter constant on left of = and column expression on right does not vectorize --- Key: HIVE-4684 URL: https://issues.apache.org/jira/browse/HIVE-4684 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Fix For: vectorization-branch Attachments: Hive-4684.0.patch, Hive-4684.1.patch, HIVE-4684.1.patch, HIVE-4684.2.patch, HIVE-4684.3.patch select dmachineid from factsqlengineam_vec_orc where 1073 = dmachineid + 1; Does not go down the vectorization path. Output: hive select dmachineid from factsqlengineam_vec_orc where 1073 = dmachineid + 1; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Validating if vectorized execution is applicable Cannot vectorize the plan: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hiv e.ql.exec.vector.expressions.gen.FilterLongScalarEqualLongColumn Starting Job = job_201306061504_0038, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201306061504_0038 Kill Command = c:\Hadoop\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd job -kill job_201306061504_0038 Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 0 2013-06-07 10:25:30,932 Stage-1 map = 0%, reduce = 0% 2013-06-07 10:25:39,953 Stage-1 map = 25%, reduce = 0% 2013-06-07 10:25:42,959 Stage-1 map = 49%, reduce = 0%, Cumulative CPU 8.172 sec 2013-06-07 10:25:43,962 Stage-1 map = 49%, reduce = 0%, Cumulative CPU 8.172 sec ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717501#comment-13717501 ] Hive QA commented on HIVE-4222: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593736/HIVE-4222.D9681.3.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/152/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/152/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests failed with: IllegalStateException: Too many bad hosts: 1.0% (10 / 10) is greater than threshold of 50% {noformat} This message is automatically generated. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4900) Fix the mismatched column names in package.jdo
[ https://issues.apache.org/jira/browse/HIVE-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717507#comment-13717507 ] Ashutosh Chauhan commented on HIVE-4900: I see. Then is this sufficient for us to upgrade to DN 3.x? If yes, then it seems there is no need for a DB upgrade (at least for MySQL). Is that correct? Fix the mismatched column names in package.jdo -- Key: HIVE-4900 URL: https://issues.apache.org/jira/browse/HIVE-4900 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4900.1.patch, HIVE-4900.2.patch, HIVE-4900.patch There are several errors in the DataNucleus O-R mapping file, package.jdo, which the existing DN version does not complain about. These errors may be subject to future DN complaints (as experienced in HIVE-3632 and HIVE-2084). However, it is still better if we fix these errors, as they also create some confusion in the community. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717514#comment-13717514 ] Brock Noland commented on HIVE-4222: Spot instance prices are insane today for the instance type we use. I've created HIVE-4920 to improve our handling of this. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
Brock Noland created HIVE-4920: -- Summary: PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Any Test Program for Testing ORCFile Code?
Hi All, Is there any test program for testing ORCFile? If yes, can someone please give instructions on how to invoke it? I simply want to step through the code to understand it quickly. Best, Riini
Re: Review Request 12795: [HIVE-4827] Merge a Map-only job to its following MapReduce job with multiple inputs
On July 23, 2013, 2:22 a.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java, line 4 https://reviews.apache.org/r/12795/diff/2/?file=324291#file324291line4 Don't we still need the copyright? We do not need the header. This copyright line was from the trunk. On July 23, 2013, 2:22 a.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java, line 420 https://reviews.apache.org/r/12795/diff/2/?file=324292#file324292line420 Why is it better to throw an exception here than simply return? If we only expect a TableScanOperator here, I think it may be better to throw an exception instead of swallowing the error. On July 23, 2013, 2:22 a.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java, line 446 https://reviews.apache.org/r/12795/diff/2/?file=324292#file324292line446 Why is this? Should work regardless, no? This part is from trunk. I will take a look and see why we need it - Yin --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12795/#review23671 --- On July 22, 2013, 4:19 a.m., Yin Huai wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12795/ --- (Updated July 22, 2013, 4:19 a.m.) Review request for hive.
Bugs: HIVE-4827 https://issues.apache.org/jira/browse/HIVE-4827 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-4827 Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java 66b84ff ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java f98878c ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 7cbb1ff ql/src/test/queries/clientpositive/correlationoptimizer7.q 9b18972 ql/src/test/queries/clientpositive/multiMapJoin2.q PRE-CREATION ql/src/test/results/clientpositive/auto_join33.q.out 8fc0e84 ql/src/test/results/clientpositive/correlationoptimizer1.q.out db3bd78 ql/src/test/results/clientpositive/correlationoptimizer3.q.out cebddff ql/src/test/results/clientpositive/correlationoptimizer4.q.out 285a54f ql/src/test/results/clientpositive/correlationoptimizer6.q.out c40a786 ql/src/test/results/clientpositive/correlationoptimizer7.q.out ea54431 ql/src/test/results/clientpositive/multiMapJoin1.q.out 3b3eb3f ql/src/test/results/clientpositive/multiMapJoin2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/12795/diff/ Testing --- Running tests. Thanks, Yin Huai
[jira] [Commented] (HIVE-4900) Fix the mismatched column names in package.jdo
[ https://issues.apache.org/jira/browse/HIVE-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717530#comment-13717530 ] Xuefu Zhang commented on HIVE-4900: --- That's correct. I don't think any DB upgrade is needed for the purpose of the DN upgrade; HIVE-4900.2.patch alone is needed for that. As mentioned in HIVE-2084, there needs to be an upgrade of the comment column for tables TYPE_FIELDS and COLUMNS_V2, which needs to be brought up from the current 256 to 4000. I'll create a separate JIRA for that, as it's irrelevant to the DN upgrade. Fix the mismatched column names in package.jdo -- Key: HIVE-4900 URL: https://issues.apache.org/jira/browse/HIVE-4900 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4900.1.patch, HIVE-4900.2.patch, HIVE-4900.patch There are several errors in DataNucleus O-R mapping file, package.jdo, which are not complained by the existing DN version. These errors may be subject to future DN complaint (as experienced in HIVE-3632 and HIVE-2084). However, it is still better if we fix these errors as it also creates some confusion in the community. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4921) Upgrade COMMENT column size in table COLUMNS_V2 and TYPE_FIELDS from 256 to 4000
Xuefu Zhang created HIVE-4921: - Summary: Upgrade COMMENT column size in table COLUMNS_V2 and TYPE_FIELDS from 256 to 4000 Key: HIVE-4921 URL: https://issues.apache.org/jira/browse/HIVE-4921 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0, 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang There are three tables in Hive metastore schema having COMMENT COLUMN: PARTITIION_KEYS, COLUMNS_V2, and TYPE_FIELDS, and their sizes are different. PARTITIION_KEYS.COMMENT has a size of 4000. To be consistent, and to make it more reasonable, we need to promote the column in other two tables from the current size (256) to 4000. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4900) Fix the mismatched column names in package.jdo
[ https://issues.apache.org/jira/browse/HIVE-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717545#comment-13717545 ] Ashutosh Chauhan commented on HIVE-4900: Make sense. Lets limit this jira for this fix only. This looks good to me. Do you plan to run more tests or is this ready to get in? Fix the mismatched column names in package.jdo -- Key: HIVE-4900 URL: https://issues.apache.org/jira/browse/HIVE-4900 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4900.1.patch, HIVE-4900.2.patch, HIVE-4900.patch There are several errors in DataNucleus O-R mapping file, package.jdo, which are not complained by the existing DN version. These errors may be subject to future DN complaint (as experienced in HIVE-3632 and HIVE-2084). However, it is still better if we fix these errors as it also creates some confusion in the community. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717549#comment-13717549 ] Edward Capriolo commented on HIVE-4920: --- What is the daily cost for running? With the backlog of patches we have we may be running for a bit. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4900) Fix the mismatched column names in package.jdo
[ https://issues.apache.org/jira/browse/HIVE-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717557#comment-13717557 ] Ashutosh Chauhan commented on HIVE-4900: Cool. +1 Fix the mismatched column names in package.jdo -- Key: HIVE-4900 URL: https://issues.apache.org/jira/browse/HIVE-4900 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4900.1.patch, HIVE-4900.2.patch, HIVE-4900.patch There are several errors in DataNucleus O-R mapping file, package.jdo, which are not complained by the existing DN version. These errors may be subject to future DN complaint (as experienced in HIVE-3632 and HIVE-2084). However, it is still better if we fix these errors as it also creates some confusion in the community. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Any Test Program for Testing ORCFile Code?
Hi Riini, The source code for ORC is in the org.apache.hadoop.hive.ql.io.orc package inside the ql/src/java folder. The test cases corresponding to ORC are in the ql/src/test folder under the same package structure. TestOrcFile should be a good starting point (for debugging); it reads/writes nested structs in ORC format. Thanks Prasanth On Jul 23, 2013, at 12:32 PM, Rini Kaushik kaush...@us.ibm.com wrote: Hi All, Is there any test program for testing ORCFile? If yes, can someone please give instructions on how to invoke it? I simply want to step through the code to understand it quickly. Best, Riini
[jira] [Commented] (HIVE-4900) Fix the mismatched column names in package.jdo
[ https://issues.apache.org/jira/browse/HIVE-4900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717555#comment-13717555 ] Xuefu Zhang commented on HIVE-4900: --- I completed my testing and I think it's ready. Fix the mismatched column names in package.jdo -- Key: HIVE-4900 URL: https://issues.apache.org/jira/browse/HIVE-4900 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.12.0 Attachments: HIVE-4900.1.patch, HIVE-4900.2.patch, HIVE-4900.patch There are several errors in DataNucleus O-R mapping file, package.jdo, which are not complained by the existing DN version. These errors may be subject to future DN complaint (as experienced in HIVE-3632 and HIVE-2084). However, it is still better if we fix these errors as it also creates some confusion in the community. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717581#comment-13717581 ] Brock Noland commented on HIVE-4920: It varies quite a bit but I think it averages to about $1-2 per hour. The past few days there have been large spikes in the prices of spot instances, which results in our slave instances being terminated. See attached. c1.xlarge seems to be more stable but also in less supply. I think that I will change to using c1.xlarge first, then fall back to m1.xlarge, and then to on-demand instances. PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4920) PTest2 spot instances should fall back on c1.xlarge and then on-demand instances
[ https://issues.apache.org/jira/browse/HIVE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4920: --- Attachment: Screen Shot 2013-07-23 at 3.35.00 PM.png PTest2 spot instances should fall back on c1.xlarge and then on-demand instances Key: HIVE-4920 URL: https://issues.apache.org/jira/browse/HIVE-4920 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Critical Attachments: Screen Shot 2013-07-23 at 3.35.00 PM.png Today the price for m1.xlarge instances has been varying dramatically. We should fall back on c1.xlarge (which is more powerful and is cheaper at present) and then on on-demand instances. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4123) The RLE encoding for ORC can be improved
[ https://issues.apache.org/jira/browse/HIVE-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717596#comment-13717596 ] Owen O'Malley commented on HIVE-4123: - {quote} 1) In the current implementation, I kept the delta base field as optional (used only for fixed delta runs) and zigzag encoded the delta blob so that we don't have to deal with sign of the deltas. I can change delta base field to mandatory field to store the base (absolute min) value of delta values and zigzag encode it. With base value and delta base value, we should be able to identify if the sequence is monotonically increasing or decreasing and also we can identify the sign of the delta values. I hope this is what you are looking for. Please correct me if my understanding is wrong. {quote} I think it will be worthwhile always having the delta base and keeping the additional delta as an unsigned remainder. {quote} 2) is there any way we can reuse the Orc's MAJOR and MINOR version as supported in HIVE-4724 to figure out if we need use new integer encoding or old integer encoding? {quote} Yeah, I need to add more framework for that code. I'm leaning toward passing in a factory object that creates the right integer encoder. The RLE encoding for ORC can be improved Key: HIVE-4123 URL: https://issues.apache.org/jira/browse/HIVE-4123 Project: Hive Issue Type: New Feature Components: File Formats Reporter: Owen O'Malley Assignee: Prasanth J Attachments: HIVE-4123.1.git.patch.txt, HIVE-4123.2.git.patch.txt, ORC-Compression-Ratio-Comparison.xlsx The run length encoding of integers can be improved: * tighter bit packing * allow delta encoding * allow longer runs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
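The zigzag encoding discussed above maps signed values onto unsigned ones so that the sign of the deltas need not be handled separately; small-magnitude values (positive or negative) become small unsigned numbers that bit-pack tightly. A minimal self-contained sketch of the standard mapping (the same one protobuf varints use; this is not ORC's actual code):

```java
public class ZigZag {
    // Interleave signed values onto unsigned: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static long encode(long n) {
        return (n << 1) ^ (n >> 63); // arithmetic shift replicates the sign bit
    }

    // Inverse mapping: recover the original signed value.
    static long decode(long z) {
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        System.out.println(encode(-1)); // 1
        System.out.println(encode(1));  // 2
        System.out.println(decode(encode(-123456789L))); // -123456789
    }
}
```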
[jira] [Created] (HIVE-4922) create template for string scalar compared with string column
Eric Hanson created HIVE-4922: - Summary: create template for string scalar compared with string column Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
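The generated scalar-left/column-right classes essentially run a tight loop over a column vector and record the surviving row indices in a selection array. A simplified illustration of that pattern follows; the names and signatures are invented for this sketch and are not Hive's generated code, which also handles nulls and an incoming selection vector:

```java
public class ScalarColumnFilterDemo {

    // Keep the rows of col[0..n) for which (scalar < col[i]) holds,
    // writing their indices into selected and returning the new batch size.
    static int filterScalarLessColumn(long scalar, long[] col, int n, int[] selected) {
        int newSize = 0;
        for (int i = 0; i < n; i++) {
            if (scalar < col[i]) {
                selected[newSize++] = i; // row i survives the filter
            }
        }
        return newSize;
    }

    public static void main(String[] args) {
        long[] col = { 1, 5, 3, 9 };
        int[] sel = new int[col.length];
        int size = filterScalarLessColumn(3, col, col.length, sel);
        System.out.println(size); // 2 (rows 1 and 3 survive)
    }
}
```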
[jira] [Work started] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4922 started by Eric Hanson. create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717628#comment-13717628 ] Eric Hanson commented on HIVE-4922: --- I verified that comparisons with a scalar on the left and a column on the right run end-to-end, using these queries, which all run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' = l_shipmode; select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' l_shipmode; select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' = l_shipmode; select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' = l_shipmode; select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' l_shipmode; select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'MAIL' l_shipmode; create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4922: -- Status: Patch Available (was: In Progress) create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4922.1.patch Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4922: -- Attachment: HIVE-4922.1.patch create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4922.1.patch Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717630#comment-13717630 ] Eric Hanson commented on HIVE-4922: --- This patch was created after the following patches were applied, so to be safe, it should wait to go in until they've been applied first. HIVE-4684.3.patch HIVE-4884-.patch Hive-4909.0.patch create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4922.1.patch Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4922) create template for string scalar compared with string column
[ https://issues.apache.org/jira/browse/HIVE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717654#comment-13717654 ] Hive QA commented on HIVE-4922: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593776/HIVE-4922.1.patch Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/153/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/153/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-153/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1506302. At revision 1506302. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. create template for string scalar compared with string column - Key: HIVE-4922 URL: https://issues.apache.org/jira/browse/HIVE-4922 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4922.1.patch Create a template to generate classes to handle comparisons with a scalar on the left and a column on the right. This allows queries similar to the following to run vectorized: select l_orderkey, l_shipmode from lineitem_orc where l_orderkey = 1 and 'M' l_shipmode; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4624) Integrate Vectorzied Substr into Vectorized QE
[ https://issues.apache.org/jira/browse/HIVE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717657#comment-13717657 ] Eric Hanson commented on HIVE-4624: --- Tim will try to get this done by Thurs or else explicitly give it to somebody else to finish. Integrate Vectorzied Substr into Vectorized QE -- Key: HIVE-4624 URL: https://issues.apache.org/jira/browse/HIVE-4624 Project: Hive Issue Type: Sub-task Reporter: Timothy Chen Assignee: Timothy Chen Need to hook up the Vectorized Substr directly into Hive Vectorized QE so it can be leveraged. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-4512) The vectorized plan is not picking right expression class for string concatenation.
[ https://issues.apache.org/jira/browse/HIVE-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-4512: - Assignee: Eric Hanson (was: Jitendra Nath Pandey) The vectorized plan is not picking right expression class for string concatenation. --- Key: HIVE-4512 URL: https://issues.apache.org/jira/browse/HIVE-4512 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Eric Hanson The vectorized plan is not picking right expression class for string concatenation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12824/#review23711 --- data/conf/hive-site.xml https://reviews.apache.org/r/12824/#comment47589 This change should go into conf/hive-default.xml.template . data/conf/hive-site.xml is meant to be used for overriding config parameters for the tests. In this case, as the default value is being used, this file does not need changing. jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/12824/#comment47597 the HIVE_AUTH_TYPE env variable is called auth. Should we use something more descriptive like sasl.qop as the variable that sets the QOP level. jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/12824/#comment47590 It is a good general practice to chain the exceptions. - throw new SQLException("Invalid " + HIVE_AUTH_TYPE + " parameter. " + e.getMessage(), "42000", e); service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java https://reviews.apache.org/r/12824/#comment47596 I think hadoop.rpc.protection being set to a higher level than hive.server2.thrift.rpc.protection does not make sense in most situations (you would want to have more security in the transport that is likely to be more insecure. The HS2 - client transport could be over a corporate-wide wi-fi network) Should we warn if such a configuration is seen ? shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java https://reviews.apache.org/r/12824/#comment47595 This function is called from hive metastore client. Using SaslRpcServer.SASL_PROPS here means that setting hadoop.rpc.protection will determine the QOP level, if we make a call to SaslRpcServer.init(conf) from anywhere in the code. But that function is not being called. I think it makes sense to use hadoop.rpc.protection for metastore QOP, since the metastore is usually not exposed 'outside' the cluster unlike hive server2. It is often viewed as something 'inside the cluster'.
Should we change this function to take in a configuration object and use that to call SaslRpcServer.init(conf) ? - Thejas Nair On July 22, 2013, 8:56 p.m., Arup Malakar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12824/ --- (Updated July 22, 2013, 8:56 p.m.) Review request for hive. Bugs: HIVE-4911 https://issues.apache.org/jira/browse/HIVE-4911 Repository: hive-git Description --- The QoP for hive server 2 should be configurable to enable encryption. A new configuration should be exposed hive.server2.thrift.rpc.protection. This would give greater control configuring hive server 2 service. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 11c31216495d0c4e454f2627af5c93a9f270b1fe data/conf/hive-site.xml 4e6ff16135833da1a4df12a12a6fe59ad4f870ba jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 00f43511b478c687b7811fc8ad66af2b507a3626 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 1809e1b26ceee5de14a354a0e499aa8c0ab793bf service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb8377aed55e74f0ae18407996bb9e1216f service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 777226f8da0af2235d4294cd6a676fa8192c89e4 shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 9b0ec0a75563b41339e6fc747556440fdf83e31e Diff: https://reviews.apache.org/r/12824/diff/ Testing --- Thanks, Arup Malakar
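For context on the QOP levels this review discusses: SASL defines three protection levels (auth = authentication only, auth-int = plus integrity checking, auth-conf = plus confidentiality/encryption), and a client or server selects among them through the standard javax.security.sasl property keys. A minimal JDK-only sketch of building such a properties map (how HiveServer2 actually wires these into Thrift is defined by the patch, not shown here):

```java
import javax.security.sasl.Sasl;
import java.util.HashMap;
import java.util.Map;

public class QopPropsDemo {

    // Build a SASL properties map for the given QOP preference list,
    // e.g. "auth-conf,auth-int,auth" to prefer encryption but allow fallback,
    // or just "auth-conf" to require it.
    static Map<String, String> saslProps(String qop) {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, qop);            // "javax.security.sasl.qop"
        props.put(Sasl.SERVER_AUTH, "true"); // require mutual authentication
        return props;
    }

    public static void main(String[] args) {
        System.out.println(saslProps("auth-conf,auth-int,auth"));
    }
}
```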
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717661#comment-13717661 ] Gunther Hagleitner commented on HIVE-4825: -- Surprised to see this hcat test fail - shouldn't be affected by the changes. Ran it in isolation a few times and also ran all hcat tests combined. Couldn't reproduce the issue. Fluke? Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717667#comment-13717667 ] Brock Noland commented on HIVE-4825: Yes I believe that one is flaky. I added it to HIVE-4851 Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4851) Fix flaky tests
[ https://issues.apache.org/jira/browse/HIVE-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4851: --- Description: I see the following tests fail quite often: * TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 * TestOrcHCatLoader.testReadDataBasic * TestMinimrCliDriver.testCliDriver_bucketmpjoin6 * TestNotificationListener.testAMQListener This one is less often, but still fails randomly: * TestMinimrCliDriver.testCliDriver_bucket4 * TestHCatHiveCompatibility.testUnpartedReadWrite * TestHCatLoader.testReadPartitionedBasic * TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat * TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask was: I see the following tests fail quite often: * TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 * TestOrcHCatLoader.testReadDataBasic * TestMinimrCliDriver.testCliDriver_bucketmpjoin6 * TestNotificationListener.testAMQListener This one is less often, but still fails randomly: * TestMinimrCliDriver.testCliDriver_bucket4 * TestHCatHiveCompatibility.testUnpartedReadWrite * TestHCatLoader.testReadPartitionedBasic * TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat Fix flaky tests --- Key: HIVE-4851 URL: https://issues.apache.org/jira/browse/HIVE-4851 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland I see the following tests fail quite often: * TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 * TestOrcHCatLoader.testReadDataBasic * TestMinimrCliDriver.testCliDriver_bucketmpjoin6 * TestNotificationListener.testAMQListener This one is less often, but still fails randomly: * TestMinimrCliDriver.testCliDriver_bucket4 * TestHCatHiveCompatibility.testUnpartedReadWrite * TestHCatLoader.testReadPartitionedBasic * TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat * TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask 
Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12824/#review23722 --- common/src/java/org/apache/hadoop/hive/conf/HiveConf.java https://reviews.apache.org/r/12824/#comment47598 should we just call this hive.server2.thrift.sasl.qop ? That seems more self describing. - Thejas Nair On July 22, 2013, 8:56 p.m., Arup Malakar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/12824/ --- (Updated July 22, 2013, 8:56 p.m.) Review request for hive. Bugs: HIVE-4911 https://issues.apache.org/jira/browse/HIVE-4911 Repository: hive-git Description --- The QoP for hive server 2 should be configurable to enable encryption. A new configuration should be exposed hive.server2.thrift.rpc.protection. This would give greater control configuring hive server 2 service. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 11c31216495d0c4e454f2627af5c93a9f270b1fe data/conf/hive-site.xml 4e6ff16135833da1a4df12a12a6fe59ad4f870ba jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 00f43511b478c687b7811fc8ad66af2b507a3626 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 1809e1b26ceee5de14a354a0e499aa8c0ab793bf service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb8377aed55e74f0ae18407996bb9e1216f service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 777226f8da0af2235d4294cd6a676fa8192c89e4 shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 9b0ec0a75563b41339e6fc747556440fdf83e31e Diff: https://reviews.apache.org/r/12824/diff/ Testing --- Thanks, Arup Malakar
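For reference, the kind of self-describing QOP mapping under discussion (a SaslQOP.java file appears as PRE-CREATION in the diff) can be sketched like this — a hypothetical, simplified version, not the actual patch contents. The three string values are the standard javax.security.sasl QOP levels:

```java
/** Hypothetical sketch: a config value such as "auth-conf" resolved to the
    string passed as javax.security.sasl.Sasl.QOP. Names are illustrative. */
public enum SaslQop {
  AUTH("auth"),           // authentication only
  AUTH_INT("auth-int"),   // authentication plus integrity protection
  AUTH_CONF("auth-conf"); // authentication plus confidentiality (encryption)

  public final String saslQop;

  SaslQop(String saslQop) { this.saslQop = saslQop; }

  public static SaslQop fromString(String s) {
    for (SaslQop q : values()) {
      if (q.saslQop.equalsIgnoreCase(s)) {
        return q;
      }
    }
    throw new IllegalArgumentException("Unknown SASL QOP level: " + s);
  }
}
```

A server would then put the resolved value into the SASL properties map under Sasl.QOP before opening the Thrift transport.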
[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport
[ https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717669#comment-13717669 ] Thejas M Nair commented on HIVE-4911: - [~amalakar] I added some review comments in the review board link. +1 for having a separate config flag that enables the QOP for hive server2. The HS2 - client connection is usually more vulnerable compared to the network traffic within a hadoop cluster, as the HS2 client is likely to be connecting over a corporate wide network. [~brocknoland] The patch would not work for HMS; that would need some more changes. (added a comment about that in review). But I am not sure if that needs to be part of the same jira. I don't think it makes sense to use the same config param to set the SASL QOP level for the metastore. Should we just use hadoop.rpc.protection for that, as it is usually considered 'inside the cluster' (as opposed to HS2, which is like a 'gateway server')?

Enable QOP configuration for Hive Server 2 thrift transport --- Key: HIVE-4911 URL: https://issues.apache.org/jira/browse/HIVE-4911 Project: Hive Issue Type: New Feature Reporter: Arup Malakar Assignee: Arup Malakar Attachments: HIVE-4911-trunk-0.patch The QoP for hive server 2 should be configurable to enable encryption. A new configuration should be exposed: hive.server2.thrift.rpc.protection. This would give greater control configuring the hive server 2 service. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4915) unit tests fail on windows because of difference in input file size
[ https://issues.apache.org/jira/browse/HIVE-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4915: Attachment: HIVE-4915.1.patch HIVE-4915.1.patch - update .gitattributes file to always checkout the test .dat files with unix style newlines unit tests fail on windows because of difference in input file size --- Key: HIVE-4915 URL: https://issues.apache.org/jira/browse/HIVE-4915 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4915.1.patch Several qfile based tests fail on windows because in the output of explain extended, the total file size of input files shown is different on windows. This is because by default text files on windows are checked out with two char line endings, and *.dat files used as input files for the tables are considered as text files. So for every line in the .dat file, the size of the file is larger by 1 byte on windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
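The .gitattributes approach described in the patch summary typically amounts to a rule like the following — the actual patch contents aren't shown here, so treat this as an illustrative sketch:

```
# force LF newlines for test data files, so the input file sizes reported
# by "explain extended" match across Windows and Unix checkouts
*.dat text eol=lf
```

With `eol=lf`, git normalizes the files to unix newlines in the working tree regardless of the platform's `core.autocrlf` setting, so each line is one byte shorter than a CRLF checkout would be.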
[jira] [Updated] (HIVE-4915) unit tests fail on windows because of difference in input file size
[ https://issues.apache.org/jira/browse/HIVE-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4915: Status: Patch Available (was: Open) unit tests fail on windows because of difference in input file size --- Key: HIVE-4915 URL: https://issues.apache.org/jira/browse/HIVE-4915 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4915.1.patch Several qfile based tests fail on windows because in the output of explain extended, the total file size of input files shown is different on windows. This is because by default text files on windows are checked out with two char line endings, and *.dat files used as input files for the tables are considered as text files. So for every line in the .dat file, the size of the file is larger by 1 byte on windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717688#comment-13717688 ] Brock Noland commented on HIVE-4388: Facts: * HBase only plans on publishing 0.95/0.96 hadoop2 artifacts * HBase 0.95/0.96 makes backwards incompatible changes * HBase 0.95/0.96 changes the coprocessor interface dramatically Based on these facts I feel it would be very difficult to support both 0.94 and 0.95/0.96 with the same source code. I see two options: # Move the hbase stuff in a versioned module similar to the hadoop shim # Upgrade trunk to 0.96 I propose we upgrade trunk to 0.95/0.96 and move on with our lives. Supporting two versions of hbase in addition to three versions of hadoop is going to be ugly quick. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Brock Noland Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4808) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
[ https://issues.apache.org/jira/browse/HIVE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4808: - Attachment: HIVE-4808.1.patch Added a test for this case. Ran Templeton e2e tests. fork.factor.group=3 and fork.factor.conf.file=6 the suite runs in 11 minutes. Added support for timeout_seconds property in .conf files to specify custom timeout. WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly --- Key: HIVE-4808 URL: https://issues.apache.org/jira/browse/HIVE-4808 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4808.1.patch, HIVE-4808.patch (set mapred.task.timeout=7) curl -i -d user.name=ekoifman \ -d jar=/user/ekoifman/webhcate2e/hexamples.jar \ -d class=sleep \ -d arg=-mt \ -d arg=5 \ -d statusdir=/tmp \ 'http://localhost:50111/templeton/v1/mapreduce/jar' The TempletonControllerJob gets retried 4 times (Thus there are 4 SleepJob invocations) with message that it was killed due to inactivity. hexamples.jar = hadoop-examples-*.jar -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717703#comment-13717703 ] Ashutosh Chauhan commented on HIVE-4838: [~brocknoland] One of the items listed in the description is: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. Have you attacked this in this patch? If yes, how did you fix it? I haven't dived into the patch to figure that out yet.

Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many unused public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be separated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4831) QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests
[ https://issues.apache.org/jira/browse/HIVE-4831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4831: Status: Patch Available (was: Open) QTestUtil based test exiting abnormally on windows fails startup of other QTestUtil tests - Key: HIVE-4831 URL: https://issues.apache.org/jira/browse/HIVE-4831 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4831.1.patch, HIVE-4831.2.patch QTestUtil tests start mini zookeeper cluster. If it exits abnormally (eg timeout), it fails to stop the zookeeper mini cluster. On Windows when the process is still running the files can't be deleted, and as a result the new zookeeper cluster started by a new QFileUtil based test case fails to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717707#comment-13717707 ] Ashutosh Chauhan commented on HIVE-4388: I am also of same opinion, lets move forward and upgrade trunk to 0.96. HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Brock Noland Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717716#comment-13717716 ] Brock Noland commented on HIVE-4838: Hey, yes I have. I'll upload an updated patch here in a few minutes. The current code uses this static state because with java serialization there is no way to pass any context information down to the class when the read/write methods are being called. In the new patch I define my own read/write methods (example below). {noformat} public void read(MapJoinObjectSerDeContext context, ObjectInputStream in, Writable container) throws IOException, SerDeException { {noformat} and use those to serialize/deserialize the objects. Specifically, in the new patch MapJoinRowContainer.read/write, MapJoinTableContainerSerDe.load/persist and MapJoinKey.read/write will be interesting.

Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing its age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many unused public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. 
Ideally these logical units could be separated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
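The pattern Brock describes — threading an explicit context object through read/write instead of relying on static state — can be sketched roughly like this. The class names are simplified stand-ins, not the actual Hive types (MapJoinObjectSerDeContext etc.):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Stand-in for serialization metadata that used to live in static state. */
class SerDeContext {
  final boolean hasFilter;
  SerDeContext(boolean hasFilter) { this.hasFilter = hasFilter; }
}

/** A row that serializes itself using the context it is handed. */
class Row {
  int key;

  void write(SerDeContext ctx, ObjectOutputStream out) throws IOException {
    out.writeInt(key);
    if (ctx.hasFilter) {
      out.writeShort(0); // filter tag written only when the context says so
    }
  }

  void read(SerDeContext ctx, ObjectInputStream in) throws IOException {
    key = in.readInt();
    if (ctx.hasFilter) {
      in.readShort(); // consume the filter tag symmetrically
    }
  }
}
```

Because the context is an argument rather than a global, two tables with different serialization settings can be loaded in the same JVM without stepping on each other, and the classes become unit-testable in isolation.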
[jira] [Updated] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4838: --- Attachment: HIVE-4838.patch Rebased after HIVE-4845 was committed. Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing it's age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many used public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be seperated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability
[ https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717719#comment-13717719 ] Brock Noland commented on HIVE-4838: Updated review https://reviews.facebook.net/D11679 Refactor MapJoin HashMap code to improve testability and readability Key: HIVE-4838 URL: https://issues.apache.org/jira/browse/HIVE-4838 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch MapJoin is an essential component for high performance joins in Hive and the current code has done great service for many years. However, the code is showing it's age and currently suffers from the following issues: * Uses static state via the MapJoinMetaData class to pass serialization metadata to the Key, Row classes. * The api of a logical Table Container is not defined and therefore it's unclear what apis HashMapWrapper needs to publicize. Additionally HashMapWrapper has many used public methods. * HashMapWrapper contains logic to serialize, test memory bounds, and implement the table container. Ideally these logical units could be seperated * HashTableSinkObjectCtx has unused fields and unused methods * CommonJoinOperator and children use ArrayList on left hand side when only List is required * There are unused classes MRU, DCLLItemm and classes which duplicate functionality MapJoinSingleKey and MapJoinDoubleKeys -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4822) implement vectorized math functions
[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4822: -- Attachment: HIVE-4822.5.patch

implement vectorized math functions --- Key: HIVE-4822 URL: https://issues.apache.org/jira/browse/HIVE-4822 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, HIVE-4822.5.patch Implement vectorized support for all the built-in math functions. This includes implementing the vectorized operation, and tying it all together in VectorizationContext so it runs end-to-end. These functions include: round(Col) Round(Col, N) Floor(Col) Ceil(Col) Rand(), Rand(seed) Exp(Col) Ln(Col) Log10(Col) Log2(Col) Log(base, Col) Pow(col, p), Power(col, p) Sqrt(Col) Bin(Col) Hex(Col) Unhex(Col) Conv(Col, from_base, to_base) Abs(Col) Pmod(arg1, arg2) Sin(Col) Asin(Col) Cos(Col) ACos(Col) Atan(Col) Degrees(Col) Radians(Col) Positive(Col) Negative(Col) Sign(Col) E() Pi() To reduce the total code volume, do an implicit type cast from non-double input types to double. Also, POSITIVE and NEGATIVE are syntactic sugar for unary + and unary -, so reuse code for those as appropriate. Try to call the function directly in the inner loop and avoid new() or expensive operations, as appropriate. Templatize the code where appropriate, e.g. all the unary functions of form DOUBLE func(DOUBLE) can probably be done with a template. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
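The "call the function directly in the inner loop" guidance amounts to a per-function expression class with a tight loop over the column array — the shape a template can stamp out once per unary function (sin, sqrt, exp, ...). A stripped-down sketch, not the actual vectorization-branch classes (no null/isRepeating handling here):

```java
/** Minimal stand-in for a vectorized double column. */
class DoubleColumnVector {
  final double[] vector;
  DoubleColumnVector(int n) { vector = new double[n]; }
}

/** One generated expression: DOUBLE -> DOUBLE applied row by row. */
class FuncSinDoubleToDouble {
  void evaluate(DoubleColumnVector in, DoubleColumnVector out, int n) {
    // Math.sin is invoked directly in the inner loop; no per-row
    // allocation or virtual dispatch beyond this one call
    for (int i = 0; i < n; i++) {
      out.vector[i] = Math.sin(in.vector[i]);
    }
  }
}
```

The real classes additionally short-circuit on `isRepeating` and skip null entries; the hot path above is what the template varies per function.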
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717747#comment-13717747 ] Phabricator commented on HIVE-3926: --- hagleitn has commented on the revision HIVE-3926 [jira] PPD on virtual column of partitioned table is not working. Some minor comments + request for more info.

INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/Driver.java:600 tableScanOperator was actually a clearer name, wasn't it?
ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java:108 I think this is redundant. o instanceof MapInputPath is false if o == null
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:134 can you please add a javadoc comment here?
ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java:18 There are a lot of changes in MapOperator. I've scanned through them and it seems most (all?) are just cleaning up the operator (which is great). Is that correct? If not can you please point out what the important changes are?

REVISION DETAIL https://reviews.facebook.net/D8121 To: JIRA, navis Cc: hagleitn

PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE > 100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE > 100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
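The MapOperator.java:108 comment is easy to verify: `instanceof` already evaluates to false when its left operand is null, so a preceding `o != null` check adds nothing. A tiny demo — the method below is a hypothetical stand-in, with String in place of MapInputPath:

```java
public class InstanceofNull {
  // stands in for `o instanceof MapInputPath` in the patch under review
  static boolean isMapInputPathLike(Object o) {
    return o instanceof String;
  }

  public static void main(String[] args) {
    System.out.println(isMapInputPathLike(null));   // false, and no NPE
    System.out.println(isMapInputPathLike("path")); // true
  }
}
```

This is specified behavior of the `instanceof` operator, not an implementation quirk, which is why the extra null guard is safely removable.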
[jira] [Updated] (HIVE-4822) implement vectorized math functions
[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4822: -- Attachment: (was: HIVE-4822.5.patch) implement vectorized math functions --- Key: HIVE-4822 URL: https://issues.apache.org/jira/browse/HIVE-4822 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, HIVE-4822.5-vectorization.patch Implement vectorized support for the all the built-in math functions. This includes implementing the vectorized operation, and tying it all together in VectorizationContext so it runs end-to-end. These functions include: round(Col) Round(Col, N) Floor(Col) Ceil(Col) Rand(), Rand(seed) Exp(Col) Ln(Col) Log10(Col) Log2(Col) Log(base, Col) Pow(col, p), Power(col, p) Sqrt(Col) Bin(Col) Hex(Col) Unhex(Col) Conv(Col, from_base, to_base) Abs(Col) Pmod(arg1, arg2) Sin(Col) Asin(Col) Cos(Col) ACos(Col) Atan(Col) Degrees(Col) Radians(Col) Positive(Col) Negative(Col) Sign(Col) E() Pi() To reduce the total code volume, do an implicit type cast from non-double input types to double. Also, POSITITVE and NEGATIVE are syntactic sugar for unary + and unary -, so reuse code for those as appropriate. Try to call the function directly in the inner loop and avoid new() or expensive operations, as appropriate. Templatize the code where appropriate, e.g. all the unary function of form DOUBLE func(DOUBLE) can probably be done with a template. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4822) implement vectorized math functions
[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4822: -- Attachment: HIVE-4822.5-vectorization.patch implement vectorized math functions --- Key: HIVE-4822 URL: https://issues.apache.org/jira/browse/HIVE-4822 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, HIVE-4822.5-vectorization.patch Implement vectorized support for the all the built-in math functions. This includes implementing the vectorized operation, and tying it all together in VectorizationContext so it runs end-to-end. These functions include: round(Col) Round(Col, N) Floor(Col) Ceil(Col) Rand(), Rand(seed) Exp(Col) Ln(Col) Log10(Col) Log2(Col) Log(base, Col) Pow(col, p), Power(col, p) Sqrt(Col) Bin(Col) Hex(Col) Unhex(Col) Conv(Col, from_base, to_base) Abs(Col) Pmod(arg1, arg2) Sin(Col) Asin(Col) Cos(Col) ACos(Col) Atan(Col) Degrees(Col) Radians(Col) Positive(Col) Negative(Col) Sign(Col) E() Pi() To reduce the total code volume, do an implicit type cast from non-double input types to double. Also, POSITITVE and NEGATIVE are syntactic sugar for unary + and unary -, so reuse code for those as appropriate. Try to call the function directly in the inner loop and avoid new() or expensive operations, as appropriate. Templatize the code where appropriate, e.g. all the unary function of form DOUBLE func(DOUBLE) can probably be done with a template. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4822) implement vectorized math functions
[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717753#comment-13717753 ] Eric Hanson commented on HIVE-4822: --- I updated this patch with a small bug fix to call initBuffer() on string output vectors. I also updated the unit test accordingly. implement vectorized math functions --- Key: HIVE-4822 URL: https://issues.apache.org/jira/browse/HIVE-4822 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch, HIVE-4822.5-vectorization.patch Implement vectorized support for the all the built-in math functions. This includes implementing the vectorized operation, and tying it all together in VectorizationContext so it runs end-to-end. These functions include: round(Col) Round(Col, N) Floor(Col) Ceil(Col) Rand(), Rand(seed) Exp(Col) Ln(Col) Log10(Col) Log2(Col) Log(base, Col) Pow(col, p), Power(col, p) Sqrt(Col) Bin(Col) Hex(Col) Unhex(Col) Conv(Col, from_base, to_base) Abs(Col) Pmod(arg1, arg2) Sin(Col) Asin(Col) Cos(Col) ACos(Col) Atan(Col) Degrees(Col) Radians(Col) Positive(Col) Negative(Col) Sign(Col) E() Pi() To reduce the total code volume, do an implicit type cast from non-double input types to double. Also, POSITITVE and NEGATIVE are syntactic sugar for unary + and unary -, so reuse code for those as appropriate. Try to call the function directly in the inner loop and avoid new() or expensive operations, as appropriate. Templatize the code where appropriate, e.g. all the unary function of form DOUBLE func(DOUBLE) can probably be done with a template. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4808) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
[ https://issues.apache.org/jira/browse/HIVE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4808: - Attachment: HIVE-4808.1.patch WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly --- Key: HIVE-4808 URL: https://issues.apache.org/jira/browse/HIVE-4808 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4808.1.patch, HIVE-4808.1.patch, HIVE-4808.patch (set mapred.task.timeout=7) curl -i -d user.name=ekoifman \ -d jar=/user/ekoifman/webhcate2e/hexamples.jar \ -d class=sleep \ -d arg=-mt \ -d arg=5 \ -d statusdir=/tmp \ 'http://localhost:50111/templeton/v1/mapreduce/jar' The TempletonControllerJob gets retried 4 times (Thus there are 4 SleepJob invocations) with message that it was killed due to inactivity. hexamples.jar = hadoop-examples-*.jar -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
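The failure mode above (the controller job blocks on a child job without reporting progress, so the TaskTracker kills it once mapred.task.timeout elapses) is typically addressed with a keep-alive thread that reports progress periodically. The sketch below is a generic illustration of that pattern, not the actual HIVE-4808 patch; `Progressable` here is a local stand-in for Hadoop's reporter interface.

```java
// Generic keep-alive pattern: a daemon thread reports progress while the main
// thread blocks on long-running work, so the framework sees activity and does
// not kill the task for inactivity. Not the actual HIVE-4808 patch;
// Progressable is a stand-in for Hadoop's org.apache.hadoop.util.Progressable.
public class KeepAlive {
    public interface Progressable { void progress(); }

    // Start a daemon thread that calls progress() every intervalMs until
    // interrupted by the caller when the real work finishes.
    public static Thread start(Progressable reporter, long intervalMs) {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    reporter.progress();
                    Thread.sleep(intervalMs);
                }
            } catch (InterruptedException e) {
                // stop quietly when the main work completes
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        final int[] beats = {0};
        Thread t = start(() -> beats[0]++, 10);
        Thread.sleep(100);        // simulate the blocking child job
        t.interrupt();
        System.out.println("heartbeats sent: " + beats[0]);
    }
}
```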
[jira] [Updated] (HIVE-4808) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
[ https://issues.apache.org/jira/browse/HIVE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4808: - Attachment: (was: HIVE-4808.1.patch) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly --- Key: HIVE-4808 URL: https://issues.apache.org/jira/browse/HIVE-4808 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4808.1.patch, HIVE-4808.patch (set mapred.task.timeout=7) curl -i -d user.name=ekoifman \ -d jar=/user/ekoifman/webhcate2e/hexamples.jar \ -d class=sleep \ -d arg=-mt \ -d arg=5 \ -d statusdir=/tmp \ 'http://localhost:50111/templeton/v1/mapreduce/jar' The TempletonControllerJob gets retried 4 times (Thus there are 4 SleepJob invocations) with message that it was killed due to inactivity. hexamples.jar = hadoop-examples-*.jar -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4222) Timestamp type constants cannot be deserialized in JDK 1.6 or less
[ https://issues.apache.org/jira/browse/HIVE-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717782#comment-13717782 ] Hive QA commented on HIVE-4222: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593736/HIVE-4222.D9681.3.patch {color:green}SUCCESS:{color} +1 2648 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/154/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/154/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. Timestamp type constants cannot be deserialized in JDK 1.6 or less -- Key: HIVE-4222 URL: https://issues.apache.org/jira/browse/HIVE-4222 Project: Hive Issue Type: Bug Components: Types Reporter: Navis Assignee: Navis Attachments: HIVE-4222.D9681.1.patch, HIVE-4222.D9681.2.patch, HIVE-4222.D9681.3.patch For example, {noformat} ExprNodeConstantDesc constant = new ExprNodeConstantDesc(TypeInfoFactory.timestampTypeInfo, new Timestamp(100)); String serialized = Utilities.serializeExpression(constant); ExprNodeConstantDesc deserilized = (ExprNodeConstantDesc) Utilities.deserializeExpression(serialized, new Configuration()); {noformat} logs error message {noformat} java.lang.InstantiationException: java.sql.Timestamp Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... {noformat} and makes NPE in final. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4808) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly
[ https://issues.apache.org/jira/browse/HIVE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4808: - Status: Patch Available (was: Open) WebHCat job submission is killed by TaskTracker since it's not sending a heartbeat properly --- Key: HIVE-4808 URL: https://issues.apache.org/jira/browse/HIVE-4808 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.12.0 Attachments: HIVE-4808.1.patch, HIVE-4808.patch (set mapred.task.timeout=7) curl -i -d user.name=ekoifman \ -d jar=/user/ekoifman/webhcate2e/hexamples.jar \ -d class=sleep \ -d arg=-mt \ -d arg=5 \ -d statusdir=/tmp \ 'http://localhost:50111/templeton/v1/mapreduce/jar' The TempletonControllerJob gets retried 4 times (Thus there are 4 SleepJob invocations) with message that it was killed due to inactivity. hexamples.jar = hadoop-examples-*.jar -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4825) Separate MapredWork into MapWork and ReduceWork
[ https://issues.apache.org/jira/browse/HIVE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717813#comment-13717813 ] Edward Capriolo commented on HIVE-4825: --- Huge effort. I do see what you're saying. The win is nice in that something is either MapWork or ReduceWork, and some classes no longer need to redundantly set reduce tasks to 0 when they run on the map side. Even though this patch touches many files, it pretty much touches them all in a small way that should not be too much trouble for anyone to deal with. I am +1; most of the changes this would cause would be cosmetic. I'm only trying to look out for things that Navis and Yin have on the queue. Separate MapredWork into MapWork and ReduceWork --- Key: HIVE-4825 URL: https://issues.apache.org/jira/browse/HIVE-4825 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Minor Attachments: HIVE-4825.1.patch, HIVE-4825.2.code.patch, HIVE-4825.2.testfiles.patch, HIVE-4825.3.testfiles.patch, HIVE-4825.4.patch Right now all the information needed to run an MR job is captured in MapredWork. This class has aliases, tagging info, table descriptors etc. For Tez and MRR it will be useful to break this into map and reduce specific pieces. The separation is natural and I think has value in itself, it makes the code easier to understand. However, it will also allow us to reuse these abstractions in Tez where you'll have a graph of these instead of just 1M and 0-1R. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
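The "1M and 0-1R" shape described in the issue can be pictured with skeletal classes. These are illustrative only; the field names are hypothetical and do not reflect Hive's actual MapredWork members.

```java
// Skeleton of the HIVE-4825 idea: map-specific and reduce-specific state
// split out of one monolithic MapredWork. Field names are illustrative,
// not Hive's actual members.
import java.util.HashMap;
import java.util.Map;

public class WorkSplit {
    // Map-side state: aliases, table descriptors, path-to-alias mappings, ...
    public static class MapWork {
        public final Map<String, String> aliasToWork = new HashMap<>();
    }

    // Reduce-side state: tagging info, number of reducers, ...
    public static class ReduceWork {
        public int numReduceTasks;
    }

    // An MR job is one MapWork plus an optional ReduceWork (the "1M and 0-1R"
    // shape); a Tez DAG could instead hold a graph of these vertices.
    public static class MapredWork {
        public final MapWork mapWork = new MapWork();
        public ReduceWork reduceWork;   // null for map-only jobs
        public boolean isMapOnly() { return reduceWork == null; }
    }

    public static void main(String[] args) {
        // A map-only job no longer needs to redundantly set reduce tasks to 0;
        // it simply has no ReduceWork at all.
        MapredWork mapOnly = new MapredWork();
        System.out.println(mapOnly.isMapOnly());
    }
}
```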
[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working
[ https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717828#comment-13717828 ] Phabricator commented on HIVE-3926: --- navis has commented on the revision HIVE-3926 [jira] PPD on virtual column of partitioned table is not working. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/Driver.java:600 It seems I changed this to avoid exceeding 100 characters on line 620. I'll revert this. ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java:18 I should say that this patch was in patch-available state for too long (I forgot what this was). But at first look, this is much shorter than the current code and I like that. I'll add comments after reading this. ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java:108 This has been there since a long time ago. I'll remove that. ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:134 Ok, sure. REVISION DETAIL https://reviews.facebook.net/D8121 To: JIRA, navis Cc: hagleitn PPD on virtual column of partitioned table is not working - Key: HIVE-3926 URL: https://issues.apache.org/jira/browse/HIVE-3926 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3926.D8121.1.patch, HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch {code} select * from src where BLOCK__OFFSET__INSIDE__FILE<100; {code} is working, but {code} select * from srcpart where BLOCK__OFFSET__INSIDE__FILE<100; {code} throws SemanticException. Disabling PPD makes it work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-4642: - Attachment: HIVE-4642.3.patch.txt Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
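Optimizing RLIKE/REGEXP "at least for the common cases" usually means dispatching on the pattern's shape (prefix, suffix, contains) before falling back to the full regex engine. The checker below is an assumed illustration of that idea, not the code in the attached HIVE-4642 patches.

```java
// Illustrative sketch: pick a cheap matcher for common pattern shapes and fall
// back to java.util.regex only for general patterns. This mirrors the
// "optimize the common cases" goal; it is not the code in HIVE-4642's patch.
import java.util.regex.Pattern;

public class LikeDispatch {
    // Dispatch on pattern shape: "abc.*" -> prefix match, ".*abc" -> suffix
    // match, ".*abc.*" -> substring match; anything else -> full regex.
    public static boolean matches(String value, String pattern) {
        if (pattern.length() >= 4 && pattern.startsWith(".*") && pattern.endsWith(".*")) {
            String middle = pattern.substring(2, pattern.length() - 2);
            if (isLiteral(middle)) {
                return value.contains(middle);
            }
        }
        if (pattern.endsWith(".*") && isLiteral(pattern.substring(0, pattern.length() - 2))) {
            return value.startsWith(pattern.substring(0, pattern.length() - 2));
        }
        if (pattern.startsWith(".*") && isLiteral(pattern.substring(2))) {
            return value.endsWith(pattern.substring(2));
        }
        return Pattern.matches(pattern, value);   // general fallback
    }

    // A pattern chunk is a plain literal if it contains no regex metacharacters.
    private static boolean isLiteral(String s) {
        return s.chars().noneMatch(c -> "\\.[]{}()*+?^$|".indexOf(c) >= 0);
    }

    public static void main(String[] args) {
        System.out.println(matches("hadoop", "had.*"));   // prefix fast path
    }
}
```

The fast paths avoid compiling and running a regex per row, which matters in a vectorized inner loop over a whole column batch.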
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717829#comment-13717829 ] Teddy Choi commented on HIVE-4642: -- Hello Eric. I uploaded a patch and I will upload its design specification tonight. It has more detailed comments and tests. It also incorporates your review. I'm sorry for being late. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4660) Let there be Tez
[ https://issues.apache.org/jira/browse/HIVE-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717840#comment-13717840 ] Edward Capriolo commented on HIVE-4660: --- Thanks for uploading that. I am still getting up to speed a bit, silly question: I am looking through the Tez source code and attempting to understand its basic optimizations. I am looking at GroupByOrderByMRRTest. /** * Simple example that does a GROUP BY ORDER BY in an MRR job * Consider a query such as * Select DeptName, COUNT(*) as cnt FROM EmployeeTable * GROUP BY DeptName ORDER BY cnt; I notice that this test essentially runs the job with a single reducer. job.setNumReduceTasks(1); /** * Shuffle ensures ordering based on count of employees per department * hence the final reducer is a no-op and just emits the department name * with the employee count per department. */ What mechanism makes the above optimization happen? Do all shuffles have a natural total order sort with Tez? Let there be Tez Key: HIVE-4660 URL: https://issues.apache.org/jira/browse/HIVE-4660 Project: Hive Issue Type: New Feature Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Tez is a new application framework built on Hadoop Yarn that can execute complex directed acyclic graphs of general data processing tasks. Here's the project's page: http://incubator.apache.org/projects/tez.html The interesting thing about Tez from Hive's perspective is that it will over time allow us to overcome inefficiencies in query processing due to having to express every algorithm in the map-reduce paradigm. The barrier to entry is pretty low as well: Tez can actually run unmodified MR jobs; but as a first step we can without much trouble start using more of Tez' features by taking advantage of the MRR pattern. MRR simply means that there can be any number of reduce stages following a single map stage - without having to write intermediate results to HDFS and re-read them in a new job. 
This is common when queries require multiple shuffles on keys without correlation (e.g.: join - grp by - window function - order by) For more details see the design doc here: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
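The MRR data flow from GroupByOrderByMRRTest (GROUP BY in the first reduce stage, ORDER BY cnt via the shuffle into the second) can be modeled in-memory. The toy pipeline below illustrates why the query needs two shuffle stages; it is only a model of the data flow, not Tez or Hadoop code.

```java
// Toy in-memory model of the MRR flow in GroupByOrderByMRRTest: the map and
// first reduce do GROUP BY DeptName -> COUNT(*), and the shuffle into the
// second reduce sorts by count, so the final stage is essentially a no-op
// emitter. This models the data flow only; it is not Tez code.
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MrrModel {
    public static List<Map.Entry<String, Long>> deptCountsOrdered(List<String> depts) {
        // Stage 1 (map + first reduce): GROUP BY DeptName -> COUNT(*)
        Map<String, Long> counts = depts.stream()
            .collect(Collectors.groupingBy(d -> d, Collectors.counting()));
        // Stage 2 (second shuffle/reduce): ORDER BY cnt; in MR this would be a
        // separate job writing to HDFS in between, in MRR it is one more stage.
        return counts.entrySet().stream()
            .sorted(Map.Entry.comparingByValue())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("eng", "sales", "eng", "hr", "eng", "sales");
        System.out.println(deptCountsOrdered(rows));
    }
}
```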