from:"Vikram Dixit K"


[ 
https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320492#comment-14320492
 ] 

Vikram Dixit K commented on HIVE-9683:
--

+1 for 1.0 branch.

 Hive metastore thrift client connections hang indefinitely
 --

 Key: HIVE-9683
 URL: https://issues.apache.org/jira/browse/HIVE-9683
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0, 1.0.1
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.0.1

 Attachments: HIVE-9683.1.patch


 THRIFT-2788 fixed network-partition problems that affect Thrift client 
 connections.
 Since hive-1.0 is on thrift-0.9.0 which is affected by the bug, a workaround 
 can be applied to prevent indefinite connection hangs during net-splits.
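
 As an illustration of the general idea only (bounding how long a client waits on a silent peer), here is a minimal sketch using plain java.net sockets. It is not the attached patch and does not use Thrift's API; the class name BoundedSocketRead, the method name, and the timeout values are assumptions made for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper for illustration; not Hive or Thrift code.
class BoundedSocketRead {

    // Connects with a bounded connect timeout, then attempts one read with a
    // bounded read timeout. Returns true if the read timed out instead of
    // blocking indefinitely.
    static boolean readTimesOut(InetAddress host, int port,
                                int connectTimeoutMs, int readTimeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            socket.setSoTimeout(readTimeoutMs); // read() throws after this many ms of silence
            try {
                socket.getInputStream().read();
                return false; // the peer answered or closed the connection
            } catch (SocketTimeoutException e) {
                return true;  // the peer stayed silent; the caller can retry or fail fast
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A local listener that never calls accept() and never replies stands in
        // for an unresponsive peer.
        try (ServerSocket silent = new ServerSocket(0, 1, loopback)) {
            boolean timedOut = readTimesOut(loopback, silent.getLocalPort(), 1000, 200);
            System.out.println("read timed out: " + timedOut);
        }
    }
}
```

 Without the setSoTimeout call, the read in this sketch would block for as long as the peer stays silent, which is the kind of hang the issue describes.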



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound


 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Affects Version/s: 1.0.0

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Fix For: 1.2.0

 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/




[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound


 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~apivovarov]!

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/




[jira] [Updated] (HIVE-9523) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied


 [ 
https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9523:
-
Labels: gsoc2015  (was: )

 when columns on which tables are partitioned are used in the join condition 
 same join optimizations as for bucketed tables should be applied
 

 Key: HIVE-9523
 URL: https://issues.apache.org/jira/browse/HIVE-9523
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer, Physical Optimizer, SQL
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Maciek Kocon
  Labels: gsoc2015

 For JOIN conditions where partitioning criteria are used respectively:
 ⋮ 
 FROM TabA JOIN TabB
ON TabA.partCol1 = TabB.partCol2
AND TabA.partCol2 = TabB.partCol2
 the optimizer could/should choose to treat it the same way as with bucketed 
 tables: ⋮ 
 FROM TabC
   JOIN TabD
  ON TabC.clusteredByCol1 = TabD.clusteredByCol2
AND TabC.clusteredByCol2 = TabD.clusteredByCol2
 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join.
 This is based on the fact that, just as buckets translate to separate files, 
 partitions essentially provide the same mapping.
 When data locality is known, the optimizer could focus on joining only the 
 corresponding partitions rather than whole data sets.
 #side notes:
 ⦿ Currently, table DDL syntax that defines partitioning and bucketing at the 
 same time is allowed:
 CREATE TABLE
  ⋮
 PARTITIONED BY(…) CLUSTERED BY(…) INTO … BUCKETS;
 But in this case the optimizer never chooses to use Bucket Map Join or Sort Merge 
 Bucket Map Join, which defeats the purpose of creating BUCKETed tables in such 
 scenarios. Should that be raised as a separate BUG?
 ⦿ Currently partitioning and bucketing are two separate things but serve the same 
 purpose - shouldn't the concepts be merged (explicit/implicit partitions?)
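
 The partition-wise join idea described above can be sketched roughly as follows. This is a toy in-memory model, not Hive optimizer code: the class name PartitionWiseJoin, the representation of a table as a map from partition value to rows, and the output format are all assumptions. Because the join condition uses only the partition column, every row in a partition pairs with every row in the matching partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model for illustration; not Hive code.
class PartitionWiseJoin {

    // Each table is modeled as: partition value -> rows stored in that partition.
    // Only partitions present on both sides are read and joined.
    static List<String> join(Map<String, List<String>> left,
                             Map<String, List<String>> right) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> partition : left.entrySet()) {
            List<String> rightRows = right.get(partition.getKey());
            if (rightRows == null) {
                continue; // no matching partition on the other side: skip it entirely
            }
            for (String leftRow : partition.getValue()) {
                for (String rightRow : rightRows) {
                    joined.add(partition.getKey() + "|" + leftRow + "|" + rightRow);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, List<String>> tabA = new TreeMap<>();
        tabA.put("p1", Arrays.asList("a1", "a2"));
        tabA.put("p2", Arrays.asList("a3"));

        Map<String, List<String>> tabB = new TreeMap<>();
        tabB.put("p1", Arrays.asList("b1"));
        tabB.put("p3", Arrays.asList("b2"));

        // Only partition p1 exists in both tables, so p2 and p3 are never scanned.
        System.out.println(join(tabA, tabB)); // [p1|a1|b1, p1|a2|b1]
    }
}
```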




[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound


 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6069:
-
Fix Version/s: 1.2.0

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 1.0.0
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Fix For: 1.2.0

 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/




[jira] [Created] (HIVE-9687) Blink DB style approximate querying in hive

Vikram Dixit K created HIVE-9687:


 Summary: Blink DB style approximate querying in hive
 Key: HIVE-9687
 URL: https://issues.apache.org/jira/browse/HIVE-9687
 Project: Hive
  Issue Type: New Feature
Reporter: Vikram Dixit K


http://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf

There are various pieces here that need to be thought through and implemented, 
for example offline sampling, a run-time sample selection module, etc.
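
To make the basic idea of sample-based approximate answers concrete, here is a minimal sketch that estimates a SUM from a uniform random sample and scales it to the full data size. It is far simpler than what the linked paper describes and is not a proposed design; the class name ApproximateSum, the method signature, and the choice of uniform sampling with replacement are assumptions for this example.

```java
import java.util.Random;

// Hypothetical illustration of sample-based estimation; not Hive or BlinkDB code.
class ApproximateSum {

    // Draws sampleSize values uniformly at random (with replacement) and scales
    // the sample mean by the population size to estimate the full sum.
    static double estimateSum(double[] values, int sampleSize, long seed) {
        if (values.length == 0 || sampleSize <= 0) {
            throw new IllegalArgumentException("need a non-empty input and a positive sample size");
        }
        Random random = new Random(seed);
        double sampleTotal = 0.0;
        for (int i = 0; i < sampleSize; i++) {
            sampleTotal += values[random.nextInt(values.length)];
        }
        double sampleMean = sampleTotal / sampleSize;
        return sampleMean * values.length;
    }

    public static void main(String[] args) {
        double[] values = new double[100_000];
        double exact = 0.0;
        for (int i = 0; i < values.length; i++) {
            values[i] = i % 100;
            exact += values[i];
        }
        // Reads 1,000 values instead of 100,000; the estimate varies with the seed.
        System.out.println("exact    = " + exact);
        System.out.println("estimate = " + estimateSum(values, 1_000, 42L));
    }
}
```

A real system would also need to report an error bound alongside the estimate and to choose among precomputed samples, which are the pieces the description above says still need to be worked out.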




[jira] [Commented] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-11 Thread Vikram Dixit K (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316923#comment-14316923
 ] 

Vikram Dixit K commented on HIVE-6069:
--

+1 LGTM. I will commit this shortly.

 Improve error message in GenericUDFRound
 

 Key: HIVE-6069
 URL: https://issues.apache.org/jira/browse/HIVE-6069
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Xuefu Zhang
Assignee: Alexander Pivovarov
Priority: Trivial
 Attachments: HIVE-6069.1.patch


 Suggested in HIVE-6039 review board.
 https://reviews.apache.org/r/16329/




Re: [ANNOUNCE] New Hive Committers -- Chao Sun, Chengxiang Li, and Rui Li

2015-02-09 Thread Vikram Dixit K

Congrats guys!

On Mon, Feb 9, 2015 at 12:42 PM, Szehon Ho sze...@cloudera.com wrote:

 Congratulations guys !

 On Mon, Feb 9, 2015 at 3:38 PM, Jimmy Xiang jxi...@cloudera.com wrote:

  Congrats!!
 
  On Mon, Feb 9, 2015 at 12:36 PM, Alexander Pivovarov 
 apivova...@gmail.com
  
  wrote:
 
   Congrats!
  
   On Mon, Feb 9, 2015 at 12:31 PM, Carl Steinbach c...@apache.org
 wrote:
  
   The Apache Hive PMC has voted to make Chao Sun, Chengxiang Li, and Rui
  Li
   committers on the Apache Hive Project.
  
   Please join me in congratulating Chao, Chengxiang, and Rui!
  
   Thanks.
  
   - Carl
  
  
  
 




-- 
Nothing better than when appreciated for hard work.
-Mark

Proposal for having branch committers

2015-02-09 Thread Vikram Dixit K

Hi Folks,

We seem to have quite a few projects going around and in the interest of
time and the project as a whole, it seems good to have branch committers
much like what is there in the Hadoop project. I am proposing an addition
to the committer bylaws as follows (taken from the hadoop project bylaws
http://hadoop.apache.org/bylaws.html):

Significant, pervasive features are often developed in a speculative
branch of the repository. The PMC may grant commit rights on the branch to
its consistent contributors, while the initiative is active. Branch
committers are responsible for shepherding their feature into an active
release and do not cast binding votes or vetoes in the project.

I am +1 on this.

Thanks
Vikram.

Re: Created branch 1.0

2015-02-09 Thread Vikram Dixit K

The build check in HIVE-8933 fixed in HIVE-8845.

On Mon, Feb 9, 2015 at 11:32 AM, Vikram Dixit K vikram.di...@gmail.com
wrote:

 Hi Ed,

 This was the case with 0.14. It was fixed before 1.0 went out in HIVE-8933.

 Thanks
 Vikram.

 On Mon, Feb 9, 2015 at 9:08 AM, Alan Gates alanfga...@gmail.com wrote:

 That's fixed, correct?  I do not believe there were any SNAPSHOT
 dependencies in 1.0.

 Alan.

   Edward Capriolo edlinuxg...@gmail.com
  February 9, 2015 at 8:40
 Because we can not really have a stable api if by definition we build
 around snapshot dependencies.

 On Mon, Feb 9, 2015 at 11:38 AM, Edward Capriolo edlinuxg...@gmail.com
 edlinuxg...@gmail.com

   Edward Capriolo edlinuxg...@gmail.com
  February 9, 2015 at 8:38
 Question.

 https://issues.apache.org/jira/browse/HIVE-8614

 Did we not just agree in this thread that hive will no longer have
 dependencies that are SNAPSHOT?


   Brock Noland br...@cloudera.com
  January 22, 2015 at 22:06
 Hi Alan,

 I agree with Xuefu and what was suggested in your statement. I was
 thinking we'd release the next release as 0.15 and then later there
 would be 1.0 off trunk (e.g. what would have been 0.16) and thus be
 superset (minus anything we intentionally remove).

 As I have said several times, I'd like to release more often so I feel
 we could even start the 1.0 work shortly after the 0.15 release. For
 my part, I do agree with some earlier contributor/user sentiment that
 it would be good to have some basic public API defined for 1.0. I
 don't think that will be too hard as it's more or less obvious what
 our public API is today.

 Hope this seems reasonable.

 Cheers,
 Brock
   Xuefu Zhang xzh...@cloudera.com
  January 22, 2015 at 12:31
 Hi Thejas/Alan,

 From all the argument, I think there was an assumption that the proposed
 1.0 release will be imminent and 0.15 will happen far after that. Based on
 that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
 However, this assumption may not be true. The confusion will be
 significant
 if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.

 Another concern is that the proposed release of 1.0 is a subset of
 Hive's functionality, and for major releases users expect major
 improvements in functionality as well as stability. Deriving 1.0 from the 0.14.1
 release seems to fall short of that expectation.

 Having said that, I'd think it makes more sense to release 0.15 as 0.15,
 and later we release 1.0 as the major release that supersedes any previous
 releases. That will fulfill the expectations of a major release.

 Thanks,
 Xuefu


   Alan Gates ga...@hortonworks.com
  January 22, 2015 at 12:12
  I had one clarifying question for Brock and Xuefu.  Was your proposal to
 still call the branch from trunk you are planning in a few days 0.15 (and
 hence release it as 0.15) and have 1.0 be a later release?  Or did you want
 to call what is now 0.15 1.0?  If you wanted 1.0 to be post 0.15, are you
 ok with stipulating that the next release from trunk after 0.15 (what would
 have been 0.16) is 1.0?

 Alan.




 --
 Nothing better than when appreciated for hard work.
 -Mark




-- 
Nothing better than when appreciated for hard work.
-Mark

Re: Proposal for having branch committers

2015-02-09 Thread Vikram Dixit K

Hi Folks,

Creating a new formal vote thread for this. After looking at the bylaws
page, it looks like we need to have a formal vote on it by the PMC members.

Thanks
Vikram.

On Mon, Feb 9, 2015 at 1:56 PM, Lefty Leverenz leftylever...@gmail.com
wrote:

 +1


 -- Lefty

 On Mon, Feb 9, 2015 at 1:52 PM, Vikram Dixit K vikram.di...@gmail.com
 wrote:

  Hi Folks,
 
  We seem to have quite a few projects going around and in the interest of
  time and the project as a whole, it seems good to have branch committers
  much like what is there in the Hadoop project. I am proposing an addition
  to the committer bylaws as follows (taken from the hadoop project bylaws
  http://hadoop.apache.org/bylaws.html):
 
  Significant, pervasive features are often developed in a speculative
  branch of the repository. The PMC may grant commit rights on the branch
 to
  its consistent contributors, while the initiative is active. Branch
  committers are responsible for shepherding their feature into an active
  release and do not cast binding votes or vetoes in the project.
 
  I am +1 on this.
 
  Thanks
  Vikram.
 




-- 
Nothing better than when appreciated for hard work.
-Mark

VOTE Bylaw for having branch committers in hive

2015-02-09 Thread Vikram Dixit K

Hi Folks,

We seem to have quite a few projects going around and in the interest of
time and the project as a whole, it seems good to have branch committers
much like what is there in the Hadoop project. I am proposing an addition
to the committer bylaws as follows ( taken from the hadoop project bylaws
http://hadoop.apache.org/bylaws.html )

Significant, pervasive features are often developed in a speculative
branch of the repository. The PMC may grant commit rights on the branch to
its consistent contributors, while the initiative is active. Branch
committers are responsible for shepherding their feature into an active
release and do not cast binding votes or vetoes in the project.

Actions: New Branch Committer
Description: When a new branch committer is proposed for the project.
Approval: Lazy Consensus
Binding Votes: Active PMC members
Minimum Length: 3 days
Mailing List: priv...@hive.apache.org

Actions: Removal of Branch Committer
Description: When a branch committer is removed from the project.
Approval: Consensus
Binding Votes: Active PMC members excluding the committer in question if
they are PMC members too.
Minimum Length: 6 days
Mailing List: priv...@hive.apache.org

This vote will run for 6 days. PMC members please vote.

Thanks
Vikram.

Re: Created branch 1.0

2015-02-09 Thread Vikram Dixit K

Hi Ed,

This was the case with 0.14. It was fixed before 1.0 went out in HIVE-8933.

Thanks
Vikram.

On Mon, Feb 9, 2015 at 9:08 AM, Alan Gates alanfga...@gmail.com wrote:

 That's fixed, correct?  I do not believe there were any SNAPSHOT
 dependencies in 1.0.

 Alan.

   Edward Capriolo edlinuxg...@gmail.com
  February 9, 2015 at 8:40
 Because we can not really have a stable api if by definition we build
 around snapshot dependencies.

 On Mon, Feb 9, 2015 at 11:38 AM, Edward Capriolo edlinuxg...@gmail.com
 edlinuxg...@gmail.com

   Edward Capriolo edlinuxg...@gmail.com
  February 9, 2015 at 8:38
 Question.

 https://issues.apache.org/jira/browse/HIVE-8614

 Did we not just agree in this thread that hive will no longer have dependencies
 that are SNAPSHOT?


   Brock Noland br...@cloudera.com
  January 22, 2015 at 22:06
 Hi Alan,

 I agree with Xuefu and what was suggested in your statement. I was
 thinking we'd release the next release as 0.15 and then later there
 would be 1.0 off trunk (e.g. what would have been 0.16) and thus be
 superset (minus anything we intentionally remove).

 As I have said several times, I'd like to release more often so I feel
 we could even start the 1.0 work shortly after the 0.15 release. For
 my part, I do agree with some earlier contributor/user sentiment that
 it would be good to have some basic public API defined for 1.0. I
 don't think that will be too hard as it's more or less obvious what
 our public API is today.

 Hope this seems reasonable.

 Cheers,
 Brock
   Xuefu Zhang xzh...@cloudera.com
  January 22, 2015 at 12:31
 Hi Thejas/Alan,

 From all the argument, I think there was an assumption that the proposed
 1.0 release will be imminent and 0.15 will happen far after that. Based on
 that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
 However, this assumption may not be true. The confusion will be significant
 if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.

 Another concern is that the proposed release of 1.0 is a subset of
 Hive's functionality, and for major releases users expect major
 improvements in functionality as well as stability. Deriving 1.0 from the 0.14.1
 release seems to fall short of that expectation.

 Having said that, I'd think it makes more sense to release 0.15 as 0.15,
 and later we release 1.0 as the major release that supersedes any previous
 releases. That will fulfill the expectations of a major release.

 Thanks,
 Xuefu


   Alan Gates ga...@hortonworks.com
  January 22, 2015 at 12:12
  I had one clarifying question for Brock and Xuefu.  Was your proposal to
 still call the branch from trunk you are planning in a few days 0.15 (and
 hence release it as 0.15) and have 1.0 be a later release?  Or did you want
 to call what is now 0.15 1.0?  If you wanted 1.0 to be post 0.15, are you
 ok with stipulating that the next release from trunk after 0.15 (what would
 have been 0.16) is 1.0?

 Alan.




-- 
Nothing better than when appreciated for hard work.
-Mark

[ANNOUNCE] Apache Hive 1.0.0 Released

2015-02-04 Thread Vikram Dixit K

The Apache Hive team is proud to announce the release of Apache
Hive version 1.0.0.

The Apache Hive (TM) data warehouse software facilitates querying and
managing large datasets residing in distributed storage. Built on top
of Apache Hadoop (TM), it provides:

* Tools to enable easy data extract/transform/load (ETL)

* A mechanism to impose structure on a variety of data formats

* Access to files stored either directly in Apache HDFS (TM) or in other
  data storage systems such as Apache HBase (TM)

* Query execution via Apache Hadoop MapReduce and Apache Tez frameworks.

For Hive release details and downloads, please visit:
https://hive.apache.org/downloads.html

Hive 1.0.0 Release Notes are available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12329278&styleName=Text&projectId=12310843


We would like to thank the many contributors who made this release
possible.

Regards,

The Apache Hive Team

Re: [VOTE] Apache Hive 1.0 Release Candidate 2

2015-02-03 Thread Vikram Dixit K

With 3 +1s from the hive PMC, this vote passes. I will be publishing the
artifacts to the Apache page shortly.

On Sat, Jan 31, 2015 at 1:04 PM, Brock Noland br...@cloudera.com wrote:

 +1

 verified sigs, hashes, verified no SNAPSHOT deps, and ran some queries

 On Fri, Jan 30, 2015 at 7:48 PM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:

  +1
 
  Verified signatures, md5, ran queries from binary and src, compiled src
  with hadoop-1 and hadoop-2, verified for 1.0.0 version numbers
 everywhere,
  no snapshot deps
 
   On Jan 30, 2015, at 5:08 PM, Thejas Nair thejas.n...@gmail.com
 wrote:
  
   +1
  
   - verified signatures and checksum
   - built from source tar.gz
   - ran simple queries from both bin.tar.gz and newly built package
   - Verified RELEASE_NOTES.txt, checked LICENSE,NOTICE, README.txt
   - used schematool to upgrade metastore from hive 0.13.0 to 1.0.0
  
  
  
   On Thu, Jan 29, 2015 at 5:05 PM, Vikram Dixit K 
 vikram.di...@gmail.com
  wrote:
   Apache Hive 1.0 Release Candidate 2 is available here:
  
 
 http://people.apache.org/~vikram/hive/apache-hive-1.0.0-rc2/
  
   Maven artifacts are available here:
  
 
 https://repository.apache.org/content/repositories/orgapachehive-1021/
  
   Source tag for RC2 is at:
  
 
 http://svn.apache.org/repos/asf/hive/tags/release-1.0.0-rc2/
  
   Voting will conclude in 72 hours.
  
   Hive PMC Members: Please test and vote.
  
   Thanks
  
   Vikram.
  
  
   --
   Nothing better than when appreciated for hard work.
   -Mark
 
 




-- 
Nothing better than when appreciated for hard work.
-Mark

[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions


[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297959#comment-14297959
 ] 

Vikram Dixit K commented on HIVE-9436:
--

Committed to RC for 1.0.

 RetryingMetaStoreClient does not retry JDOExceptions
 

 Key: HIVE-9436
 URL: https://issues.apache.org/jira/browse/HIVE-9436
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.0.0, 1.2.0

 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch


 RetryingMetaStoreClient has a bug in the following bit of code:
 {code}
 } else if ((e.getCause() instanceof MetaException) &&
 e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
   caughtException = (MetaException) e.getCause();
 } else {
   throw e.getCause();
 }
 {code}
 The bug here is that Java's String.matches matches the entire string against the 
 regex, so the match fails if the message contains anything before 
 or after JDO[a-zA-Z]*Exception. The solution is simple: we 
 should match (?s).*JDO[a-zA-Z]*Exception.* instead.
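
 The difference between the two patterns can be checked with a small standalone program. This is a sketch for illustration, not the RetryingMetaStoreClient code; the class name JdoMessageMatch, its helper methods, and the sample messages are made up for the example.

```java
// Hypothetical demo class; only String.matches and the two regexes come from the issue.
class JdoMessageMatch {

    static final String WHOLE_STRING = "JDO[a-zA-Z]*Exception";
    static final String ANYWHERE = "(?s).*JDO[a-zA-Z]*Exception.*";

    // String.matches succeeds only if the regex covers the entire message.
    static boolean matchesWhole(String message) {
        return message.matches(WHOLE_STRING);
    }

    // Leading and trailing .* let the JDO exception name appear anywhere;
    // (?s) lets . also match line terminators in multi-line messages.
    static boolean matchesAnywhere(String message) {
        return message.matches(ANYWHERE);
    }

    public static void main(String[] args) {
        String message = "javax.jdo.JDODataStoreException: connection lost\nNestedThrowables: ...";
        System.out.println(matchesWhole(message));                 // false
        System.out.println(matchesAnywhere(message));              // true
        System.out.println(matchesWhole("JDODataStoreException")); // true
    }
}
```

 Note that without the (?s) flag the widened pattern would still fail on this sample message, because . does not match the newline by default.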




[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called


[ 
https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297920#comment-14297920
 ] 

Vikram Dixit K commented on HIVE-9473:
--

+1 for 1.0.0

 sql std auth should disallow built-in udfs that allow any java methods to be 
 called
 ---

 Key: HIVE-9473
 URL: https://issues.apache.org/jira/browse/HIVE-9473
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-9473.1.patch


 As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java 
 methods. This should be disallowed when sql standard authorization is used.




[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions


 [ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9436:
-
Fix Version/s: 1.0.0

 RetryingMetaStoreClient does not retry JDOExceptions
 

 Key: HIVE-9436
 URL: https://issues.apache.org/jira/browse/HIVE-9436
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.0.0, 1.2.0

 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch


 RetryingMetaStoreClient has a bug in the following bit of code:
 {code}
 } else if ((e.getCause() instanceof MetaException) &&
 e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
   caughtException = (MetaException) e.getCause();
 } else {
   throw e.getCause();
 }
 {code}
 The bug here is that Java's String.matches matches the entire string against the 
 regex, so the match fails if the message contains anything before 
 or after JDO[a-zA-Z]*Exception. The solution is simple: we 
 should match (?s).*JDO[a-zA-Z]*Exception.* instead.




[jira] [Commented] (HIVE-9514) schematool is broken in hive 1.0.0


[ 
https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297860#comment-14297860
 ] 

Vikram Dixit K commented on HIVE-9514:
--

+1 LGTM.

 schematool is broken in hive 1.0.0
 --

 Key: HIVE-9514
 URL: https://issues.apache.org/jira/browse/HIVE-9514
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.0.0

 Attachments: HIVE-9514.1.patch


 Schematool gives the following error:
 {code}
 bin/schematool -dbType derby -initSchema
 Starting metastore schema initialization to 1.0
 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified 
 for initialization: 1.0
 {code}
 Metastore schema hasn't changed from 0.14.0 to 1.0.0. So there is no need for 
 new .sql files for 1.0.0. However, schematool needs to be made aware of the 
 metastore schema equivalence.
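
 One way to express this kind of schema equivalence is a small lookup from a release version to the schema version whose .sql files it should reuse. The sketch below only illustrates the idea and is not the attached patch; the class name SchemaVersionAlias, the method name, and the exact version strings in the map are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the schematool implementation.
class SchemaVersionAlias {

    // Release version -> schema version whose scripts it shares (assumed entries).
    private static final Map<String, String> EQUIVALENT_VERSIONS = new HashMap<>();
    static {
        EQUIVALENT_VERSIONS.put("1.0.0", "0.14.0"); // schema unchanged between these releases
    }

    // Returns the schema version to use for a release, falling back to the
    // release version itself when no equivalence is registered.
    static String schemaVersionFor(String hiveVersion) {
        String equivalent = EQUIVALENT_VERSIONS.get(hiveVersion);
        return equivalent != null ? equivalent : hiveVersion;
    }

    public static void main(String[] args) {
        System.out.println(schemaVersionFor("1.0.0"));  // 0.14.0
        System.out.println(schemaVersionFor("0.13.0")); // 0.13.0
    }
}
```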




Re: [VOTE] Apache Hive 1.0 Release Candidate 1

2015-01-29 Thread Vikram Dixit K

Hi Folks

With the issue that Thejas found with the schematool, I need to spin
another RC and cancel this vote. I am including Lefty's webhcat change and
the NOTICE and README.txt file changes as mentioned by Chao Sun as well.

Thanks
Vikram.

On Thu, Jan 29, 2015 at 4:09 PM, Prasanth Jayachandran
pjayachand...@hortonworks.com wrote:

+1.

Checked MD5, signatures, built source with hadoop-1 and 2 profiles, ran
some test queries, no snapshot deps.

On Jan 29, 2015, at 10:04 AM, Alan Gates ga...@hortonworks.com wrote:

+1. Downloaded it, checked out the signatures, did a build, checked there
were no snapshot dependencies.

Alan.

Vikram Dixit K vikram.di...@gmail.com
January 27, 2015 at 14:28
Apache Hive 1.0 Release Candidate 1 is available here:
http://people.apache.org/~vikram/hive/apache-hive-1.0-rc1/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-1020/

Source tag for RC1 is at:
http://svn.apache.org/repos/asf/hive/branches/branch-1.0/

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks

Vikram.

--
Nothing better than when appreciated for hard work.
-Mark

[jira] [Updated] (HIVE-8807) Obsolete default values in webhcat-default.xml


 [ 
https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-8807:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch 1.0

 Obsolete default values in webhcat-default.xml
 --

 Key: HIVE-8807
 URL: https://issues.apache.org/jira/browse/HIVE-8807
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Lefty Leverenz
Assignee: Eugene Koifman
 Fix For: 1.0.0

 Attachments: HIVE8807.patch


 The defaults for templeton.pig.path & templeton.hive.path are 0.11 in 
 webhcat-default.xml but they ought to match current release numbers.
 The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml).
 no precommit tests




[VOTE] Apache Hive 1.0 Release Candidate 2

2015-01-29 Thread Vikram Dixit K

Apache Hive 1.0 Release Candidate 2 is available here:
http://people.apache.org/~vikram/hive/apache-hive-1.0.0-rc2/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-1021/

Source tag for RC2 is at:
http://svn.apache.org/repos/asf/hive/tags/release-1.0.0-rc2/

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks

Vikram.


-- 
Nothing better than when appreciated for hard work.
-Mark

[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml

2015-01-28 Thread Vikram Dixit K (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295829#comment-14295829
 ] 

Vikram Dixit K commented on HIVE-8807:
--

If I end up rolling out a new release and we have a patch for this by then, I 
will include this in the next roll-out.

Thanks
Vikram.

 Obsolete default values in webhcat-default.xml
 --

 Key: HIVE-8807
 URL: https://issues.apache.org/jira/browse/HIVE-8807
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Lefty Leverenz
 Fix For: 0.14.1


 The defaults for templeton.pig.path & templeton.hive.path are 0.11 in 
 webhcat-default.xml but they ought to match current release numbers.
 The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml).




[VOTE] Apache Hive 1.0 Release Candidate 0

2015-01-27 Thread Vikram Dixit K

Apache Hive 1.0 Release Candidate 0 is available here:
http://people.apache.org/~vikram/hive/apache-hive-1.0-rc0/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-1019/

Source tag for RC0 is at:
http://svn.apache.org/repos/asf/hive/branches/branch-1.0/

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks

Vikram.

-- 
Nothing better than when appreciated for hard work.
-Mark

[jira] [Updated] (HIVE-9038) Join tests fail on Tez


 [ 
https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9038:
-
Fix Version/s: 1.0.0

 Join tests fail on Tez
 --

 Key: HIVE-9038
 URL: https://issues.apache.org/jira/browse/HIVE-9038
 Project: Hive
  Issue Type: Bug
  Components: Tests, Tez
Reporter: Ashutosh Chauhan
Assignee: Vikram Dixit K
 Fix For: 1.0.0

 Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch


 Tez doesn't run all tests. But if you run them, the following tests fail with 
 runtime exceptions pointing to bugs.
 * {{auto_join21.q}}
 * {{auto_join29.q}}
 * {{auto_join30.q}}
 * {{auto_join_filters.q}}
 * {{auto_join_nulls.q}} 




[jira] [Updated] (HIVE-9141) HiveOnTez: mix of union all, distinct, group by generates error


 [ 
https://issues.apache.org/jira/browse/HIVE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9141:
-
Fix Version/s: 1.0.0

 HiveOnTez: mix of union all, distinct, group by generates error
 ---

 Key: HIVE-9141
 URL: https://issues.apache.org/jira/browse/HIVE-9141
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.15.0
Reporter: Pengcheng Xiong
Assignee: Navis
 Fix For: 0.15.0, 1.0.0

 Attachments: HIVE-9141.1.patch.txt


 Here is the way to reproduce it
 in the Hive q test setting (with the src table):
 set hive.execution.engine=tez;
 SELECT key, value FROM
   (
     SELECT key, value FROM src
     UNION ALL
     SELECT key, key AS value FROM
       (
         SELECT DISTINCT key FROM
           (
             SELECT key, value FROM
               (
                 SELECT key, value FROM src
                 UNION ALL
                 SELECT key, value FROM src
               ) t1
             GROUP BY key, value
           ) t2
       ) t3
   ) t4
 GROUP BY key, value;
 will generate
 2014-12-16 23:19:13,593 ERROR ql.Driver (SessionState.java:printError(834)) - FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
     at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:361)
     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:87)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
     at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:69)
     at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:368)
     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:202)
     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1155)
     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:206)
     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:158)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:369)
     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:304)
     at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:834)
     at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:136)
     at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_uniontez2(TestMiniTezCliDriver.java:120)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)




Re: Created branch 1.0

2015-01-27 Thread Vikram Dixit K

Hi Folks,

It has been a few days and I have done all the work needed to produce a 1.0
RC and think it is better to have a vote on it. I still hope that we can
have this release as 1.0 and Brock's release as 1.1. At the end of the day,
I think having more releases is a good thing for the community, as is moving
to 1.0 sooner rather than later.

Thanks
Vikram.

On Fri, Jan 23, 2015 at 6:12 PM, Sergey Shelukhin ser...@hortonworks.com
wrote:

 I think the way it is done in Hadoop space is better for Hadoop space (and
 better wrt consistency, us being in the Hadoop space).
 Because no single company or QA process controls or covers all the changes
 to the product, and some changes go unseen by every actor, stabilization
 period is a must...


 And anyway enterprise software on trunk model does not cut releases
 immediately off trunk and ship them. With enterprise software there's
 lengthy QA, with Hadoop there's lengthy cutting edge release.
 How about we cut 1.0 with stable version 0.14.1, and instead of 0.15 do
 2.0, like HBase did?
 We can maintain 1.0 as maintenance release; with 2.0 we can add new
 unstable stuff, and also remove all the old paths we don't care about (old
 Hadoop support, HiveCLI(?), old Java version support) etc.


 On Fri, Jan 23, 2015 at 11:40 AM, Szehon Ho sze...@cloudera.com wrote:

  Wherever I've seen in enterprise software, the trunk-based development
  model has been the standard where all release branches are cut from trunk
  and short-lived.  I've never heard of a case where a branch originally
  designated for 0.14 (minor release) is cut again to become 1.0 (major
  release), and I don't think if you ask anyone they will expect it either.
  There was also no announced plan when cutting 0.14 branch that it was
  eventually going to be 1.0.  As Brock pointed out in the beginning, Hadoop
  branch/versioning is the only exception and an anti-pattern, and all the
  confusion like why 0.xx has features not in 1.0 would not be there if it
  followed this.  I would really hate to see the same anti-pattern happen to
  Hive, so my vote is also against this.  Also this standard release
  branching practice has been in Hive throughout its history, you wouldn't
  make 0.14 out of 0.13 branch, would you?
 
   From the stability and long-term support use-cases that is very definitely
   the wrong thing to do - to cram code into a 1.0 release.
 
  Major release is supposed to be stable.
 
 
  I also don't see how cutting 1.0 from trunk precludes it from stabilizing.
  Also I don't think those arguments of 0.14 as most stable that can be
  backed up, what constitutes stability?  Bug fixes are just one part, in
  that case there are always more bug fixes in later Hive versions than
  earlier ones, so probably API stability is a more measure-able term and
  should be more important to consider.
 
  Thanks,
  Szehon
 
 
  On Fri, Jan 23, 2015 at 10:42 AM, Gopal V gop...@apache.org wrote:
 
   On 1/23/15, 6:59 AM, Xuefu Zhang wrote:
  
   While it's true that a release isn't going to include everything from
   trunk, proposed 1.0 release is branched off 0.14, which was again branched
   from trunk long time ago. If you compare the code base, you will see the
   huge difference.
  
  
   From the stability and long-term support use-cases that is very definitely
   the wrong thing to do - to cram code into a 1.0 release.
  
   The huge difference is *THE* really worrying red-flag.
  
   Or is the thought behind everything from trunk that 1.0 just a number?
  
0.14.1 in terms of functionality and stability will be much clearer,
   meeting the all expectations for a major release.
  
  
   Just to be clear, when hive-14 was released, it was actually a major
   release.
  
   That branch kicked off in Sept and has been updated since then with a
   known set of critical fixes, giving it pedigree and has already seen
   customer time.
  
   In all this discussion, it doesn't sound like you consider 0.15 to be a
   major release - that gives me no confidence in your approach.
  
   Cheers,
   Gopal
  
On Thu, Jan 22, 2015 at 3:08 PM, Thejas Nair the...@hortonworks.com
   wrote:
  
On Thu, Jan 22, 2015 at 12:31 PM, Xuefu Zhang xzh...@cloudera.com wrote:
Hi Thejas/Alan,

From all the argument, I think there was an assumption that the proposed
1.0 release will be imminent and 0.15 will happen far after that. Based on
that assumption, 0.15 will become 1.1, which is greater in scope than 1.0.
However, this assumption may not be true. The confusion will be significant
if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0.
  
   Yes, the assumption is that 1.0 will be out very soon,  before 0.15
   line is ready, and that 0.15 can become 1.1 .
   Do you think that assumption won't hold true ? (In previous emails in
   this thread, I talk about reasons why this assumption is reliable).
   I agree that it does not make sense to release

[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result


 [ 
https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-9053:
-
Fix Version/s: 1.0.0

 select constant in union all followed by group by gives wrong result
 

 Key: HIVE-9053
 URL: https://issues.apache.org/jira/browse/HIVE-9053
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 0.15.0, 0.14.1, 1.0.0

 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, 
 HIVE-9053.03.patch, HIVE-9053.04.patch, HIVE-9053.patch-branch-1.0


 Here is the way to reproduce with a q test:
 select key from (select '1' as key from src union all select key from src)tab 
 group by key;
 will give
 OK
 NULL
 1
 This is not correct as src contains many other keys.
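
 For comparison, the expected semantics of the query above can be sketched
 outside Hive. The snippet below is only an illustration: it uses SQLite
 (through Python's standard sqlite3 module) as a stand-in engine, and the
 function name and the three sample src rows are hypothetical, not the
 actual q test data. It shows that grouping the union should return the
 constant '1' together with every distinct key in src.

```python
import sqlite3


def distinct_keys_with_constant(rows):
    """Run the HIVE-9053 query shape against an in-memory SQLite table.

    `rows` is a list of (key, value) pairs standing in for the src table.
    SQLite is used only as a reference engine; this is not Hive.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (key TEXT, value TEXT)")
    conn.executemany("INSERT INTO src VALUES (?, ?)", rows)
    query = """
        SELECT key FROM (
            SELECT '1' AS key FROM src
            UNION ALL
            SELECT key FROM src
        ) tab
        GROUP BY key
    """
    # Sort in Python so the result does not depend on engine ordering.
    result = sorted(r[0] for r in conn.execute(query))
    conn.close()
    return result


# Hypothetical sample rows; the real src table has many more keys.
sample = [("238", "val_238"), ("86", "val_86"), ("311", "val_311")]
print(distinct_keys_with_constant(sample))
```

 With these sample rows the result contains '1' plus the keys 238, 311 and
 86, whereas the buggy plan described above returns only NULL and 1.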




[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result