A question on how to mark Hive for HCatalog releases
All, I have a question about how we should manage marking Hive source code for HCatalog releases. For HCatalog 0.1 we used the 0.7 version of Hive. But we will get to HCatalog 0.2 before the Hive community gets to 0.8. We have added features to Hive since 0.7 that we need in HCatalog 0.2. In general, it is not reasonable to assume that Hive and HCatalog releases will line up such that HCatalog can always depend on a released version of Hive. So how should we mark the proper version of Hive code for an HCatalog release? The only thing that comes to mind is tagging a particular revision, with the option to branch at that revision if necessary. We would only need a branch if something got checked in post-tag that HCatalog needed or wanted, but there were other intervening check-ins we did not want. In that case we would need to ask you to branch and port the needed change(s). Are you okay with this approach? Are there other approaches you would suggest or prefer? Alan.
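Alan's tag-then-branch-if-needed scheme can be sketched with Subversion commands. The repository below is a throwaway local stand-in for Apache's, and the tag/branch names and revision numbers are made up for illustration:

```shell
# Illustrative only: a local scratch repo standing in for the Apache Hive
# repo; names and revisions are hypothetical.
REPO=/tmp/hive-tagging-demo
rm -rf $REPO && svnadmin create $REPO
svn mkdir -q -m "layout" file://$REPO/trunk file://$REPO/tags file://$REPO/branches

# Tag the exact Hive revision HCatalog 0.2 builds against (cheap copy in svn).
svn copy -q -m "Hive rev used by HCatalog 0.2" \
    file://$REPO/trunk@1 file://$REPO/tags/hcatalog-0.2-base

# Only if a later fix must be ported without the intervening commits:
# branch from the tag, then merge the single needed change onto the branch.
svn copy -q -m "branch for HCatalog 0.2 backports" \
    file://$REPO/tags/hcatalog-0.2-base file://$REPO/branches/hcatalog-0.2

svn ls file://$REPO/tags file://$REPO/branches
```

The key point is that a tag pins an exact revision essentially for free, and the branch only comes into existence later, if a fix has to be ported without picking up intervening check-ins.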
Re: A question on how to mark Hive for HCatalog releases
Alright, qui tacet consentit (except for John, who explicitly agreed). We'll go with this plan. Alan. On Jun 22, 2011, at 12:46 PM, John Sichi wrote: Sounds good to me. JVS On Jun 21, 2011, at 4:21 PM, Alan Gates wrote: [original message quoted in full; see above]
Re: [VOTE] Apache Hive 0.9.0 Release Candidate 2
+1. Ran through our end-to-end test framework (see https://issues.apache.org/jira/browse/HIVE-2670), results look good. Alan. On Apr 24, 2012, at 2:25 PM, Ashutosh Chauhan wrote: Downloaded the bits. Installed on 5 node cluster. Did create table. Ran basic queries. Ran unit tests. All looks good. +1 Thanks, Ashutosh On Tue, Apr 24, 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote: Hey all, Apache Hive 0.9.0 Release Candidate 2 is available here: http://people.apache.org/~hashutosh/hive-0.9.0-rc2/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-094/ Change List is available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742 Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Ashutosh
Re: Turn around on patches that do not need full unit testing
One approach I've seen other projects take is to have an ant test-commit target that users are responsible for running before committing. This is a short (15 min or less) target that runs all true unit tests (tests that exercise just a class or two in isolation) and a couple of functional tests that exercise major functionality but not every last thing. The full test suite can then be run nightly and any issues addressed. Alan. On Jun 11, 2012, at 6:17 AM, Edward Capriolo wrote: I agree. Having a short-test and long-test might make more sense. IE long-test includes funky serdes and UDFs. As for "In the meanwhile, checking in without tests may introduce a bug which can break a production cluster. Costly." - the solution is not to run trunk. Run only releases. All the tests are run by jenkins post commit so we know when trunk is broken, and we should not cut a release if all the tests are not passing. Also we should not knowingly break the build or leave it broken. IE we should strive to have all tests passing on trunk at all times, but not committing a typo patch for fear that the build might break does not make much sense. We can easily revert things in such a case. Edward On Sun, Jun 10, 2012 at 11:14 PM, Gang Liu g...@fb.com wrote: Yeah, it is frustrating that a tiny change takes a long time to turn around. That is understood. In the meanwhile, checking in without tests may introduce a bug which can break a production cluster. Costly. I think the problem is not whether we should run tests but that running tests takes a long time. If it took a reasonable time, like 30 minutes, we would have less pain. In summary, let us keep quality high by running tests for every commit, and aim to make the unit tests fast. Btw, we can run tests in parallel; the Hive wiki has details. Thanks Sent from my iPhone On Jun 10, 2012, at 7:29 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Hive's unit tests take a long time.
There are many simple patches we could get into hive earlier if we dropped the notion of running the full test suite to QA every patch. For example: https://issues.apache.org/jira/browse/HIVE-3081 -- spelling mistakes (typos) https://issues.apache.org/jira/browse/HIVE-3061 -- patches with code cleanup https://issues.apache.org/jira/browse/HIVE-3048 -- patches that are one or two lines of code https://issues.apache.org/jira/browse/HIVE-2288 -- patches that are only additive Also, I do not believe we should kick a patch back to someone for every tiny change. For example, suppose someone commits 9000 lines of code with one typo. I have seen similar situations where the status gets reverted back to OPEN. It takes the person working on it a day to get back into the patch again, and by the time someone comes back around to reviewing, another 3 days might go by. This is similar to a supermarket where you can only use one coupon, so people walk in and out of the store 6 times to buy 6 items. Procedure and rules are followed, the end result is the same, but it is 6 times the work. In this case the committer should just make the change, re-upload the patch, say 'committed with typo fixed', and commit. Please comment, Edward
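The test-commit idea discussed above might look something like the wrapper below. The suite names and the split between fast and nightly runs are purely illustrative; they are not actual Hive build targets:

```shell
#!/bin/sh
# Hypothetical pre-commit gate: run only the fast, true unit-test suites
# before commit, and leave the full (multi-hour) suite to the nightly build.
# Suite names are made up for illustration, not real Hive ant targets.
FAST_SUITES="common serde shims"

for suite in $FAST_SUITES; do
  echo "pre-commit: running fast suite: $suite"
done
echo "pre-commit: full suite deferred to nightly CI"
```

The design choice here is the one Alan describes: keep the pre-commit loop under 15 minutes by running only isolated unit tests plus a couple of smoke tests, and catch the rest nightly.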
Fwd: DesignLounge @ HadoopSummit
Begin forwarded message: From: Eric Baldeschwieler eri...@hortonworks.com Date: June 11, 2013 10:46:25 AM PDT To: common-...@hadoop.apache.org Subject: DesignLounge @ HadoopSummit Reply-To: common-...@hadoop.apache.org Hi Folks, We thought we'd try something new at Hadoop Summit this year to build upon two pieces of feedback I've heard a lot this year: • Apache project developers would like to take advantage of the Hadoop summit to meet with their peers to work on specific technical details of their projects • They want to do this during the summit, not before it starts or at night. I've been told BoFs and other such traditional formats have not historically worked for them, because they end up being about educating users about their projects, not actually working with their peers on how to make their projects better. So we are creating a space in the summit - marked in the event guide as DesignLounge - concurrent with the presentation tracks where Apache Project contributors can meet with their peers to plan the future of their project or work through various technical issues near and dear to their hearts. We're going to provide white boards and message boards and let folks take it from there in an unconference style. We think there will be room for about 4 groups to meet at once. Interested? Let me know what you think. Send me any ideas for how we can make this work best for you. The room will be 231A and B at the Hadoop Summit and will run from 10:30am to 5:00pm on Day 1 (26th June), and we can also run from 10:30am to 5:00pm on Day 2 (27th June) if we have a lot of topics that folks want to cover. Some of the early topics some folks told me they hope can be covered: • Hadoop Core security proposals. There are a couple of detailed proposals circulating. Let's get together and hash out the differences. • Accumulo 1.6 features • The Hive vectorization project.
Discussion of the design and how to phase it in incrementally with minimum complexity. Finishing Yarn - what things need to get done NOW to make Yarn more effective If you are a project lead for one of the Apache projects, look at the schedule below and suggest a few slots when you think it would be best for your project to meet. I'll try to work out a schedule where no more than 2 projects are using the lounge at once. Day 1, 26th June: 10:30am - 12:30pm, 1:45pm - 3:30pm, 3:45pm - 5:00pm Day 2, 27th June: 10:30am - 12:30pm, 1:45pm - 3:30pm, 3:45pm - 5:00pm It will be up to you, the hadoop contributors, from there. Look forward to seeing you all at the summit, E14 PS Please forward to the other -dev lists. This event is for folks on the -dev lists.
Fwd: DesignLounge @ HadoopSummit
Begin forwarded message: From: Eric Baldeschwieler eri...@hortonworks.com Date: June 23, 2013 9:32:12 PM PDT To: common-...@hadoop.apache.org, mapreduce-...@hadoop.apache.org, hdfs-...@hadoop.apache.org Subject: DesignLounge @ HadoopSummit Reply-To: common-...@hadoop.apache.org Hi Folks, I've integrated the feedback I've gotten on the design lounge. A couple of clarifications: 1) The space will be open both days of the formal summit. Apache committers / contributors are invited to stop by and use the space to meet / network any time during the show. 2) Below I've listed the times that various project members have suggested they will be present to talk with other contributors about their project. If we get a big showing for any of these slots we'll encourage folks to do the unconference thing: select a set of topics they want to talk about and break up into groups to do so. 3) This is an experiment. Our goal is to make the summit as useful as possible to the folks who build the Apache projects in the Apache Hadoop stack. Please let me know how it works for you and ideas for making this even more effective. Committed times so far, with topic champion (Note - I've adjusted suggested times to fit with the program a bit more smoothly): Wednesday 11-1 - Hive - Ashutosh - The Stinger initiative and other Hive activities 2 - 4 - Security breakout - Kevin Minder - HSSO, Knox, Rhino 3 - 4 - Frameworks to run services like HBase on Yarn - Weave, Hoya … - Devaraj Das 4 - 5 - Accumulo - Billie Rinaldi Thursday 11-1 - Finishing Yarn - Arun Murthy - Near term improvements needed 2 - 4 - HDFS - Suresh Sanjay 4 - 5 - Getting involved in Apache - Billie Rinaldi See you all soon! E14 PS Please forward to other Apache -dev lists and CC me. Thanks!
On Jun 11, 2013, at 10:42 AM, Eric Baldeschwieler eri...@hortonworks.com wrote: [original DesignLounge announcement, quoted in full; see the first forward above]
Re: call it Hive-SQL instead of HiveQL ?
I'm +1 for calling it Hive SQL. No one knows what HQL is when they see the initials. Hive Query Language? Hadoop Query Language? Harold's Query Language? I agree with Ed that we should be up front about what Hive is and isn't and about where it's going and where it isn't. Whenever people ask me if being fully SQL-92 or SQL-2003 compliant or whatever is a goal, I always say no. There's stuff in those specs Hive probably will never do. But to me that doesn't mean it isn't SQL. Apache Derby calls its access language SQL, yet it doesn't support outer joins or tinyint or a number of other things Hive does. SQLite calls its access language SQL and it has similar restrictions. People understand that every data store has a different dialect of SQL. Hive's dialect is particularly crude in some respects (lacking some standard features and datatypes) and doing anything real requires concepts not known in other SQL dialects (like what SerDe you want your table to use). Some of these we can address and some are a part of being on Hadoop. One final analogy. When a child is learning a language they 1) don't know as many words as an adult does and often don't understand adult usage even when they know all the words; and 2) use made up/nonsense words. Yet no one says the child doesn't speak the language or speaks a different language. You just recognize that the child is growing and learning the language. How is Hive different? It is growing and adding more parts of SQL all the time. Alan. On Jul 3, 2013, at 12:26 AM, Thejas Nair wrote: On Tue, Jul 2, 2013 at 8:39 PM, Edward Capriolo edlinuxg...@gmail.com wrote: What is in a name? :) Which SQL feature are you talking about here that forces a single reducer and hence should not be supported?
Joining on anything besides = comes to mind. Pretty sure the query mentioned here will not work (without being re-written) http://en.wikipedia.org/wiki/SQL SELECT isbn, title, price FROM Book WHERE price < (SELECT AVG(price) FROM Book) ORDER BY title; Don't you think hive should be supporting this? Don't you think our users would want this? You can do theta joins without using a single reducer (a cartesian product can be done in parallel). But that is beside the point. I don't expect hive to be 100% sql compliant. I don't see 100% sql compliance as a goal, but I see more SQL compliance as desirable. That is why I prefer the term Hive-SQL. Hive-SQL looks like it is trying to convey the idea that hive supports extensions like T-SQL http://en.wikipedia.org/wiki/Transact-SQL or PL/SQL http://www.oracle.com/technetwork/database/features/plsql/index.html. If I refer to something as Oracle-SQL or DB2-SQL, I think people understand that it is an Oracle or DB2 dialect of SQL that I refer to. Lessons from my mother: You can't be half a saint. Considering how much other databases deviate from the standard - http://troels.arvin.dk/db/rdbms/ - see how much deviation there is, for example, in the 'limit clause' or in the data types supported (and the details of data type support). If all your friends jumped off a bridge, would you do it? My friends are very smart; if they jump off the bridge, there is probably a very good reason to do so, and I would seriously consider it. I think hive has many smart friends like DB2, Oracle, teradata, vertica, impala, and even phoenix (https://github.com/forcedotcom/phoenix). As you can see, there is a wide range in SQL compliance across products. I don't see anything wrong in saying that hive is SQL on hadoop. I think I have conveyed everything I wanted to say on this topic. I will stop and listen to what others think before we go from half saints and jumping off the bridge to Hitler :) (http://en.wikipedia.org/wiki/Godwin's_law) (there I said it!!)
I am looking forward to hearing if anybody else thinks calling it Hive-SQL will make them confuse it with something like PL/SQL. I also want to know if others think calling it HiveQL gives more clarity about it aiming to be SQL on hadoop. Thanks, Thejas
Re: Tez branch and tez based patches
On Jul 13, 2013, at 9:48 AM, Edward Capriolo wrote: I have started to see several refactoring patches around tez. https://issues.apache.org/jira/browse/HIVE-4843 This is the only mention on the hive list I can find with tez: "Makes sense. I will create the branch soon. Thanks, Ashutosh" On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: Hi, I am starting to work on integrating Tez into Hive (see HIVE-4660; the design doc has already been uploaded - any feedback will be much appreciated). This will be a fair amount of work that will take time to stabilize/test. I'd like to propose creating a branch in order to be able to do this incrementally and collaboratively. In order to progress rapidly with this, I would also like to go commit-then-review. Thanks, Gunther. These refactorings are largely destructive to a number of bug fixes and language improvements in hive. The language improvements and bug fixes have been sitting in Jira for quite some time now, marked patch-available, and are waiting for review. There are a few things I want to point out: 1) Normally we create design docs in our wiki (which this is not) 2) Normally when the change is significantly complex we get multiple committers to comment on it (which we did not) On point 2 no one -1'd the branch, but this is really something that should have required a +1 from 3 committers. The Hive bylaws, https://cwiki.apache.org/confluence/display/Hive/Bylaws , lay out what votes are needed for what. I don't see anything there about needing 3 +1s for a branch. Branching would seem to fall under code change, which requires one vote and a minimum length of 1 day. I for one am not completely sold on Tez. http://incubator.apache.org/projects/tez.html. "directed-acyclic-graph of tasks for processing data" - this description sounds like many things which have never become popular. One to think of is oozie: "Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions."
I am sure I can find a number of libraries/frameworks that make this same claim. In general I do not feel like we have done our homework and pre-requisites to justify all this work. If we have done the homework, I am sure that it has not been communicated to and accepted by hive developers at large. A request for better documentation on Tez and a project road map seems totally reasonable. If we have a branch, why are we also committing on trunk? Scanning through the tez doc, the only language I keep finding is language like "minimal changes to the planner", yet there are ALREADY lots of large changes going on! Really, none of the above would bother me except for the fact that these minimal changes are causing many patch-available, ready-for-review bugs and core hive features to need to be rebased. I am sure I have mentioned this before, but I have to spend 12+ hours to test a single patch on my laptop. A few days ago I was testing a new core hive feature. After all the tests passed and before I was able to commit, someone unleashed a tez patch on trunk which caused the thing I was testing for 12 hours to need to be rebased. I'm not cool with this. Next time that happens to me I will seriously consider reverting the patch. Bug fixes and new hive features are more important to me than integrating with incubator projects. (With my Apache member hat on) Reverting patches that aren't breaking the build is considered very bad form in Apache. It does make sense to request that when people are going to commit a patch that will break many other patches, they first give a few hours of notice so people can say something if they're about to commit another patch and avoid your fate of needing to rerun the tests. The other thing is we need to get the automated build of patches working on Hive so committers are not forced to run all of the tests themselves. We are working on it, but we're not there yet. Alan.
Re: Tez branch and tez based patches
Ed, I'm not sure I understand your argument, so I'm going to try to restate it. Please tell me if I understand it correctly. I think you're saying we should not embark on big projects in Hive because: 1) There were big projects in the past that were abandoned or are not currently making progress (such as Oracle integration, Hive StorageHandler) 2) There are other big projects going on (ORC, Vectorization) 3) There are lots of outstanding patches that need to be dealt with. I would respond with two points to this. First, I agree that the large outstanding patch count is very bad. It keeps people from getting involved in Hive. It deprives Hive of fixes and improvements it would otherwise have. Several of the committers are working to address this by checking in peoples' patches, but they are unable to keep up. The best solution is to encourage other committers to check in patches as well and to find willing and able contributors and mentor them to committership as quickly as possible. Second, the way Apache works is that contributors scratch the itch that bothers them. So to argue "We shouldn't do X because we never finished Y" or "We shouldn't do X because we're doing Y" (where X and Y are independent) is not valid in Apache projects. It's fine to argue that Tez hasn't been adequately explained (I think you hinted at this in previous emails) and ask for clarifications on what it is and what the planned changes are. If after a full explanation you think it's a bad idea, it's fine to argue that Tez is the wrong direction for Hive and try to convince the rest of the community. But assuming the community accepts that Tez is a reasonable direction and there are volunteers who want to do the work, then you can't argue they should work on something else instead. Alan. On Jul 15, 2013, at 6:51 PM, Edward Capriolo wrote: The Hive bylaws, https://cwiki.apache.org/confluence/display/Hive/Bylaws, lay out what votes are needed for what.
I don't see anything there about needing 3 +1s for a branch. Branching would seem to fall under code change, which requires one vote and a minimum length of 1 day. You could argue that all you need is one +1 to create a branch, but this is more than a branch. If you are talking about something that is: 1) going to cause major refactoring of critical pieces of hive like ExecDriver and MapRedTask 2) going to be very disruptive to the efforts of other committers 3) something that may be a major architectural change, then getting the project on board with the idea is a good idea. Now I want to point something out. Here are some recent initiatives in hive: 1) At one point there was a big initiative to support oracle; after the initial work there are patches in Jira, and no one seems to care about oracle support. 2) Another such decision was the support-windows one; there are probably 4 windows patches waiting for review. 3) I still have no clue what the official hadoop1 / hadoop2 / hadoop 0.23 support perspective is, but every couple of weeks we get another jira about something not working/testing on one of those versions; it seems like several builds are broken. 4) Hive storage handlers: after the initial implementation no one cares to review any other storage handler implementation; there are 3 patches or more there, and I could not even find anyone willing to review the cassandra storage handler I spent months on. 5) ORC, Vectorization 6) Windowing: committed, with numerous check-style violations. We have !!!160+!!! PATCH_AVAILABLE Jira issues. Few active committers. We are spread very thin, and embarking on another side project not involved with core hive seems like the wrong direction at the moment. On Mon, Jul 15, 2013 at 8:37 PM, Alan Gates ga...@hortonworks.com wrote: [previous message quoted above]
Re: Tez branch and tez based patches
On Jul 17, 2013, at 1:41 PM, Edward Capriolo wrote: In my opinion we should limit the amount of tez related optimizations and refactoring on trunk. Refactoring that cleans up code is good, but as you have pointed out there won't be a tez release until sometime this fall, and this branch will be open for an extended period of time. Thus code cleanups and other tez related refactoring do not need to be disruptive to trunk. I agree with this, though I suspect people will end up arguing about the meaning of "code cleanup" and "disruptive". In my discussions with Gunther he said he was doing code cleanup and it was not disruptive. You obviously disagreed. I've already suggested that any future patches that break lots of others should have their checkin preceded by a few hours' notice that the patch will break things, so others can say something if they are about to check in too. I'd also be interested to hear from Gunther how much more general cleanup he feels is necessary on trunk. I have another relevant question, which I probably already know the answer to, but I will ask it anyway. Because tez is a YARN application, does this mean that Tez will be the first hive feature that will require YARN? (It seems like the answer is yes.) Yes, it will only work in the Hadoop 2.x world. So obviously all this work needs to be done in a way that still allows Hive to use the MR execution engine in the Hadoop 1.x world. Alan.
Re: HIVE-4266 - Refactor HCatalog code to org.apache.hive.hcatalog
It won't be committed Friday. The patch will need to sit a few days for people to look over it. I doubt it will be committed before 8/5. Alan. On Jul 24, 2013, at 10:37 AM, Eugene Koifman wrote: I'm hoping to have the patch ready Friday, not sure if it will get committed the same day or not. On Wed, Jul 24, 2013 at 7:27 AM, Brock Noland br...@cloudera.com wrote: Hi, What day do you plan on doing this? I am working on a change which I don't believe will be ready this week. Cheers! Brock On Tue, Jul 23, 2013 at 1:09 PM, Eugene Koifman ekoif...@hortonworks.com wrote: I'm planning to change the package name of all hcatalog classes sometime this week (as was promised for 0.12). This is likely to affect any outstanding hcatalog patches on trunk. Please try to have them checked in as soon as possible. Thanks, Eugene -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Re: [Discuss] project chop up
I'm not sure how this is different from what hcat does today. It needs Hive's jars to compile, so it's one of the last things in the compile step. Would moving the other modules you note to be in the same category be enough? Did you want to also make it so that the default ant target doesn't compile those? Alan. On Jul 26, 2013, at 4:09 PM, Edward Capriolo wrote: My mistake on saying hcat was a fork of the metastore. I had a brain fart for a moment. One way we could do this is create a folder called downstream. In our release step we can execute the downstream builds and then copy the files we need back. So nothing downstream will be on the classpath of the main project. This could help us break up ql as well. Things like exotic file formats, and things that are pluggable like zk locking, can go here. That might be overkill. For now we can focus on building downstream, and hive thrift 1 might be the first thing to try to downstream. On Friday, July 26, 2013, Thejas Nair the...@hortonworks.com wrote: +1 to the idea of making the build of core hive and other downstream components independent. bq. I was under the impression that Hcat and hive-metastore was supposed to merge up somehow. The metastore code was never forked. Hcat was just using hive-metastore and making the metadata available to the rest of hadoop (pig, java MR..). A lot of the changes that were driven by hcat goals were being made in hive-metastore. You can think of hcat as a set of libraries that let pig and java MR use the hive metastore. Since hcat is closely tied to hive-metastore, it makes sense to have them in the same project. On Fri, Jul 26, 2013 at 6:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Also I believe hcatalog web can fall into the same designation. Question: hcatalog was initially a big hive-metastore fork. I was under the impression that Hcat and hive-metastore were supposed to merge up somehow. What is the status on that? I remember that was one of the core reasons we brought it in.
On Friday, July 26, 2013, Edward Capriolo edlinuxg...@gmail.com wrote: I prefer option 3 as well. On Fri, Jul 26, 2013 at 12:52 AM, Brock Noland br...@cloudera.com wrote: On Thu, Jul 25, 2013 at 9:48 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I have been developing on a dual-core 2 GB RAM laptop for years now. With the addition of hcatalog, hive-thrift2, and some other growth, trying to develop hive in Eclipse on this machine crawls, especially if 'build automatically' is turned on. As we look to add on more things this is only going to get worse. I am also noticing issues like this: https://issues.apache.org/jira/browse/HIVE-4849 What I think we should do is strip down/out optional parts of hive. 1) Hive Hbase: This should really be its own project; to do this right we really have to have multiple branches, since hbase is not backwards compatible. 2) Hive Web Interface: Not really a big project, but not really critical; it can just as easily be built separately. 3) hive thrift 1: We have hive thrift 2 now; it is time for the sun to set on hive thrift 1. 4) odbc: Not entirely convinced about this one, but it is really not critical to running hive. What I think we should do is create sub-projects for the above things or simply move them into directories that do not build with hive. Ideally they would use maven to pull dependencies. What does everyone think? I agree that projects like the HBase handler and probably others as well should somehow be downstream projects which simply depend on the hive jars. I see a couple alternatives for this: * Take the module in question to the Apache Incubator * Move the module in question to the Apache Extras * Break up the projects within our own source tree I'd prefer the third option at this point. Brock
Re: Tez branch and tez based patches
Which talk are you referencing here? AFAIK all the Hive code we've written is being pushed back into the Tez branch, so you should be able to see it there. Alan. On Jul 29, 2013, at 9:02 PM, Edward Capriolo wrote: At ~25:00: There is a working prototype of hive which is using tez as the targeted runtime. Can I get a look at that code? Is it on github? Edward On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates ga...@hortonworks.com wrote: Answers to some of your questions inlined. Alan. On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote: There are some points I want to bring up. First, I am on the PMC. Here is something I find relevant: http://www.apache.org/foundation/how-it-works.html -- The role of the PMC from a Foundation perspective is oversight. The main role of the PMC is not code and not coding - but to ensure that all legal issues are addressed, that procedure is followed, and that each and every release is the product of the community as a whole. That is key to our litigation protection mechanisms. Secondly the role of the PMC is to further the long term development and health of the community as a whole, and to ensure that balanced and wide scale peer review and collaboration does happen. Within the ASF we worry about any community which centers around a few individuals who are working virtually uncontested. We believe that this is detrimental to quality, stability, and robustness of both code and long term social structures. https://blogs.apache.org/comdev/entry/what_makes_apache_projects_different - All other decisions happen on the dev list, discussions on the private list are kept to a minimum. If it didn't happen on the dev list, it didn't happen - which leads to: a) Elections of committers and PMC members are published on the dev list once finalized. b) Out-of-band discussions (IRC etc.) are summarized on the dev list as soon as they have impact on the project, code or community.
- https://issues.apache.org/jira/browse/HIVE-4660, ironically titled Let there be Tez, has not been +1'ed by any committer. It was never discussed on the dev or the user list (as far as I can tell). As all JIRA creations and updates are sent to dev@hive, creating a JIRA is de facto posting to the list. As a PMC member I feel we need more discussion on Tez on the dev list, along with a wiki-fied design document. Topics of discussion should include: I talked with Gunther and he's working on posting a design doc on the wiki. He has a PDF on the JIRA but he doesn't have write permissions yet on the wiki. 1) What is tez? In Hadoop 2.0, YARN opens up the ability to have multiple execution frameworks in Hadoop. Hadoop apps are no longer tied to MapReduce as the only execution option. Tez is an effort to build an execution engine that is optimized for relational data processing, such as Hive and Pig. The biggest change here is to move away from only Map and Reduce as processing options and to allow alternate combinations of processing, such as map - reduce - reduce, or tasks that take multiple inputs, or shuffles that avoid sorting when it isn't needed. For a good intro to Tez, see Arun's presentation on it at the recent Hadoop Summit (video http://www.youtube.com/watch?v=9ZLLzlsz7h8 slides http://www.slideshare.net/Hadoop_Summit/murhty-saha-june26255pmroom212) 2) How is tez different from oozie, http://code.google.com/p/hop/, http://cs.brown.edu/~backman/cmr.html , and other DAG and/or streaming map reduce tools/frameworks? Why should we use this and not those? Oozie is a completely different thing. Oozie is a workflow engine and a scheduler. Its core competencies are the ability to coordinate workflows of disparate job types (MR, Pig, Hive, etc.) and to schedule them. It is not intended as an execution engine for apps such as Pig and Hive.
I am not familiar with these other engines, but the short answer is that Tez is built to work on YARN, which works well for Hive since it is tied to Hadoop. 3) When can we expect the first tez release? I don't know, but I hope sometime this fall. 4) How much effort is involved in integrating hive and tez? Covered in the design doc. 5) Who is ready to commit to this effort? I'll let people speak for themselves on that one. 6) can we expect this work to be done in one hive release? Unlikely. Initial integration will be done in one release, but as Tez is a new project I expect it will be adding features in the future that Hive will want to take advantage of. In my opinion we should not start any work on this tez-hive until these questions are answered to the satisfaction of the hive developers. Can we change this to not commit patches? We can't tell willing people not to work
Re: Tez branch and tez based patches
On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote: Also watched http://www.ustream.tv/recorded/36323173 I definitely see the win in being able to stream inter-stage output. I see some cases where small intermediate results can be kept in memory. But I was somewhat under the impression that the map reduce spill settings kept stuff in memory; isn't that what the spill settings are for? No. MapReduce always writes shuffle data to local disk. And intermediate results between MR jobs are always persisted to HDFS, as there's no other option. When we talk of being able to keep intermediate results in memory, we mean getting rid of both of these disk writes/reads when appropriate (meaning not always; there's a trade-off between speed and error handling to be made here, see below for more details). There are a few bullet points that came up repeatedly that I do not follow: Something was said to the effect of Container reuse makes X faster. Hadoop has jvm reuse. Not following what the difference is here? Not everyone has a 10K node cluster. Sharing JVMs across users is inherently insecure (we can't guarantee what code the first user left behind that may interfere with later users). As I understand container re-use in Tez, it constrains the re-use to one user for security reasons, but still avoids additional JVM start-up costs. But this is a question that the Tez guys could answer better on the Tez lists (d...@tez.incubator.apache.org) Joins in map reduce are hard Really? I mean some of them are I guess, but the typical join is very easy. Just shuffle by the join key. There was not really enough low-level detail here saying why joins are better in tez. Join is not a natural operation in MapReduce. MR gives you one input and one output. You end up having to bend the rules to have multiple inputs. The idea here is that Tez can provide operators that naturally work with joins and other operations that don't fit the one input/one output model (e.g. unions).
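The "shuffle by the join key" approach mentioned above can be sketched in miniature. This toy Python snippet (not Hive or Hadoop code; the record data and function name are made up for illustration) simulates the classic MapReduce reduce-side join: records from both inputs are tagged with their source, grouped by the join key, and paired in the "reducer":

```python
from collections import defaultdict

# Hypothetical records: (join_key, payload) from two separate inputs.
orders = [("u1", "order-1"), ("u2", "order-2"), ("u1", "order-3")]
users = [("u1", "alice"), ("u2", "bob")]

def reduce_side_join(left, right):
    """Simulate a reduce-side join: tag each record with its source,
    shuffle by the join key, then pair tagged values per key."""
    shuffled = defaultdict(list)
    for key, value in left:
        shuffled[key].append(("L", value))   # tag to tell the inputs apart
    for key, value in right:
        shuffled[key].append(("R", value))
    joined = []
    for key, tagged in shuffled.items():     # one "reduce" call per key
        lefts = [v for tag, v in tagged if tag == "L"]
        rights = [v for tag, v in tagged if tag == "R"]
        for lv in lefts:
            for rv in rights:
                joined.append((key, lv, rv))
    return sorted(joined)

print(reduce_side_join(orders, users))
```

The "bending the rules" Alan refers to is exactly the tagging step: MR itself sees only one logical input, so the job must smuggle the source of each record into the value.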
Choosing the number of maps and reduces is hard Really? I do not find it that hard. I think there are times when it's not perfect, but I do not find it hard. The talk did not really offer anything technical here on how tez makes this better, other than that it could make it better. Perhaps manual would be a better term here than hard. In our experience it takes quite a bit of engineer trial and error to determine the optimal numbers. This may be ok if you're going to invest the time once and then run the same query every day for 6 months. But obviously it doesn't work for the ad hoc case. Even in the batch case it's not optimal, because every once in a while an engineer has to go back and re-optimize the query to deal with changing data sizes, data characteristics, etc. We want the optimizer to handle this without human intervention. The presentations mentioned streaming data; how do two nodes stream data between tasks, and how is it reliable? If the sender or receiver dies does the entire process have to start again? If the sender or receiver dies then the query has to be restarted from some previous point where data was persisted to disk. The idea here is that speed vs error recovery trade-offs should be made by the optimizer. If the optimizer estimates that a query will complete in 5 seconds it can stream everything, and if a node fails it just re-runs the whole query. If it estimates that a particular phase of a query will run for an hour, it can choose to persist the results to HDFS so that in the event of a failure downstream the long phase need not be re-run. Again we want this to be done automatically by the system so the user doesn't need to control this level of detail. Again, one of the talks implied there is a prototype out there that launches hive jobs into tez. I would like to see that; it might answer more questions than a PowerPoint, and I could profile some common queries.
As mentioned in a previous email, afaik Gunther's pushed all these changes to the Tez branch in Hive. Alan. Random late night thoughts over, Ed
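On the "choosing the number of reduces" question discussed above: Hive's usual answer is a bytes-per-reducer heuristic (the hive.exec.reducers.bytes.per.reducer setting, capped by hive.exec.reducers.max). This Python sketch mirrors the style of that estimate; the function name and default values are illustrative, not Hive's actual implementation:

```python
import math

def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=999):
    """One reducer per bytes_per_reducer of estimated input, capped at
    max_reducers. A sketch of the hive.exec.reducers.bytes.per.reducer
    style of heuristic; the defaults here are illustrative."""
    if input_bytes <= 0:
        return 1
    return max(1, min(max_reducers, math.ceil(input_bytes / bytes_per_reducer)))

print(estimate_reducers(5_500_000_000))  # 5.5 GB of input -> 6 reducers
```

The "manual" pain Alan describes is that input_bytes is only a static estimate: as data sizes and skew change, the constants have to be re-tuned by hand, which is exactly what a cost-based optimizer is meant to absorb.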
Re: [VOTE] Apache Hive 0.12.0 Release Candidate 0
There's already a JIRA for this, https://issues.apache.org/jira/browse/HIVE-4731; it just needs a patch. Given that Brock is working to move the build to Maven, we should wait until that is done before adding this to the build. Alan. On Oct 8, 2013, at 5:13 PM, Mark Grover wrote: Thejas, Thanks for working on the Hive 0.12 release! I work on Apache Bigtop http://bigtop.apache.org and we build rpm and deb packages by building and packaging the source tarballs. Most components (if not all) release a source tarball. Releasing a source tarball would make Hive consistent with other projects in terms of what is released, and would make life easier for those users who may not want binaries (like some Hive developers and Bigtop). I don't know how much work it will be, but both I personally and the larger Bigtop community would greatly appreciate it if Hive released a source tarball for the 0.12 release. Would love to hear what you think. Thanks again! Mark On Tue, Oct 8, 2013 at 3:56 PM, Thejas Nair the...@hortonworks.com wrote: On Tue, Oct 8, 2013 at 8:18 AM, Brock Noland br...@cloudera.com wrote: Hi Thejas, Again thank you very much for all the hard work! Two items of discussion: The tag contains .gitignore files so I believe the source tarball (src/ directory) should as well. It is strange that other files with a . prefix do get included (.checkstyle, .arcconfig), but .gitignore doesn't get included. This might be a wider item than the current release. However, our source tarball actually contains all the hive-*.jar files in addition to all the libraries. Beyond that, the source tarball doesn't match the tag structure; the src directory of the source tarball does. I think we should change this at some point so the source tarball structure exactly matches the tag. Yes, I think we should address this for the next release. It might take some time to get this done right.
Brock On Mon, Oct 7, 2013 at 11:02 PM, Thejas Nair the...@hortonworks.com wrote: Carl pointed out some issues with the RC. I will be rolling out a new RC to address those (hopefully sometime tomorrow). If anybody finds additional issues, please let me know, so that I can address those as well in the next RC. HIVE-5489 - NOTICE copyright dates are out of date HIVE-5488 - some files are missing apache license headers On Mon, Oct 7, 2013 at 4:38 PM, Thejas Nair the...@hortonworks.com wrote: Yes, that is the correct tag. Thanks for pointing it out. I also updated the tag as it was a little behind what is in the RC (found some issues with maven-publish). I have also updated the release vote email template in the hive HowToRelease wiki page, to include a note about the tag. Thanks, Thejas On Mon, Oct 7, 2013 at 4:26 PM, Brock Noland br...@cloudera.com wrote: Hi Thejas, Thank you very much for the hard work! I believe the vote email should contain a link to the tag we are voting on. I assume the tag is: release-0.12.0-rc0 ( http://svn.apache.org/viewvc/hive/tags/release-0.12.0-rc0/). Is that correct? Brock On Mon, Oct 7, 2013 at 6:02 PM, Thejas Nair the...@hortonworks.com wrote: Apache Hive 0.12.0 Release Candidate 0 is available here: http://people.apache.org/~thejas/hive-0.12.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-138/ This release has 406 issues fixed. This includes several new features such as the data types date and varchar, optimizer improvements, ORC format improvements, and many bug fixes. Hcatalog packages have now moved to org.apache.hive.hcatalog (from org.apache.hcatalog), and the maven packages are published under org.apache.hive.hcatalog. Voting will conclude in 72 hours. Hive PMC Members: Please test and vote.
Thanks, Thejas -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Re: [DISCUSS] HCatalog becoming a subproject of Hive
Changes made. Alan. On Jan 21, 2013, at 12:39 PM, Carl Steinbach wrote: Hi Alan, Overall this looks good to me. I have a couple small suggestions: * Replace occurrences of Hive's subversion repository with Hive's source code repository. * In the Actions table the sentence This also covers the creation of new sub-projects within the project should be changed to This also covers the creation of new sub-projects and sub-modules within the project. Thanks. Carl On Fri, Jan 18, 2013 at 4:42 PM, Alan Gates ga...@hortonworks.com wrote: I've created a wiki page for my proposed changes at https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers Text to be removed is struck through. Text to be added is in italics. Any recommended changes before we vote? Alan. On Jan 17, 2013, at 2:08 PM, Carl Steinbach wrote: Sounds like a good plan to me. Since Ashutosh is a member of both the Hive and HCatalog PMCs it probably makes more sense for him to call the vote, but I'm willing to do it too. On Wed, Jan 16, 2013 at 8:24 AM, Alan Gates ga...@hortonworks.com wrote: If you think that's the best path forward that's fine. I can't call a vote I don't think, since I'm not part of the Hive PMC. But I'm happy to draft a resolution for you and then let you call the vote. Should I do that? Alan. On Jan 11, 2013, at 4:34 PM, Carl Steinbach wrote: Hi Alan, I agree that submitting this for a vote is the best option. If anyone has additional proposed modifications please make them. Otherwise I propose that the Hive PMC vote on this proposal. 
In order for the Hive PMC to be able to vote on these changes they need to be expressed in terms of one or more of the actions listed at the end of the Hive project bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws So I think we first need to amend the bylaws in order to define the rights and privileges of a submodule committer, and then separately vote the HCatalog committers in as Hive submodule committers. Does this make sense? Thanks. Carl
Re: [VOTE] Amend Hive Bylaws + Add HCatalog Submodule
Most excellent. I'll start the vote in the HCatalog PPMC to approve this, and assuming that passes I'll then start a vote in the IPMC per the guidelines at http://incubator.apache.org/guides/graduation.html#subproject Alan. On Feb 4, 2013, at 2:27 PM, Carl Steinbach wrote: The following active Hive PMC members have cast votes: Carl Steinbach: +1, +1 Ashutosh Chauhan: +1, +1 Edward Capriolo: +1, +1 Ashish Thusoo: +1, +1 Yongqiang He: +1, +1 Namit Jain: +1, +1 Three active PMC members have abstained from voting. Over the last week the following four Hive PMC members requested that their status be changed from active to emeritus member: jvs, prasadc, zhao, pauly. Voting on these measures is now closed. Both measures have been approved with the required 2/3 majority of active Hive PMC members. Thanks. Carl On Thu, Jan 31, 2013 at 2:04 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: +1 and +1 non-binding. Great to see this happen! Thanks, +Vinod On Thu, Jan 31, 2013 at 12:14 AM, Namit Jain nj...@fb.com wrote: +1 and +1 On 1/30/13 6:53 AM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: +1 and +1 Thanks, Gunther. On Tue, Jan 29, 2013 at 5:18 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Measure 1: +1 Measure 2: +1 On Mon, Jan 28, 2013 at 2:47 PM, Carl Steinbach c...@apache.org wrote: I am calling a vote on the following two measures.
Measure 1: Amend Hive Bylaws to Define Submodules and Submodule Committers If this measure passes the Apache Hive Project Bylaws will be amended with the following changes: https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers The motivation for these changes is discussed in the following email thread which appeared on the hive-dev and hcatalog-dev mailing lists: http://markmail.org/thread/u5nap7ghvyo7euqa Measure 2: Create HCatalog Submodule and Adopt HCatalog Codebase This measure provides for 1) the establishment of an HCatalog submodule in the Apache Hive Project, 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog submodule, and 3) adding all currently active HCatalog committers as submodule committers on the Hive HCatalog submodule. Passage of this measure depends on the passage of Measure 1. Voting: Both measures require +1 votes from 2/3 of active Hive PMC members in order to pass. All participants in the Hive project are encouraged to vote on these measures, but only votes from active Hive PMC members are binding. The voting period commences immediately and shall last a minimum of six days. Voting is carried out by replying to this email thread. You must indicate which measure you are voting on in order for your vote to be counted. More details about the voting process can be found in the Apache Hive Project Bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws -- +Vinod Hortonworks Inc. http://hortonworks.com/
Fwd: [VOTE] Graduate HCatalog from the incubator and become part of Hive
FYI. Alan. Begin forwarded message: From: Alan Gates ga...@hortonworks.com Date: February 4, 2013 10:18:09 PM PST To: hcatalog-...@incubator.apache.org Subject: [VOTE] Graduate HCatalog from the incubator and become part of Hive The Hive PMC has voted to accept HCatalog as a submodule of Hive. You can see the vote thread at http://mail-archives.apache.org/mod_mbox/hive-dev/201301.mbox/%3cCACf6RrzktBYD0suZxn3Pfv8XkR=vgwszrzyb_2qvesuj2vh...@mail.gmail.com%3e . We now need to vote to graduate from the incubator and become a submodule of Hive. This entails the following: 1) the establishment of an HCatalog submodule in the Apache Hive Project; 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog submodule; and 3) adding all currently active HCatalog committers as submodule committers on the Hive HCatalog submodule. Definitions for all these can be found in the (now adopted) Hive bylaws at https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committer. This vote will stay open for at least 72 hours (thus 23:00 PST on 2/7/13). PPMC members' votes are binding in this vote, though input from all is welcome. If this vote passes the next step will be to submit the graduation motion to the Incubator PMC. Here's my +1. Alan.
Merging HCatalog into Hive
Alright, our vote has passed; it's time to get on with merging HCatalog into Hive. Here are the things I can think of that we need to deal with. Please add additional issues I've missed: 1) Moving the code 2) Dealing with domain names in the code 3) The mailing lists 4) The JIRA 5) The website 6) Committer rights 7) Make a proposal for how HCat is released going forward 8) Publish an FAQ Proposals for how we handle these: Below I propose an approach for how to handle each of these. Feedback welcome. 1) Moving the code I propose that HCat move into a subdirectory of Hive. This fits nicely into Hive's structure since it already has metastore, ql, etc. We'd just add 'hcatalog' as a new directory. This directory would contain hcatalog as it is today. It does not follow Hive's standard build model, so we'd need to do some work to make it so that building Hive also builds HCat, but this should be minimal. 2) Dealing with domain names HCat code currently is under org.apache.hcatalog. Do we want to change it? In time we probably should change it to match the rest of Hive (org.apache.hadoop.hive.hcatalog). We need to do this in a backward compatible way. I propose we leave it as is for now, and if we decide to in the future we can move the actual code to org.apache.hadoop.hive.hcatalog and create shell classes under org.apache.hcatalog. 3) The mailing lists Given that our goal is to merge the projects and not create a subproject, we should merge the mailing lists rather than keep hcat-specific lists. We can ask infra to remove hcatalog-*@incubator.apache.org and forward any new mail to the appropriate Hive lists. We need to find out if they can auto-subscribe people from the hcat lists to the hive lists. Given that traffic on the Hive lists is an order of magnitude higher, we should warn people before we auto-subscribe them and allow them a chance to get off. 4) JIRA We can create an hcatalog component in Hive's JIRA. All new HCat issues could be filed there.
I don't know if there's a way to upload existing JIRAs into Hive's JIRA, but I think it would be better to leave them where they are. We should see if infra can turn off the ability to create new JIRAs in hcatalog. 5) Website We will need to integrate HCatalog's website with Hive's. This should be easy except for the documentation. HCat uses forrest for docs, Hive uses wiki. We will need to put links under 'Documentation' for older versions of HCat docs so users can find them. As far as how docs are handled for the next version of HCatalog, I think that depends on the answer to question 7 (next release of HCat), but I propose that HCat needs to conform to the way Hive does docs on wiki. Though I would strongly encourage the HCat docs to be version specific (that is, have a set of wiki pages for each version). incubator.apache.org/hcatalog should be changed to forward to hive.apache.org. 6) Committer rights Carl will need to set up committer rights for all the new HCat committers. Based on our discussion of making active HCat committers Hive submodule committers this would add the following set: Alan, Sushanth, Francis, Daniel, Vandana, Travis, and Mithun. Ashutosh and Paul are already Hive committers, and neither Devaraj nor Mac have been active in HCat in over a year. 7) Future releases We need to discuss how future releases will happen, as I think this will help developers and users know how to respond to the merge. I propose that HCat will simply become part of future Hive releases. Thus Hive 0.11 (or whatever the next major release is) will include HCatalog. If there are issues found we may need to make HCatalog 0.5.x releases from Hive, which should be fine. But I propose there would not be an HCat 0.6. To be clear I am not proposing that HCat functionality would be subsumed into Hive jars. Just that the existing hcat jars would become part of Hive's release. 
8) Communicate all of this We should put up an FAQ page that has this information, as well as tracks our progress while we work on getting these things done. Alan.
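The "shell classes" approach in item 2 of the proposal above is a plain delegation pattern: a class kept under the old package name that extends (and adds nothing to) its relocated counterpart. A sketch in Python for brevity; in Hive this would of course be Java classes under org.apache.hcatalog extending their org.apache.hive.hcatalog counterparts, and the class and method names here are hypothetical:

```python
import warnings

# Stand-in for the relocated implementation (think org.apache.hive.hcatalog).
class HCatLoader:
    def load(self, table):
        return f"loading {table}"

# Shell class under the old name (think org.apache.hcatalog): it adds no
# behavior of its own, it just keeps existing client code working while
# nudging users toward the new package.
class LegacyHCatLoader(HCatLoader):
    def __init__(self):
        warnings.warn("org.apache.hcatalog is deprecated; use "
                      "org.apache.hive.hcatalog instead", DeprecationWarning)
        super().__init__()

client = LegacyHCatLoader()   # an old call site, unchanged
print(client.load("web_logs"))
```

The key property for backward compatibility is that the shell is a subtype of the real class, so old code that instantiates or type-checks against the old name keeps working against the moved implementation.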
Re: Merging HCatalog into Hive
On Feb 24, 2013, at 12:22 PM, Brock Noland wrote: Looks good from my perspective and I'm glad to see this moving forward. Regarding #4 (JIRA): I don't know if there's a way to upload existing JIRAs into Hive's JIRA, but I think it would be better to leave them where they are. JIRA has a bulk move feature, but I am curious as to why we would leave them under the old project? There might be good reason to orphan them, but my first thought is that it would be nice to have them under the HIVE project simply for search purposes. I was thinking it would be hard for people who had bookmarks or pointers to the existing JIRAs. Also, since it would change all the JIRA numbers on closed JIRAs, it would make records from previous releases a mess. But I see what you're saying about making search hard. Maybe there's a way to leave the historical info where it is while importing any active JIRAs into Hive so people can search them. Alan. Brock
Re: Merging HCatalog into Hive
Alright, I've gotten some feedback from Brock around the JIRA stuff, and Carl in a live conversation expressed his desire to move hcat into the Hive namespace sooner rather than later. So the proposal is that we'd move the code to org.apache.hive.hcatalog, though we would create shell classes and interfaces in org.apache.hcatalog for all public classes and interfaces so that it will be backward compatible. I'm fine with doing this now. So, let's get started. Carl, could you create an hcatalog directory under trunk/hive and grant the listed hcat committers karma on it? Then I'll get started on moving the actual code. Alan. On Feb 24, 2013, at 12:22 PM, Brock Noland wrote: Looks good from my perspective and I'm glad to see this moving forward. Regarding #4 (JIRA): I don't know if there's a way to upload existing JIRAs into Hive's JIRA, but I think it would be better to leave them where they are. JIRA has a bulk move feature, but I am curious as to why we would leave them under the old project? There might be a good reason to orphan them, but my first thought is that it would be nice to have them under the HIVE project simply for search purposes. Brock On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates ga...@hortonworks.com wrote: Alright, our vote has passed, it's time to get on with merging HCatalog into Hive. Here are the things I can think of that we need to deal with. Please add additional issues I've missed:
1) Moving the code
2) Dealing with domain names in the code
3) The mailing lists
4) The JIRA
5) The website
6) Committer rights
7) Make a proposal for how HCat is released going forward
8) Publish an FAQ
Below I propose an approach for how to handle each of these. Feedback welcome.
1) Moving the code: I propose that HCat move into a subdirectory of Hive. This fits nicely into Hive's structure since it already has metastore, ql, etc. We'd just add 'hcatalog' as a new directory. This directory would contain hcatalog as it is today. It does not follow Hive's standard build model, so we'd need to do some work to make it so that building Hive also builds HCat, but this should be minimal.
2) Dealing with domain names: HCat code currently is under org.apache.hcatalog. Do we want to change it? In time we probably should change it to match the rest of Hive (org.apache.hadoop.hive.hcatalog). We need to do this in a backward compatible way. I propose we leave it as is for now; if we decide to in the future, we can move the actual code to org.apache.hadoop.hive.hcatalog and create shell classes under org.apache.hcatalog.
3) The mailing lists: Given that our goal is to merge the projects and not create a subproject, we should merge the mailing lists rather than keep hcat-specific lists. We can ask infra to remove hcatalog-*@incubator.apache.org and forward any new mail to the appropriate Hive lists. We need to find out if they can auto-subscribe people from the hcat lists to the hive lists. Given that traffic on the Hive lists is an order of magnitude higher, we should warn people before we auto-subscribe them and give them a chance to opt out.
4) JIRA: We can create an hcatalog component in Hive's JIRA. All new HCat issues could be filed there. I don't know if there's a way to upload existing JIRAs into Hive's JIRA, but I think it would be better to leave them where they are. We should see if infra can turn off the ability to create new JIRAs in hcatalog.
5) Website: We will need to integrate HCatalog's website with Hive's. This should be easy except for the documentation. HCat uses Forrest for docs; Hive uses the wiki. We will need to put links under 'Documentation' for older versions of HCat docs so users can find them. As far as how docs are handled for the next version of HCatalog, I think that depends on the answer to question 7 (next release of HCat), but I propose that HCat conform to the way Hive does docs on the wiki, though I would strongly encourage the HCat docs to be version specific (that is, have a set of wiki pages for each version). incubator.apache.org/hcatalog should be changed to forward to hive.apache.org.
6) Committer rights: Carl will need to set up committer rights for all the new HCat committers. Based on our discussion of making active HCat committers Hive submodule committers, this would add the following set: Alan, Sushanth, Francis, Daniel, Vandana, Travis, and Mithun. Ashutosh and Paul are already Hive committers, and neither Devaraj nor Mac have been active in HCat in over a year.
7) Future releases: We need to discuss how future releases will happen, as I think this will help developers and users know how to respond to the merge. I propose that HCat simply become part of future Hive releases. Thus Hive 0.11 (or whatever the next major release is) will include HCatalog. If there are issues found we may need to make
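The backward-compatible "shell classes" mentioned above can be sketched as follows (illustrative only: the class names are stand-ins for actual HCatalog classes, and real code would put them in org.apache.hcatalog and org.apache.hive.hcatalog rather than in one file):

```java
// Stand-in for a class at its new home, org.apache.hive.hcatalog.*
class NewHCatRecord {
    String get() { return "data"; }
}

// Stand-in for the shell class left behind at org.apache.hcatalog.*:
// an empty subclass, so code compiled against the old package keeps working.
@Deprecated
class OldHCatRecord extends NewHCatRecord {
}

public class ShellClassDemo {
    public static void main(String[] args) {
        // Old call sites still compile and behave identically,
        // because the old type is-a new type.
        NewHCatRecord r = new OldHCatRecord();
        System.out.println(r.get()); // prints "data"
    }
}
```

Because the shell class carries no logic of its own, all real maintenance happens in the new namespace while old binaries and sources keep resolving.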
Re: Review Request: HIVE-4145. Create hcatalog stub directory and add it to the build
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9848/#review17816 ---
Ship it!
- Alan Gates
On March 11, 2013, 4:27 a.m., Carl Steinbach wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9848/ ---
(Updated March 11, 2013, 4:27 a.m.) Review request for hive and Ashutosh Chauhan.
Description: This patch creates an hcatalog stub directory. Alan requested this. Once the patch is committed I will contact ASFINFRA and request that they grant karma on the directory to the hcatalog submodule committers. This addresses bug HIVE-4145. https://issues.apache.org/jira/browse/HIVE-4145
Diffs:
- build-common.xml e68ecea
- build.properties 2d293a6
- build.xml b5c69d3
- hcatalog/build.xml PRE-CREATION
- hcatalog/ivy.xml PRE-CREATION
- hcatalog/src/java/org/apache/hive/hcatalog/package-info.java PRE-CREATION
Diff: https://reviews.apache.org/r/9848/diff/
Thanks, Carl Steinbach
Re: Merging HCatalog into Hive
Proposed changes look good to me. And you don't need an infra ticket to grant karma. Since you're Hive VP you can do it. See http://www.apache.org/dev/pmc.html#SVNaccess Alan. On Mar 10, 2013, at 9:29 PM, Carl Steinbach wrote: Hi Alan, I submitted a patch that creates the hcatalog directory and makes some other necessary changes here: https://issues.apache.org/jira/browse/HIVE-4145 Once this is committed I will contact ASFINFRA and ask them to grant the HCatalog committers karma. Thanks. Carl
Re: subscribe to your lists
Send email to user-subscr...@hive.apache.org and dev-subscr...@hive.apache.org. Alan. On Mar 7, 2013, at 7:17 AM, Lin Picouleau wrote: Hi, I would like to subscribe to your lists to get involved with Hive project. Thank you! Lin Picouleau
Re: Merging HCatalog into Hive
Excellent, thank you Carl. I'll start on the process to move the code then. Alan. On Mar 15, 2013, at 5:26 PM, Carl Steinbach wrote: Hi Alan, I committed HIVE-4145, created an HCatalog component on JIRA, and updated the asf-authorization-template to give the HCatalog committers karma on the hcatalog subdirectory. At this point I think everything should be ready to go. Let me know if you run into any problems. Thanks. Carl
Re: Getting Started
Check out https://cwiki.apache.org/confluence/display/Hive/HowToContribute Alan. On Mar 21, 2013, at 4:05 PM, Kole Reece wrote: Hi, What would be the best way to get started getting familiar and making contributions.
Re: Merging HCatalog into Hive
There's an issue with the permissions here. In the authorization file you granted permission to hcatalog committers on a directory /hive/hcatalog. But in Hive you created /hive/trunk/hcatalog, which none of the hcatalog committers can access. In the authorization file you'll need to change hive-hcatalog to have authorization /hive/trunk/hcatalog. There is also a scalability issue. Every time Hive branches you'll have to add a line for that branch as well. Also, this will prohibit any dev branches for hcatalog users, or access to any dev branches done in Hive. I suspect you'll find it much easier to give the hive-hcatalog group access to /hive and then use community mores to enforce that no hcat committers commit outside the hcat directory. Alan.
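For readers unfamiliar with Subversion path-based authorization files, the two schemes discussed above look roughly like this (a sketch only; the group name hive-hcatalog comes from the message, but the actual contents of the asf-authorization-template are an assumption):

```
# Per-directory grant: must be repeated for every branch Hive creates,
# and blocks hcat committers from any dev branches
[/hive/trunk/hcatalog]
@hive-hcatalog = rw

# Whole-tree grant (the approach Alan suggests): one line, with the
# "commit only under hcatalog/" rule enforced by community convention
[/hive]
@hive-hcatalog = rw
```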
Re: Merging HCatalog into Hive
Cool, it works now. Thanks for the fast response. Alan. On Mar 26, 2013, at 2:58 PM, Carl Steinbach wrote: Hi Alan, I agree that it will probably be too painful to enforce the rules with SVN, so I went ahead and gave all of the HCatalog committers RW access to /hive. Please follow the rules. If I receive any complaints about this I'll revert back to the old scheme. Thanks. Carl
Re: Moving code to Hive NOW
I've moved the code. I'll be moving a lot of other code around over the next few days as I do what we discussed in https://issues.apache.org/jira/browse/HIVE-4198 so don't rebase your patches just yet. Alan. On Mar 26, 2013, at 3:14 PM, Alan Gates wrote: I am going to move the HCatalog code to Hive in the next few minutes. Please don't check anything into HCatalog until this is done. All patches will be invalidated by this move. I'll send an all clear when this is done. Alan.
Where to put hcatalog branches and site code
Right after I moved the hcat code to hive/trunk/hcatalog, Owen pointed out that the problem with this is that now everyone who checks out Hive pulls _all_ of the hcat code, including the old branches and the site code. This isn't what we want. The site code I propose we integrate with Hive's site code; I'll put up a patch for this shortly. The branches we could either move into Hive's branches directory (renaming them to hcatalog-branch-0.x) or we could create a /hive/hcatalog-historical and put them there. I'm fine with either. Thoughts? Alan.
Re: HCatalog to Hive Committership
Asking for volunteers is probably better than making assignments. We have 7 HCatalog committers (http://hive.apache.org/credits.html). If current Hive committers/PMC members could volunteer to mentor a committer we can start assigning mentors to pupils. Seem reasonable? Alan. On Apr 16, 2013, at 6:10 PM, Carl Steinbach wrote: HCatalog committers will be assigned shepherds - How do we go about getting assigned one? I recommend asking Alan about this since he wrote the proposal. Alan, who is responsible for making these assignments? Thanks. Carl
Re: Is HCatalog stand-alone going to die ?
On Apr 29, 2013, at 10:25 PM, Rodrigo Trujillo wrote: Hi, I have followed the discussion about the merging of HCatalog into Hive. However, it is not clear to me whether new stand-alone versions of HCatalog are going to be released. Is 0.5.0-incubating the last? 0.5.0 is the last planned stand-alone release of HCatalog. The next version of HCatalog will be included in Hive 0.11. Will it be possible to build only hcatalog from the Hive tree? No, building HCatalog already depends on building Hive. Alan. Regards, Rodrigo Trujillo
Review Request: HIVE-4500 HS2 holding too many file handles of hive_job_log_hive_*.txt files
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10954/ ---
Review request for hive and Carl Steinbach.
Description: HS2 holding too many file handles of hive_job_log_hive_*.txt files. This addresses bug HIVE-4500. https://issues.apache.org/jira/browse/HIVE-4500
Diffs:
- trunk/ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java 1478219
- trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1478219
- trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 1478219
- trunk/service/src/java/org/apache/hive/service/cli/operation/Operation.java 1478219
- trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 1478219
Diff: https://reviews.apache.org/r/10954/diff/
Thanks, Alan Gates
Re: [VOTE] Apache Hive 0.11.0 Release Candidate 2
+1. Downloaded and built it. Ran the HCat and webhcat system tests (which include a number of Hive tests). Everything looks good. Alan. On May 11, 2013, at 10:33 AM, Owen O'Malley wrote: Based on feedback from everyone, I have respun the release candidate, RC2. Please take a look. We've fixed the following problems with the previous RC:
* Release notes were incorrect
* HIVE-4018 - MapJoin failing with Distributed Cache error
* HIVE-4421 - Improve memory usage by ORC dictionaries
* HIVE-4500 - Ensure that HiveServer 2 closes log files
* HIVE-4494 - ORC map columns get class cast exception in some contexts
* HIVE-4498 - Fix TestBeeLineWithArgs failure
* HIVE-4505 - Hive can't load transforms with remote scripts
* HIVE-4527 - Fix the eclipse template
Source tag for RC2 is at: https://svn.apache.org/repos/asf/hive/tags/release-0.11.0rc2 Source tar ball and convenience binary artifacts can be found at: http://people.apache.org/~omalley/hive-0.11.0rc2/ This release has many goodies including HiveServer2, integrated hcatalog, windowing and analytical functions, decimal data type, better query planning, performance enhancements and various bug fixes. In total, we resolved more than 350 issues. Full list of fixed issues can be found at: http://s.apache.org/8Fr Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Owen
Re: VOTE: Remove phabricator instructions from hive-development guide (wiki), officially only support Apache's review board.
Major +1 (non-binding). Using 3rd party tools where we have no option for support or help is not good. Alan. On Oct 16, 2013, at 5:32 PM, Edward Capriolo wrote: Our wiki has instructions for posting to phabricator for code reviews. https://cwiki.apache.org/confluence/display/Hive/PhabricatorCodeReview Phabricator now requires an external facebook account to review patches, and we have no technical support contact where phabricator is hosted. It also seems like some of the phabricator features are no longer working. Apache has a review board system many people are already using. https://reviews.apache.org/account/login/?next_page=/dashboard/ This vote is to remove the phabricator instructions from the wiki. The instructions will reference review board and that will be the only system that Hive supports for patch review process. +1 is a vote for removing the phabricator instructions from the wiki. Thank you, Edward -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Maven unit test question
I was attempting to write unit tests for changes I'm making to HiveMetaStoreClient as part of the ACID transaction work (see https://issues.apache.org/jira/browse/HIVE-5843). When I added the tests and attempted to run them using mvn test -Dtest=TestHiveMetaStoreClient -Phadoop-1, it failed with: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/thrift/TUGIContainingTransport$Factory This class is contained in the hive-shims jar. The error surprised me because, according to metastore/pom.xml, hive-shims is a dependency of hive-metastore. When I ran Maven with -X to get debug information, I found that the classpath included /Users/gates/git/apache/hive/shims/assembly/target/classes. I'm guessing that rather than use the shims jar (which has been built by this time) it's trying to use the compiled classes, but failing in this case because the shims jar is constructed not by directly conglomerating a set of class files but by picking and choosing from several shim jar versions and then building a single jar. But I could not figure out how to tell Maven that it should use the already-built shims jar rather than the classes. To test my theory I took the shims jar and unpacked it in the path Maven was looking in, and sure enough my tests ran once I did that. The existing unit test TestMetastoreExpr in ql seems to have the same issue. I tried to use it as a model, but when I ran it, it failed with the same error, and unpacking the jar resolved it in the same way. Am I doing something wrong, or is there a change needed in the pom.xml to get it to look in the jar instead of the .class files for shims dependencies? Alan.
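One workaround consistent with the diagnosis above (an untested sketch relying on general Maven behavior, not anything Hive-specific): when a module is built from its own directory rather than as part of the multi-module reactor, Maven resolves sibling dependencies from the local repository instead of their target/classes directories, so installing the assembled shims jar first should let the tests see the real jar:

```
# Install all modules, including the assembled shims jar, into ~/.m2
mvn clean install -DskipTests -Phadoop-1

# Then run the single test from the module that owns it, so the
# hive-shims dependency resolves to the installed jar
cd metastore
mvn test -Dtest=TestHiveMetaStoreClient -Phadoop-1
```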
Re: Maven unit test question
There's a patch on https://issues.apache.org/jira/browse/HIVE-5843 that has the code. Unfortunately the patch is huge because my work involves changes to the thrift interface. The git repo I'm working off of last pulled from Apache on Nov 20th (commit 1de88eedc69af1e7e618fc4f5eac045f69c02973 is the last one it has) so you may need to go back a bit in your repo to get a version that the patch will apply against. The test in question is TestHiveMetaStoreClient. Also, I had to move the class from metastore to ql as it turns out instantiating the HiveMetaStoreClient needs the hive-exec jar. I couldn't add a dependency on hive-exec in the metastore package as hive-exec depends on hive-metastore. Thanks for your help. Alan. On Dec 9, 2013, at 3:53 PM, Brock Noland wrote: Can you share the change with me so I can debug? On Dec 9, 2013 5:15 PM, Alan Gates ga...@hortonworks.com wrote: I was attempting to write unit tests for changes I'm making to HiveMetaStoreClient as part of the ACID transaction work (see https://issues.apache.org/jira/browse/HIVE-5843). When I added the tests and attempted to run them using mvn tests -Dtest=TestHiveMetaStoreClient -Phadoop-1 it failed with: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/thrift/TUGIContainingTransport$Factory This class is contained in the hive-shims jar. The error surprised me because according to metastore/pom.xml, hive-shims is a dependency of hive-metastore. When I ran maven with -X to get debug information, I found that in the classpath it was including /Users/gates/git/apache/hive/shims/assembly/target/classes. I'm guessing that rather than use the shims jar (which has been built by this time) it's trying to use the compiled classes, but failing in this case because the shims jar is actually constructed not by directly conglomerating a set of class files but by picking and choosing from several shim jar versions and then constructing a single jar. 
But I could not figure out how to communicate to maven that it should use the already built shims jar rather than the classes. To test my theory I took the shims jar and unpacked it in the path maven was looking in, and sure enough my tests ran once I did that. The existing unit test TestMetastoreExpr in ql seems to have the same issue. I tried to use it as a model, but when I ran it, it failed with the same error, and unpacking the jar resolved it in the same way. Am I doing something wrong, or is there a change needed in the pom.xml to get it to look in the jar instead of the .class files for shims dependencies? Alan.
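For context, the dependency the email describes would look roughly like this in metastore/pom.xml (a sketch with assumed coordinates and version property, not copied from the repo); the reported problem is that the in-reactor build resolved it to shims/assembly/target/classes rather than the assembled jar:

```xml
<!-- Sketch (assumed coordinates): hive-shims declared as a dependency
     of hive-metastore. Despite this declaration, the reactor build put
     shims/assembly/target/classes on the test classpath instead of the
     picked-and-chosen assembly jar, causing the NoClassDefFoundError. -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-shims</artifactId>
  <version>${project.version}</version>
</dependency>
```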
Re: adding ANSI flag for hive
A couple of thoughts on this: 1) If we did this I think we should have one flag, not many. As Thejas points out, your test matrix goes insane when you have too many flags and hence things don't get properly tested. 2) We could do this in an incremental way, where we create this new ANSI flag and are clear with users that for a while this will be evolving. That is, as we find new issues with data types, semantics, whatever, we will continue to change the behavior of this flag. At some point in the future (as Thejas suggests, at a 1.0 release) we could make this the default behavior. This avoids having to do a full sweep now to find everything that we want to change and make ANSI compliant, and living with whatever we miss. Alan. On Dec 11, 2013, at 5:14 PM, Thejas Nair wrote: Having too many configs complicates things for the user, and also complicates the code, and you also end up having many untested combinations of config flags. I think we should identify a bunch of incompatible changes that we think are important, fix them in a branch, and make a major version release (say 1.x). This is also related to HIVE-5875, where there is a discussion on switching the defaults for some of the configs to more desirable, but non-backward-compatible, values. On Wed, Dec 11, 2013 at 4:33 PM, Sergey Shelukhin ser...@hortonworks.com wrote: Hi. There's recently been some discussion about data type changes in Hive (double to decimal), and result changes for special cases like division by zero, etc., to bring it into compliance with MySQL (that's what the JIRAs use as an example; I am assuming ANSI SQL is meant). The latter are non-controversial (I guess), but for the former, performance may suffer and/or backward compat may be broken if Hive is brought into compliance. If fuller ANSI compat is sought in the future, there may be some even hairier issues, such as double-quoted identifiers.
In light of that, and also following MySQL, I wonder if we should add a flag, or set of flags, to Hive to be able to force ANSI compliance. When this flag (or these flags) is not set, for example, int/int division could return double for backward compat/perf, and vectorization could skip the special-case handling for division by zero, etc. Wdyt?
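The int/int division example in the thread can be sketched outside Hive. This is an illustrative Python model of the two semantics under discussion, not actual Hive behavior; the "ansi" flag and the NULL-on-divide-by-zero handling are assumptions for illustration only:

```python
# Hedged sketch: models the two int/int division semantics discussed in
# the thread. The "ansi" flag and NULL-on-divide-by-zero handling are
# illustrative assumptions, not real Hive configs or guaranteed behavior.
def divide(a: int, b: int, ansi: bool = False):
    if b == 0:
        return None              # SQL-style: x / 0 yields NULL
    if ansi:
        return a // b            # ANSI-style: int / int stays an int
    return a / b                 # backward-compatible: result is a double

print(divide(7, 2))              # 3.5
print(divide(7, 2, ansi=True))   # 3
print(divide(7, 0))              # None
```

A single flag (option 1 in Alan's reply) would correspond to one boolean like this controlling all such behaviors at once, rather than one knob per behavior.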
Re: Mail bounces from ebuddy.com
Anyone who is an admin on the list (I don't know who the admins are) can do this by doing user-unsubscribe-USERNAME=ebuddy@hive.apache.org where USERNAME is the name of the bouncing user (see http://untroubled.org/ezmlm/ezman/ezman1.html ) Alan. Thejas Nair mailto:the...@hortonworks.com August 17, 2014 at 17:02 I don't know how to do this. Carl, Ashutosh, Do you guys know how to remove these two invalid emails from the mailing list ? Lars Francke mailto:lars.fran...@gmail.com August 17, 2014 at 15:41 Hmm great, I see others mentioning this as well. I'm happy to contact INFRA but I'm not sure if they are even needed or if someone from the Hive team can do this? On Fri, Aug 8, 2014 at 3:43 AM, Lefty Leverenz leftylever...@gmail.com Lefty Leverenz mailto:leftylever...@gmail.com August 7, 2014 at 18:43 (Excuse the spam.) Actually I'm getting two bounces per message, but gmail concatenates them so I didn't notice the second one. -- Lefty On Thu, Aug 7, 2014 at 9:36 PM, Lefty Leverenz leftylever...@gmail.com Lefty Leverenz mailto:leftylever...@gmail.com August 7, 2014 at 18:36 Curious, I've only been getting one bounce per message. Anyway thanks for bringing this up. -- Lefty Lars Francke mailto:lars.fran...@gmail.com August 7, 2014 at 4:38 Hi, every time I send a mail to dev@ I get two bounce mails from two people at ebuddy.com. I don't want to post the E-Mail addresses publicly but I can send them on if needed (and it can be triggered easily by just replying to this mail I guess). Could we maybe remove them from the list? Cheers, Lars
Re: Timeline for release of Hive 0.14
+1, Eugene and I are working on getting HIVE-5317 (insert, update, delete) done and would like to get it in. Alan. Nick Dimiduk mailto:ndimi...@gmail.com August 20, 2014 at 12:27 It'd be great to get HIVE-4765 included in 0.14. The proposed changes are a big improvement for us HBase folks. Would someone mind having a look in that direction? Thanks, Nick Thejas Nair mailto:the...@hortonworks.com August 19, 2014 at 15:20 +1 Sounds good to me. It's already almost 4 months since the last release. It is time to start preparing for the next one. Thanks for volunteering! Vikram Dixit mailto:vik...@hortonworks.com August 19, 2014 at 14:02 Hi Folks, I was thinking that it was about time that we had a release of hive 0.14 given our commitment to having a release of hive on a periodic basis. We could cut a branch and start working on a release in say 2 weeks time, around September 5th (Friday). After branching, we can focus on stabilizing for the release and hopefully have an RC in about 2 weeks post that. I would like to volunteer myself for the duties of the release manager for this version if the community agrees. Thanks Vikram.
Re: Review Request 25245: Support dynamic service discovery for HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25245/#review52171 --- service/src/java/org/apache/hive/service/server/HiveServer2.java https://reviews.apache.org/r/25245/#comment90921 It seems like we want more than warn here if we fail to create the parent node. In this case we'll be unable to create the node for this instance, and clients will be unable to find the server. I would think this should be fatal. service/src/java/org/apache/hive/service/server/HiveServer2.java https://reviews.apache.org/r/25245/#comment90922 Agree we should have a clean shutdown case. The timeout was 3 minutes I think, which means it will be a while after the system shuts down that clients keep trying to contact it. - Alan Gates On Sept. 2, 2014, 10:05 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25245/ --- (Updated Sept. 2, 2014, 10:05 a.m.) Review request for hive, Alan Gates, Navis Ryu, Szehon Ho, and Thejas Nair. 
Bugs: HIVE-7935 https://issues.apache.org/jira/browse/HIVE-7935 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-7935 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java cbcfec7 jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java 46044d0 ql/src/java/org/apache/hadoop/hive/ql/util/ZooKeeperHiveHelper.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/lockmgr/zookeeper/TestZookeeperLockManager.java 59294b1 service/src/java/org/apache/hive/service/cli/CLIService.java 08ed2e7 service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java bc0a02c service/src/java/org/apache/hive/service/cli/session/SessionManager.java d573592 service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 37b05fc service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 027931e service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java c380b69 service/src/java/org/apache/hive/service/server/HiveServer2.java 0864dfb service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java 66fc1fc Diff: https://reviews.apache.org/r/25245/diff/ Testing --- Manual testing + test cases. Thanks, Vaibhav Gumashta
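The first review comment argues that failing to create the parent ZooKeeper node should be fatal rather than a warning. A minimal sketch of that fail-fast pattern, with ZooKeeper stubbed out as a plain callable (none of these names are from the actual patch):

```python
# Hedged sketch of the review point above: if creating the parent znode
# fails, server registration should abort rather than merely log a
# warning, because clients could never discover this instance afterward.
# ZooKeeper is stubbed as a callable; all names here are hypothetical.
class FatalStartupError(RuntimeError):
    pass

def register_server(zk_create, parent_path, instance_name):
    # zk_create stands in for a ZooKeeper create() call that returns
    # True on success and False on failure.
    if not zk_create(parent_path):
        raise FatalStartupError("could not create " + parent_path)
    return parent_path + "/" + instance_name

print(register_server(lambda path: True, "/hiveserver2", "server-1"))
# -> /hiveserver2/server-1
```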
Review Request 25341: HIVE-7078 Need file sink operators that work with ACID
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25341/ --- Review request for hive and Prasanth_J. Bugs: HIVE-7078 https://issues.apache.org/jira/browse/HIVE-7078 Repository: hive-git Description --- Changes FileSinkOperator to use RecordUpdater in cases where an ACID write is being done. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java d4e61d8 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java f584926 ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4 ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 301dde5 ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java PRE-CREATION Diff: https://reviews.apache.org/r/25341/diff/ Testing --- Added a new unit test TestFileSinkOperator that tests writing of standard (non-ACID) data via RecordWriter and acid data via RecordUpdater, in both partitioned and non-partitioned cases. Thanks, Alan Gates
Review Request 25343: HIVE-7899 txnMgr should be session specific
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25343/ --- Review request for hive and Ashutosh Chauhan. Bugs: HIVE-7899 https://issues.apache.org/jira/browse/HIVE-7899 Repository: hive-git Description --- This patch moves the TxnManager instance from Driver to SessionState, since multiple queries can share a single session, and it is convenient in other parts of the code to be able to get to the transaction manager. It also stores the current transaction id and whether we are in autocommit in SessionState. Diffs - ql/src/java/org/apache/hadoop/hive/ql/Driver.java 0533ae8 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java df66f83 Diff: https://reviews.apache.org/r/25343/diff/ Testing --- Ran existing transaction manager tests. Thanks, Alan Gates
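The design change described above, moving the transaction manager from the driver to the session so that all queries in one session share a single instance, can be sketched as a session-scoped lazy singleton. Class and method names below are illustrative, not the actual Hive classes:

```python
# Hedged sketch of the HIVE-7899 design: the transaction manager hangs
# off the session rather than the driver, so every query in a session
# sees the same instance, along with the current txn id and autocommit
# state. Names here are illustrative, not from the patch.
class TxnManager:
    def __init__(self):
        self.current_txn_id = None
        self.autocommit = True

class SessionState:
    def __init__(self):
        self._txn_mgr = None

    def get_txn_mgr(self):
        if self._txn_mgr is None:        # created lazily, once per session
            self._txn_mgr = TxnManager()
        return self._txn_mgr

session = SessionState()
# Two "queries" in the same session share one manager instance.
assert session.get_txn_mgr() is session.get_txn_mgr()
```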
Re: Review Request 25341: HIVE-7078 Need file sink operators that work with ACID
On Sept. 5, 2014, 8:21 a.m., Prasanth_J wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java, line 211 https://reviews.apache.org/r/25341/diff/1/?file=676838#file676838line211 I don't see needToRename being used elsewhere. So can you replace this chunk with if (fs.exists(outPaths[idx]) && !fs.rename(outPaths[idx], finalPaths[idx])) {..} ? But that would change the behavior in the standard case. This way the behavior is only changed in the update and delete case. I didn't want to add an extra stat for every type of write. On Sept. 5, 2014, 8:21 a.m., Prasanth_J wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java, line 556 https://reviews.apache.org/r/25341/diff/1/?file=676838#file676838line556 Can you add a comment about what is happening here? Are you just stripping off the _attemptId from taskId_attemptId? If so can you use Utilities.getTaskIdFromFilename() instead? Switched to Utilities.getTaskIdFromFilename() as requested. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25341/#review52339 --- On Sept. 4, 2014, 3:49 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25341/ --- (Updated Sept. 4, 2014, 3:49 p.m.) Review request for hive and Prasanth_J. Bugs: HIVE-7078 https://issues.apache.org/jira/browse/HIVE-7078 Repository: hive-git Description --- Changes FileSinkOperator to use RecordUpdater in cases where an ACID write is being done.
Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java d4e61d8 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java f584926 ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4 ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 301dde5 ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java PRE-CREATION Diff: https://reviews.apache.org/r/25341/diff/ Testing --- Added a new unit test TestFileSinkOperator that tests writing of standard (non-ACID) data via RecordWriter and acid data via RecordUpdater, in both partitioned and non-partitioned cases. Thanks, Alan Gates
Re: Review Request 25341: HIVE-7078 Need file sink operators that work with ACID
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25341/ --- (Updated Sept. 5, 2014, 4:36 p.m.) Review request for hive and Prasanth_J. Bugs: HIVE-7078 https://issues.apache.org/jira/browse/HIVE-7078 Repository: hive-git Description --- Changes FileSinkOperator to use RecordUpdater in cases where an ACID write is being done. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java d4e61d8 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java f584926 ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4 ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 301dde5 ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java PRE-CREATION Diff: https://reviews.apache.org/r/25341/diff/ Testing --- Added a new unit test TestFileSinkOperator that tests writing of standard (non-ACID) data via RecordWriter and acid data via RecordUpdater, in both partitioned and non-partitioned cases. Thanks, Alan Gates
Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
PRE-CREATION ql/src/test/queries/clientpositive/update_where_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/update_where_partitioned.q PRE-CREATION ql/src/test/results/clientnegative/delete_not_acid.q.out PRE-CREATION ql/src/test/results/clientnegative/update_not_acid.q.out PRE-CREATION ql/src/test/results/clientnegative/update_partition_col.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_whole_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_update_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_dynamic_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_where_no_match.q.out PRE-CREATION 
ql/src/test/results/clientpositive/tez/delete_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_whole_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_update_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_dynamic_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_after_multiple_inserts.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_types.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_two_cols.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_after_multiple_inserts.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_types.q.out PRE-CREATION 
ql/src/test/results/clientpositive/update_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/update_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/update_two_cols.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_partitioned.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25414/diff/ Testing --- Many tests included in the patch, including insert/values, update, and delete all tested against: non-partitioned tables, partitioned tables, and temp tables. Thanks, Alan Gates
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
/clientpositive/update_where_no_match.q PRE-CREATION ql/src/test/queries/clientpositive/update_where_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/update_where_partitioned.q PRE-CREATION ql/src/test/results/clientnegative/delete_not_acid.q.out PRE-CREATION ql/src/test/results/clientnegative/update_not_acid.q.out PRE-CREATION ql/src/test/results/clientnegative/update_partition_col.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/delete_whole_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_update_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_dynamic_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/insert_values_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_where_no_match.q.out PRE-CREATION 
ql/src/test/results/clientpositive/tez/delete_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/delete_whole_partition.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_update_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_dynamic_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/insert_values_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_after_multiple_inserts.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_all_types.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_two_cols.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/update_where_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_after_multiple_inserts.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_all_types.q.out PRE-CREATION 
ql/src/test/results/clientpositive/update_orig_table.q.out PRE-CREATION ql/src/test/results/clientpositive/update_tmp_table.q.out PRE-CREATION ql/src/test/results/clientpositive/update_two_cols.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_no_match.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_partitioned.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25414/diff/ Testing --- Many tests included in the patch, including insert/values, update, and delete all tested against: non-partitioned tables, partitioned tables, and temp tables. Thanks, Alan Gates
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: I obviously don't have context here but I do have a few items which I think should be addressed. Thx! Thanks for the review. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 305 https://reviews.apache.org/r/25414/diff/1/?file=682007#file682007line305 I think this should be HIVE_IN_TEZ_TEST :), will fix On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/Context.java, line 105 https://reviews.apache.org/r/25414/diff/1/?file=682012#file682012line105 there is a setter/getter for this field so I think it can be private. Ok. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java, line 275 https://reviews.apache.org/r/25414/diff/1/?file=682016#file682016line275 assert is almost never enabled. Should we use preconditions? I put this here as a way to test while I was developing, and left it because it helped make clear to later maintainers what I was expecting. I avoided doing an explicit instanceof check for performance. If you think it's important I can put it in there without the assert and then throw an exception. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java, line 52 https://reviews.apache.org/r/25414/diff/1/?file=682019#file682019line52 constants should be all caps. If we fix this one can we fix bucketFileFilter as well. I'm fine to change it, except that all of the other filters in the file aren't, so I was matching existing style. We might want to file a separate JIRA to fix them all, which should be a quick patch. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2404 https://reviews.apache.org/r/25414/diff/1/?file=682024#file682024line2404 seems like we might want to log the exception here Agreed, will fix. On Sept.
6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2417 https://reviews.apache.org/r/25414/diff/1/?file=682024#file682024line2417 seems like we might want to log the exception here Agreed, will fix. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2436 https://reviews.apache.org/r/25414/diff/1/?file=682024#file682024line2436 hmm, why not just log as INFO or DEBUG? Will add an INFO message. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2449 https://reviews.apache.org/r/25414/diff/1/?file=682024#file682024line2449 no need for stringifyException here, you can pass e as a second arg Will fix. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java, line 99 https://reviews.apache.org/r/25414/diff/1/?file=682028#file682028line99 looks like this can be final Sure, but why? On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 747 https://reviews.apache.org/r/25414/diff/1/?file=682029#file682029line747 why not log the exception as well Do you mean the exception stack or the exception name? The exception message is getting logged, since it's in errMsg. I'm throwing a SemanticException that includes the caught exception, so I'm assuming the stack will be printed when that is dumped. On Sept. 6, 2014, 5:43 p.m., Brock Noland wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java, line 333 https://reviews.apache.org/r/25414/diff/1/?file=682031#file682031line333 I think we should change this to IllegalStateException The danger of using comedy in your error messages is that you'll forget to go back and put something useful in. Will fix. - Alan --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/25414/#review52542 --- On Sept. 6, 2014, 4:32 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 6, 2014, 4:32 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work
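The assert-versus-precondition tradeoff debated above can be sketched as follows. This is an illustrative sketch only — `RowCheck`, `Row`, and `AcidRow` are made-up names, not the actual ReduceSinkOperator code. The point is that a plain `assert` is skipped unless the JVM runs with `-ea`, so it documents intent at zero production cost but enforces nothing, while an explicit check always runs at the cost of one `instanceof` test per call (the performance concern raised in the thread).

```java
// Hypothetical stand-ins for the real Hive types under discussion.
public class RowCheck {
  interface Row {}
  static class AcidRow implements Row {}

  // Variant 1: assert. Compiled in but skipped unless the JVM starts with
  // -ea, so in production a bad input only surfaces later, as a plain
  // ClassCastException with no explanatory message.
  static AcidRow castWithAssert(Row r) {
    assert r instanceof AcidRow : "expected AcidRow, got " + r.getClass();
    return (AcidRow) r;
  }

  // Variant 2: explicit check. Always enforced, at the cost of one
  // instanceof test per call, and fails with a descriptive exception.
  static AcidRow castWithCheck(Row r) {
    if (!(r instanceof AcidRow)) {
      throw new IllegalStateException("expected AcidRow, got " + r.getClass());
    }
    return (AcidRow) r;
  }

  public static void main(String[] args) {
    // prints "AcidRow"
    System.out.println(castWithCheck(new AcidRow()).getClass().getSimpleName());
  }
}
```

A middle ground, mentioned in the reply, is to keep the explicit check only on paths where the cast can actually fail and let the JIT absorb the cost elsewhere.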
Re: Timeline for release of Hive 0.14
I'll review that. I just need the time to test it against mysql, oracle, and hopefully sqlserver. But I think we can do this post branch if we need to, as it's a bug fix rather than a feature. Alan. Damien Carol mailto:dca...@blitzbs.com September 8, 2014 at 3:19 Same request for https://issues.apache.org/jira/browse/HIVE-7689 I already provided a patch, re-based it many times and I'm waiting for a review. Regards, On 08/09/2014 12:08, amareshwarisr . wrote: amareshwarisr . mailto:amareshw...@gmail.com September 8, 2014 at 3:08 Would like to include https://issues.apache.org/jira/browse/HIVE-2390 and https://issues.apache.org/jira/browse/HIVE-7936. I can review and merge them. Thanks Amareshwari Vikram Dixit mailto:vik...@hortonworks.com September 5, 2014 at 17:53 Hi Folks, I am going to start consolidating the items mentioned in this list and create a wiki page to track it. I will wait till the end of next week to create the branch taking into account Ashutosh's request. Thanks Vikram. On Fri, Sep 5, 2014 at 5:39 PM, Ashutosh Chauhan hashut...@apache.org Ashutosh Chauhan mailto:hashut...@apache.org September 5, 2014 at 17:39 Vikram, Some of us are working on stabilizing the cbo branch and trying to get it merged into trunk. We feel we are close. May I request to defer cutting the branch for a few more days? Folks interested in this can track our progress here: https://issues.apache.org/jira/browse/HIVE-7946 Thanks, Ashutosh On Fri, Aug 22, 2014 at 4:09 PM, Lars Francke lars.fran...@gmail.com Lars Francke mailto:lars.fran...@gmail.com August 22, 2014 at 16:09 Thank you for volunteering to do the release. I think a 0.14 release is a good idea. I have a couple of issues I'd like to get in too: * Either HIVE-7107[0] (Fix an issue in the HiveServer1 JDBC driver) or HIVE-6977[1] (Delete HiveServer1). 
The former needs a review, the latter a patch. * HIVE-6123[2] Checkstyle in Maven needs a review. HIVE-7622[3] and HIVE-7543[4] are waiting for any reviews or comments on my previous thread[5]. I'd still appreciate any helpers for reviews or even just comments. I'd feel very sad if I had done all that work for nothing. Hoping this thread gives me a wider audience. Both patches fix up issues that should have been caught in earlier reviews, as they are almost all Checkstyle or other style violations, but they make for huge patches. I could also create hundreds of small issues or stop doing these things entirely. [0] https://issues.apache.org/jira/browse/HIVE-7107 [1] https://issues.apache.org/jira/browse/HIVE-6977 [2] https://issues.apache.org/jira/browse/HIVE-6123 [3] https://issues.apache.org/jira/browse/HIVE-7622 [4] https://issues.apache.org/jira/browse/HIVE-7543 On Fri, Aug 22, 2014 at 11:01 PM, John Pullokkaran -- Sent with Postbox http://www.getpostbox.com -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
/update_where_non_partitioned.q.out PRE-CREATION ql/src/test/results/clientpositive/update_where_partitioned.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25414/diff/ Testing --- Many tests included in the patch, including insert/values, update, and delete all tested against: non-partitioned tables, partitioned tables, and temp tables. Thanks, Alan Gates
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
/hive/ql/session/SessionState.java, line 1288 https://reviews.apache.org/r/25414/diff/1/?file=682035#file682035line1288 Could a Session be shared across threads? Answered by Thejas' comments. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/#review52655 --- On Sept. 8, 2014, 11:37 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 8, 2014, 11:37 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work. The patch is large, but about 2/3 of that are tests. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 data/conf/tez/hive-site.xml 0b3877c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 1a84024 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 9807497 itests/src/test/resources/testconfiguration.properties 99049ca metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java f1697bb ql/src/java/org/apache/hadoop/hive/ql/Context.java 7fcbe3c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9953919 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 4246d68 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java f018ca0 ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java e3bc3b1 ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 7f1d71b ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java b1c4441 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 913d3ac ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 264052f 
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 8354ad9 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 32d2f7a ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2b1a345 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 4acafba ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java 96a5d78 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 5c711cf ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 5195748 ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 911ac8a ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 97fa52c ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 026efe8 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 2dbf1c8 ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6dce30c ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 5695f35 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 47fe508 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 789c780 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 63ecb8d ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java PRE-CREATION ql/src/test/queries/clientnegative/acid_overwrite.q PRE-CREATION ql/src/test/queries/clientnegative/delete_not_acid.q PRE-CREATION ql/src/test/queries/clientnegative/update_not_acid.q PRE-CREATION ql/src/test/queries/clientnegative/update_partition_col.q PRE-CREATION ql/src/test/queries/clientpositive/delete_all_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_all_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_orig_table.q PRE-CREATION ql/src/test/queries/clientpositive/delete_tmp_table.q PRE-CREATION ql/src/test/queries/clientpositive/delete_where_no_match.q PRE-CREATION 
ql/src/test/queries/clientpositive/delete_where_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_where_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_whole_partition.q PRE-CREATION ql/src/test/queries/clientpositive/insert_orig_table.q PRE-CREATION ql/src/test/queries/clientpositive/insert_update_delete.q PRE-CREATION ql/src/test/queries/clientpositive/insert_values_dynamic_partitioned.q PRE-CREATION ql/src/test/queries
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 305 https://reviews.apache.org/r/25414/diff/1/?file=682007#file682007line305 Pass another boolean param true to exclude it from the auto generated hive-default.xml.template . See HIVE_IN_TEST constructor use. Done. On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 930 https://reviews.apache.org/r/25414/diff/2/?file=682947#file682947line930 This error message is assuming that only getUser will throw IOException, but there might be other code added in future that might result in IOException being thrown, and it will be easy to forget to update this error message. How about creating a separate try catch block for conf.getUser ? try/catch moved to just surround conf.getUser On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2420 https://reviews.apache.org/r/25414/diff/2/?file=682958#file682958line2420 The logging here does not look necessary as an exception is being thrown. But if we do log, I think it's better to also log the exception here. LOG.error(msg,e); logging removed. On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2439 https://reviews.apache.org/r/25414/diff/2/?file=682958#file682958line2439 logging the exception would be useful, in case it fails for a reason other than dir already exists. Exception added to log message. On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 2454 https://reviews.apache.org/r/25414/diff/2/?file=682958#file682958line2454 if we are logging, let's log the exception as well. (I don't see any added value of this log, as top level exception log would show the whole stack trace) Logging removed. On Sept. 
10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 686 https://reviews.apache.org/r/25414/diff/2/?file=682964#file682964line686 Use SemanticException instead of RTE here? done On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 724 https://reviews.apache.org/r/25414/diff/2/?file=682964#file682964line724 It's safer to use {} for if-else. Remember the Apple security issue? https://www.imperialviolet.org/2014/02/22/applebug.html Which is why I never put it on a separate line. On Sept. 10, 2014, 12:11 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 735 https://reviews.apache.org/r/25414/diff/2/?file=682964#file682964line735 fillDefaultStorageFormat uses the value of ConfVars.HIVEDEFAULTFILEFORMAT. If it is set to something other than TextFile, this will break. Also, that function does not set serde at present. Good points. Is there a different function you suggest or should I just do the same work manually? - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/#review52559 --- On Sept. 8, 2014, 11:37 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 8, 2014, 11:37 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work. The patch is large, but about 2/3 of that are tests. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 data/conf/tez/hive-site.xml 0b3877c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 1a84024 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 9807497 itests/src/test/resources/testconfiguration.properties 99049ca metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java f1697bb ql/src/java/org/apache/hadoop/hive/ql/Context.java 7fcbe3c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9953919 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 4246d68 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 ql/src/java/org/apache/hadoop/hive/ql/exec
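The brace advice in the thread above refers to Apple's 2014 "goto fail" bug, where an unbraced conditional plus one accidentally duplicated line silently disabled a security check. A minimal Java rendition of that failure class (illustrative only, not Hive code):

```java
// Indentation suggests both assignments are guarded by the if, but without
// braces only the first statement is -- the duplicate runs for every input.
public class BraceDemo {
  static boolean looksValidBuggy(String s) {
    boolean ok = true;
    if (s == null)
      ok = false;
      ok = false;   // accidentally duplicated line: executes unconditionally
    return ok;      // always false, even for valid input
  }

  // With braces, an accidental duplicate stays inside the guarded block
  // and the method still behaves as intended for non-null input.
  static boolean looksValidBraced(String s) {
    boolean ok = true;
    if (s == null) {
      ok = false;
    }
    return ok;
  }

  public static void main(String[] args) {
    System.out.println(looksValidBuggy("x"));   // prints false -- the bug
    System.out.println(looksValidBraced("x"));  // prints true
  }
}
```

Putting the guarded statement on the same line as the `if`, as Alan describes doing, avoids the misleading indentation but loses the mechanical safety that braces give against a later edit.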
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
On Sept. 10, 2014, 2:52 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java, line 74 https://reviews.apache.org/r/25414/diff/2/?file=682966#file682966line74 throw SemanticException here ? To me that's a RuntimeException. We should never get to that point, it indicates an internal error. I would think semantic exceptions are for user errors that make it past the parser. On Sept. 10, 2014, 2:52 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java, line 226 https://reviews.apache.org/r/25414/diff/2/?file=682966#file682966line226 I thought row__ids are stored in ascending order. Why is the sort in descending order ? You're correct that row_ids are stored in ascending order. For reasons I didn't investigate the results come out the opposite of whatever is requested here. The issue isn't in RecordIdentifier because requesting ascending order in SortedDynPartitioner in the optimizer produces the correct results. On Sept. 10, 2014, 2:52 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java, line 260 https://reviews.apache.org/r/25414/diff/2/?file=682966#file682966line260 add this here ? // - TOK_SORTBY done. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/#review52810 --- On Sept. 8, 2014, 11:37 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 8, 2014, 11:37 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work. The patch is large, but about 2/3 of that are tests. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 data/conf/tez/hive-site.xml 0b3877c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 1a84024 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 9807497 itests/src/test/resources/testconfiguration.properties 99049ca metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java f1697bb ql/src/java/org/apache/hadoop/hive/ql/Context.java 7fcbe3c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9953919 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 4246d68 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java f018ca0 ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java e3bc3b1 ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 7f1d71b ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java b1c4441 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 913d3ac ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 264052f ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 8354ad9 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 32d2f7a ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2b1a345 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 4acafba ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java 96a5d78 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 5c711cf ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 5195748 ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 911ac8a ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 97fa52c ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 026efe8 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java PRE-CREATION 
ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 2dbf1c8 ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6dce30c ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 5695f35 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 47fe508 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 789c780 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 63ecb8d ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java PRE-CREATION ql
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
On Sept. 9, 2014, 9:33 p.m., Eugene Koifman wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 6108 https://reviews.apache.org/r/25414/diff/1/?file=682029#file682029line6108 Would ROWID.getTypeInfo() work? Seems to. On Sept. 9, 2014, 9:33 p.m., Eugene Koifman wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 11872 https://reviews.apache.org/r/25414/diff/1/?file=682029#file682029line11872 does this work if of implements a subclass of AcidOutputFormat? Perhaps Class.isAssignableFrom() is a safer choice Changed. On Sept. 9, 2014, 9:33 p.m., Eugene Koifman wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java, line 305 https://reviews.apache.org/r/25414/diff/1/?file=682031#file682031line305 Would the outputs.size() check hold in the partitioned case, since dynamic partition insert is used? Could this test be more obvious, like checking getTyp()==TABLE or something like that? Good catch. The way I was doing this was definitely buggy. I'll re-implement it to check explicitly for partitions in the inputs. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/#review52655 --- On Sept. 8, 2014, 11:37 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 8, 2014, 11:37 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work. The patch is large, but about 2/3 of that are tests. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 data/conf/tez/hive-site.xml 0b3877c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 1a84024 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 9807497 itests/src/test/resources/testconfiguration.properties 99049ca metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java f1697bb ql/src/java/org/apache/hadoop/hive/ql/Context.java 7fcbe3c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9953919 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 4246d68 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java f018ca0 ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java e3bc3b1 ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 7f1d71b ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java b1c4441 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 913d3ac ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 264052f ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 8354ad9 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 32d2f7a ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2b1a345 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 4acafba ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java 96a5d78 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 5c711cf ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 5195748 ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 911ac8a ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 97fa52c ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 026efe8 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java PRE-CREATION 
ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 2dbf1c8 ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6dce30c ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 5695f35 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 47fe508 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 789c780 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 63ecb8d ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java PRE-CREATION ql/src/test/queries/clientnegative/acid_overwrite.q PRE-CREATION ql/src/test/queries/clientnegative/delete_not_acid.q PRE-CREATION ql/src/test/queries/clientnegative/update_not_acid.q PRE
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
-CREATION ql/src/test/results/clientpositive/update_where_partitioned.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25414/diff/ Testing --- Many tests included in the patch, including insert/values, update, and delete all tested against: non-partitioned tables, partitioned tables, and temp tables. Thanks, Alan Gates
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
On Sept. 9, 2014, 9:33 p.m., Eugene Koifman wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 11872 https://reviews.apache.org/r/25414/diff/1/?file=682029#file682029line11872 does this work if of implements a subclass of AcidOutputFormat? Perhaps Class.isAssignableFrom() is a safer choice Alan Gates wrote: Changed. Actually, I had to back this out. Making this change made it so that it said all output formats were acid. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/#review52655 --- On Sept. 11, 2014, 2:17 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25414/ --- (Updated Sept. 11, 2014, 2:17 p.m.) Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and Thejas Nair. Bugs: HIVE-7788 https://issues.apache.org/jira/browse/HIVE-7788 Repository: hive-git Description --- This patch adds plan generation as well as making modifications to some of the exec operators to make insert/value, update, and delete work. The patch is large, but about 2/3 of that are tests. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5d2e6b0 data/conf/tez/hive-site.xml 0b3877c itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 1a84024 itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 9807497 itests/src/test/resources/testconfiguration.properties 99049ca metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java f1697bb ql/src/java/org/apache/hadoop/hive/ql/Context.java 7fcbe3c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 9953919 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 4246d68 ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java f018ca0 ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java e3bc3b1 ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 7f1d71b ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java b1c4441 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 264052f ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 8354ad9 ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 32d2f7a ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 2b1a345 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 4acafba ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java 96a5d78 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 5c711cf ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 5195748 ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java 911ac8a ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 496f6a6 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 3e3926e ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java ad91b0f ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java PRE-CREATION 
ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 2dbf1c8 ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 6dce30c ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 5695f35 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 5164b16 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 789c780 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 63ecb8d ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java PRE-CREATION ql/src/test/queries/clientnegative/acid_overwrite.q PRE-CREATION ql/src/test/queries/clientnegative/delete_not_acid.q PRE-CREATION ql/src/test/queries/clientnegative/update_not_acid.q PRE-CREATION ql/src/test/queries/clientnegative/update_partition_col.q PRE-CREATION ql/src/test/queries/clientpositive/delete_all_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_all_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_orig_table.q PRE-CREATION ql/src/test/queries/clientpositive/delete_tmp_table.q PRE-CREATION ql/src/test/queries/clientpositive/delete_where_no_match.q PRE-CREATION ql/src/test/queries/clientpositive/delete_where_non_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive/delete_where_partitioned.q PRE-CREATION ql/src/test/queries/clientpositive
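The back-out described above ("it said all output formats were acid") is consistent with a common `isAssignableFrom` pitfall: the receiver and argument are easy to flip, and the flipped form answers a different question. The sketch below uses stand-in classes, not Hive's real OutputFormat hierarchy, and the actual cause in the patch may have been different; it only illustrates the direction hazard.

```java
public class AssignableDemo {
  static class OutputFormat {}
  static class AcidOutputFormat extends OutputFormat {}
  static class PlainOutputFormat extends OutputFormat {}

  // Intended question: is cls AcidOutputFormat or a subclass of it?
  static boolean isAcid(Class<?> cls) {
    return AcidOutputFormat.class.isAssignableFrom(cls);
  }

  // Flipped receiver and argument: now asks whether an AcidOutputFormat
  // instance could be assigned to a variable of type cls -- true for
  // AcidOutputFormat itself and for every superclass of it, so a generic
  // base format wrongly qualifies as acid.
  static boolean isAcidFlipped(Class<?> cls) {
    return cls.isAssignableFrom(AcidOutputFormat.class);
  }

  public static void main(String[] args) {
    System.out.println(isAcid(AcidOutputFormat.class));     // prints true
    System.out.println(isAcid(PlainOutputFormat.class));    // prints false
    System.out.println(isAcidFlipped(OutputFormat.class));  // prints true -- the misfire
  }
}
```

A strict class-equality check, by contrast, rejects legitimate subclasses, which was Eugene's original concern in the review.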
Re: Review Request 25414: HIVE-7788 Generate plans for insert, update, and delete
/results/clientpositive/update_where_partitioned.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25414/diff/ Testing --- Many tests included in the patch, including insert/values, update, and delete all tested against: non-partitioned tables, partitioned tables, and temp tables. Thanks, Alan Gates
Broken build
There are 7 tez qfile tests that are failing on every HiveQA run. They fail for me when I run them on trunk. I've filed: https://issues.apache.org/jira/browse/HIVE-8093 for them. I'm guessing this is related to the recent checkin for HIVE-7704, since many of those tests were added in that commit. Alan.
Review Request 25616: HIVE-7790 Update privileges to check for update and delete
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/ --- Review request for hive and Thejas Nair. Bugs: HIVE-7790 https://issues.apache.org/jira/browse/HIVE-7790 Repository: hive-git Description --- Adds update and delete as action and adds checks for authorization during update and delete. Also adds passing of updated columns in case authorizer wishes to check them. Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java 53d88b0 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 298f429 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b2f66e0 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 3aaa09c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 93df9f4 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 093b4fd ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 3236341 ql/src/test/queries/clientnegative/authorization_delete_nodeletepriv.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_update_noupdatepriv.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete_own_table.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update_own_table.q PRE-CREATION ql/src/test/results/clientnegative/authorization_delete_nodeletepriv.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_update_noupdatepriv.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete_own_table.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update.q.out PRE-CREATION 
ql/src/test/results/clientpositive/authorization_update_own_table.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25616/diff/ Testing --- Added tests, both positive and negative, for update and delete, including ability to update and delete tables created by user. Also added tests for passing correct update columns. Thanks, Alan Gates
Re: Review Request 25616: HIVE-7790 Update privileges to check for update and delete
On Sept. 14, 2014, 7:13 a.m., Thejas Nair wrote: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java, line 272 https://reviews.apache.org/r/25616/diff/1/?file=688987#file688987line272 It would be good to also verify the input columns being passed here. But I don't put the input columns in the list. You don't need read permissions to update, so I'm not adding these to a list to be checked. On Sept. 14, 2014, 7:13 a.m., Thejas Nair wrote: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java, line 273 https://reviews.apache.org/r/25616/diff/1/?file=688987#file688987line273 A similar test for delete would also be useful, specially for testing the input columns being passed. Same as above on update, I'm not checking read permissions, so there's no list of input columns. On Sept. 14, 2014, 7:13 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 506 https://reviews.apache.org/r/25616/diff/1/?file=688988#file688988line506 Can you also change the variable name of tab2cols to indicate that it is the table to input column mapping (since we have updateTab2Cols) ? maybe selectTab2Cols or inputTab2Cols Changed to selectTab2Cols - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/#review53277 --- On Sept. 14, 2014, 4:30 a.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/ --- (Updated Sept. 14, 2014, 4:30 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-7790 https://issues.apache.org/jira/browse/HIVE-7790 Repository: hive-git Description --- Adds update and delete as action and adds checks for authorization during update and delete. Also adds passing of updated columns in case authorizer wishes to check them. 
Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java 53d88b0 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 298f429 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b2f66e0 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 3aaa09c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 93df9f4 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 093b4fd ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 3236341 ql/src/test/queries/clientnegative/authorization_delete_nodeletepriv.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_update_noupdatepriv.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete_own_table.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update_own_table.q PRE-CREATION ql/src/test/results/clientnegative/authorization_delete_nodeletepriv.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_update_noupdatepriv.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete_own_table.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update_own_table.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25616/diff/ Testing --- Added tests, both positive and negative, for update and delete, including ability to update and delete tables created by user. Also added tests for passing correct update columns. Thanks, Alan Gates
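As a rough illustration of the privilege model discussed in this review — an UPDATE is authorized against the columns being written (the SET list), and no input-column list is collected because read permission is not required to update — here is a minimal sketch. The class and method names are hypothetical stand-ins, not the Hive authorizer API:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the column-level check described in the review:
// only the columns an UPDATE writes (the SET list) are presented to the
// authorizer; columns read in the WHERE clause are deliberately not
// collected, since reading them requires no update privilege.
public class UpdateAuthSketch {
    // table name -> columns the user may update (stand-in for real grants)
    private final Map<String, List<String>> updateGrants = new HashMap<>();

    public void grantUpdate(String table, String... cols) {
        updateGrants.put(table, Arrays.asList(cols));
    }

    /** Returns true if every updated column is covered by an update grant. */
    public boolean checkUpdate(String table, List<String> updatedCols) {
        List<String> granted = updateGrants.get(table);
        return granted != null && granted.containsAll(updatedCols);
    }

    public static void main(String[] args) {
        UpdateAuthSketch auth = new UpdateAuthSketch();
        auth.grantUpdate("acid_tbl", "value", "comment");
        // For "UPDATE acid_tbl SET value = ... WHERE key = ...", only the
        // SET column "value" is checked, never the WHERE column "key".
        System.out.println(auth.checkUpdate("acid_tbl", Arrays.asList("value")));  // true
        System.out.println(auth.checkUpdate("acid_tbl", Arrays.asList("key")));    // false
    }
}
```

This also mirrors why the test discussed above has no input-column list to verify: for update and delete, nothing but the written columns reaches the check.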
Review Request 19149: Stand alone metastore fails to start if new transaction values not defined in config
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19149/ --- Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6606 https://issues.apache.org/jira/browse/HIVE-6606 Repository: hive-git Description --- The metastore creates instances of TxnHandler. The constructor of this class will fail if the config value for the jdbc string it expects is not defined in the config file. Fixed this by changing transaction connection to use the same JDBC connection string as the rest of the metastore. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java edc3d38 metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bbb0d28 metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 4441c2f metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 560fd5a Diff: https://reviews.apache.org/r/19149/diff/ Testing --- Ran unit tests plus ran on cluster to assure issue not seen when transaction handling turned off. Thanks, Alan Gates
Review Request 19161: Heartbeats are not being sent when DbLockMgr is used and an operation holds locks
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19161/ --- Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6635 https://issues.apache.org/jira/browse/HIVE-6635 Repository: hive-git Description --- Added a thread to Driver to send heartbeats. This thread only runs during the main loop in Driver.execute. I added this in a separate thread because otherwise I would have needed to add threads in every task to see if heartbeats needed to be sent. This would be very invasive, and also it's not clear it would be possible to cover all cases, as there are actions that may simply take a long time (like certain metastore operations). The downside is that a query will keep running even after it has found out its locks were aborted, and will only be terminated at the end. Diffs - metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 4441c2f ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbLockManager.java 535912f ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 7773f66 Diff: https://reviews.apache.org/r/19161/diff/ Testing --- Ran unit tests specific to transaction operations, as well as manual system testing. Thanks, Alan Gates
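A minimal sketch of the heartbeat design described above: one background thread beats on behalf of the whole Driver.execute loop, so individual tasks need no heartbeat logic of their own. The class name and the callback here are illustrative stand-ins, not the actual DbLockManager/DbTxnManager code:

```java
// Sketch (assumed names, not Hive code): a single daemon thread that sends
// lock heartbeats at a fixed interval until the query's main loop stops it.
public class HeartbeaterSketch implements Runnable {
    private final Runnable sendHeartbeat;   // stand-in for the lock-manager heartbeat call
    private final long intervalMs;
    private volatile boolean stopped = false;

    public HeartbeaterSketch(Runnable sendHeartbeat, long intervalMs) {
        this.sendHeartbeat = sendHeartbeat;
        this.intervalMs = intervalMs;
    }

    @Override
    public void run() {
        while (!stopped) {
            sendHeartbeat.run();            // keep the locks alive
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public void stop() { stopped = true; }

    public static void main(String[] args) throws InterruptedException {
        final java.util.concurrent.atomic.AtomicInteger beats =
            new java.util.concurrent.atomic.AtomicInteger();
        HeartbeaterSketch hb = new HeartbeaterSketch(beats::incrementAndGet, 10);
        Thread t = new Thread(hb);
        t.setDaemon(true);   // don't keep the JVM alive past the query
        t.start();
        Thread.sleep(100);   // the "query" runs here
        hb.stop();
        t.join();
        System.out.println("heartbeats sent: " + (beats.get() > 0));
    }
}
```

The trade-off noted in the description holds in the sketch too: the loop keeps beating until stop() is called when execution ends, even if the locks were already aborted mid-query.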
Re: Review Request 19149: Stand alone metastore fails to start if new transaction values not defined in config
On March 12, 2014, 8:21 p.m., Ashutosh Chauhan wrote: metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java, line 205 https://reviews.apache.org/r/19149/diff/1/?file=517656#file517656line205 Do we need to synchronize this method? This is really intended for use only in testing. It's only in the src area rather than test so that it can be picked up cross package for things like streaming and hive client tests. So I'm not too worried about synchronization or performance (for the next comment). I can add comments on the methods to make this clear so no one uses it when they shouldn't. On March 12, 2014, 8:21 p.m., Ashutosh Chauhan wrote: metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java, line 215 https://reviews.apache.org/r/19149/diff/1/?file=517656#file517656line215 You created prop object but didn't make use of it. Don't you want to use that prop here, instead of new Properties? Oops. Will fix. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19149/#review36974 --- On March 12, 2014, 7:20 p.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19149/ --- (Updated March 12, 2014, 7:20 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6606 https://issues.apache.org/jira/browse/HIVE-6606 Repository: hive-git Description --- The metastore creates instances of TxnHandler. The constructor of this class will fail if the config value for the jdbc string it expects is not defined in the config file. Fixed this by changing transaction connection to use the same JDBC connection string as the rest of the metastore. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java edc3d38 metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bbb0d28 metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 4441c2f metastore/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 560fd5a Diff: https://reviews.apache.org/r/19149/diff/ Testing --- Ran unit tests plus ran on cluster to assure issue not seen when transaction handling turned off. Thanks, Alan Gates
Re: [VOTE] Apache Hive 0.13.0 Release Candidate 2
+1 (non-binding) Alan. On Apr 17, 2014, at 7:15 PM, Thejas Nair the...@hortonworks.com wrote: +1 - Verified the md5 checksums and gpg keys - Checked LICENSE, README.txt, NOTICE, RELEASE_NOTES.txt files - Build src tar.gz - Ran local mode queries with new build. I had run unit test suite with rc1 and they looked good. On Tue, Apr 15, 2014 at 2:06 PM, Harish Butani rhbut...@apache.org wrote: Apache Hive 0.13.0 Release Candidate 2 is available here: http://people.apache.org/~rhbutani/hive-0.13.0-candidate-2 Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1011 Source tag for RCN is at: https://svn.apache.org/repos/asf/hive/tags/release-0.13.0-rc2/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks. -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Modify/add component in Hive JIRA
For the new transactions work that went in as part of HIVE-5317 I’ve been using “Locking” as the component for JIRAs. This work is related to locking, but that label doesn’t really cover all the transaction management being done. Could we add a label for “Transaction Management” or modify the “Locking” label to be “Locking/Transaction Management”? Alan.
Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws
One other benefit in rotating chairs is that it exposes more of Hive’s PMC members to the board and other Apache old timers. This is helpful in getting better integrated into Apache and becoming a candidate for Apache membership. It is also an excellent education in the Apache Way for those who serve. Alan. On Dec 31, 2013, at 3:30 PM, Lefty Leverenz leftylever...@gmail.com wrote: Okay, I'm convinced that one-year terms for the chair are reasonable. Thanks for the reassurance, Edward and Thejas. Is 24h rule is needed at all? In other projects, I've seen patches simply reverted by author (or someone else). It's a rare occurrence, and it should be possible to revert a patch if someone -1s it after commit, esp. within the same 24 hours when not many other changes are in. Sergey makes a good point, but the 24h rule seems helpful in prioritizing tasks. We're all deadline-driven, right? I'm the chief culprit of seeing patch available and ignoring it until it has been committed. Then if I find some minor typo or doc issue, I'm embarrassed at posting a comment after the commit because nobody wants to revert a patch just for documentation. -- Lefty On Sun, Dec 29, 2013 at 12:06 PM, Thejas Nair the...@hortonworks.comwrote: On Sun, Dec 29, 2013 at 12:06 AM, Lefty Leverenz leftylever...@gmail.com wrote: Let's discuss annual rotation of the PMC chair a bit more. Although I agree with the points made in favor, I wonder about frequent loss of expertise and needing to establish new relationships. What's the ramp-up time? The ramp up time is not significant, as you can see from the list of responsibilities mentioned here - http://www.apache.org/dev/pmc.html#chair . We have enough people in PMC who have been involved with Apache project for long time and are familiar with apache bylaws and way of doing things. Also, the former PMC chairs are likely to be around to help as needed. Could a current chair be chosen for another consecutive term? 
Could two chairs alternate years indefinitely? I would take the meaning of rotation to mean that we have a new chair for the next term. I think it should be OK to have the same chair in alternate years. 2 years is a long time and it sounds reasonable given the size of the community ! :) Do many other projects have annual rotations? Yes, at least the hadoop and pig projects have that. I could not find by-laws pages easily for other projects. Would it be inconvenient to change chairs in the middle of a release? No. The PMC Chair position does not have any special role in a release. And now to trivialize my comments: while making other changes, let's fix this typo: Membership of the PMC can be revoked by an unanimous vote ... *(should be a unanimous ... just like a university because the rule is based on sound, not spelling)*. I think you should feel free to fix such typos in this wiki without a vote on it ! :)
Re: How do you run single query test(s) after mavenization?
The rest of the ant instances are okay because the MVN section afterwards gives the alternative, but should we keep ant or make the replacements? - 9. Now you can run the ant 'thriftif' target ... - 11. ant thriftif -Dthrift.home=... - 15. ant thriftif - 18. ant clean package - The maven equivalent of ant thriftif is: mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local I have not generated the thrift stuff recently. It would be great if Alan or someone else who has would update this section. I can take a look at this. It works with pretty minimal changes. Alan.
Re: How do you run single query test(s) after mavenization?
Ok, I’ve updated it to just have the maven instructions, since I’m assuming no one cares about the ant ones anymore. Alan. On Jan 3, 2014, at 3:46 PM, Alan Gates ga...@hortonworks.com wrote: The rest of the ant instances are okay because the MVN section afterwards gives the alternative, but should we keep ant or make the replacements? - 9. Now you can run the ant 'thriftif' target ... - 11. ant thriftif -Dthrift.home=... - 15. ant thriftif - 18. ant clean package - The maven equivalent of ant thriftif is: mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local I have not generated the thrift stuff recently. It would be great if Alan or someone else who has would update this section. I can take a look at this. It works with pretty minimal changes. Alan.
Re: Parquet support (HIVE-5783)
Gunther, is it the case that there is anything extra that needs to be done to ship Parquet code with Hive right now? If I read the patch correctly the Parquet jars were added to the pom and thus will be shipped as part of Hive. As long as it works out of the box when a user says “create table … stored as parquet” why do we care whether the parquet jar is owned by Hive or another project? The concern about feature mismatch in Parquet versus Hive is valid, but I’m not sure what to do about it other than assure that there are good error messages. Users will often want to use non-Hive based storage formats (Parquet, Avro, etc.). This means we need a good way to detect at SQL compile time that the underlying storage doesn’t support the indicated data type and throw a good error. Also, it’s important to be clear going forward about what Hive as a project is signing up for. If tomorrow someone decides to add a new datatype or feature we need to be clear that we expect the contributor to make this work for Hive owned formats (text, RC, sequence, ORC) but not necessarily for external formats (Parquet, Avro). Alan. On Feb 17, 2014, at 7:03 PM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: Brock, I'm not trying to pick winners, I'm merely trying to say that the documentation/code should match what's actually there, so folks can make informed decisions. The issue I have with the word native is that people have expectations when they hear it and I think these are not met. I've had folks ask me why we're switching the default of hive to Parquet. This isn't the case obviously, but native to most people means just that: Hive's primary format. That's why I was asking for a title of Add Parquet SerDe for the jira. That's the exact same thing that was done for Avro under the exact same circumstances: https://issues.apache.org/jira/browse/HIVE-895. Native also has other associations a) it supports the full data model/feature set and b) it's part of hive. 
Neither is the case and I don't think that's just a superficial difference. Support and usability will be different. That's why I think the documentation should delineate between RC/ORC/etc on one side and Parquet/Avro/etc on the other. As mentioned in the jira STORED AS was reserved for what's actually part of hive (or hadoop core in the case of sequence file as you point out). I think there are reasons for that: a) being part of the grammar implies native as above b) you need to ship the code bundled in hive-exec for this to work (which is *broken* right now) and c) like you said we shouldn't pick winners by letting some of them become a keyword and others not. For these reasons I think Parquet should use the old syntax at this point. If you have a pluggable/configurable way great, but right now we don't have that. Finally, yes, I am late to this party and I apologize for that. I'm happy to make the suggested changes myself, if that's the concern. Thanks, Gunther. On Sun, Feb 16, 2014 at 7:40 PM, Brock Noland br...@cloudera.com wrote: Hi Gunther, Please find my response inline. On Sat, Feb 15, 2014 at 5:52 PM, Gunther Hagleitner gunt...@apache.org wrote: I read through the ticket, patch and documentation Thank you very much for reading through these items! and would like to suggest some changes. There was ample time to suggest these changes prior to commit. The JIRA was created three months ago, and the title you object to and the patch was up there over two months ago. As far as I can tell this basically adds parquet SerDes to hive, but the file format remains external to hive. There is no way for hive devs to makes changes, fix bugs add, change datatypes, add features to parquet itself. As stated in many locations including the JIRA discussed here, we shouldn't be picking winner/loser file formats. We use many external libraries, none of which, all Hive developers have the ability to modify. 
For example most Hive developers do not have the ability to modify Sequence File. Tez is also an external library which few Hive developers can change. So: - I suggest we document it as one of the built-in SerDes and not as a native format like here: https://cwiki.apache.org/confluence/display/Hive/Parquet (and here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual) - I vote for the jira to say Add parquet SerDes to Hive and not Native support The change provides the ability to create a parquet table with Hive, natively. Therefore I don't see the issue you have with the word native. - I think we should revert the change to the grammar to allow STORED AS PARQUET until we have a mechanism to do that for all SerDes, i.e.: someone picks up: HIVE-5976. (I also don't think this actually works properly unless we bundle parquet in hive-exec, which I don't think we want.) Again, you could have provided this feedback many moons ago.
Re: Timeline for the Hive 0.13 release?
Sure. I’d really like to get the work related to HIVE-5317 in 0.13. HIVE-5843 is patch available and hopefully can be checked in today. There are several more that depend on that one and can’t be made patch available until then (HIVE-6060, HIVE-6319, HIVE-6460, and HIVE-5687). I don’t want to hold up the branching, but are you ok with those going in after the branch? Alan. On Mar 3, 2014, at 7:53 PM, Harish Butani hbut...@hortonworks.com wrote: I plan to create the branch 5pm PST tomorrow. Ok with everybody? regards, Harish. On Feb 21, 2014, at 5:44 PM, Lefty Leverenz leftylever...@gmail.com wrote: That's appropriate -- let the Hive release march forth on March 4th. -- Lefty On Fri, Feb 21, 2014 at 4:04 PM, Harish Butani hbut...@hortonworks.comwrote: Ok,let’s set it for March 4th . regards, Harish. On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote: Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. 
Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For 0.12 release I created a branch and then accepted patches that people asked to be included for sometime, before moving a phase of accepting only critical bug fixes. This turned out to be laborious. 
I think we should instead give everyone a few weeks to get any patches they are working on to be ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting to release in a week or two after that. Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive Contrib meeting we agreed that Hive should follow a time-based release model with new releases every four months. If we follow that schedule we're due for the next release in mid-February. Thoughts? Thanks. Carl
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
So this isn’t a technical issue, just concern about the delays in the mailing list? Why not just extend the voting period then, until say Monday? Alan. On May 15, 2014, at 3:17 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi Folks, I'm canceling this vote and withdrawing the RC1 candidate for the following reasons: a) I've talked to a couple of other people who haven't seen my mail updates to this thread, and saw my initial vote mail a bit late too. b) There's at least one other person that has attempted to reply to this thread, and I don't see the replies yet. Thus, when the mailing list channel isn't reliably working, the ability for people to +1 or -1 is taken away, and this does not work. (We don't want a situation where 3 people go ahead and +1, and that arrives before today evening, thus making the release releasable, while someone else discovers a breaking issue that should stop it, but is not able to have their objection or -1 appear in time.) I'm open to suggestions on how to proceed with the voting process. We could wait out this week and hope the ASF mailing list issues are resolved, but if it takes too much longer than that, we also have the issue of delaying an important bugfix release. Thoughts? -Sushanth (3:15PM PDT, May 15 2014) On Thu, May 15, 2014 at 11:46 AM, Sushanth Sowmyan khorg...@gmail.com wrote: The apache dev list seems to still be a little wonky, Prasanth mailed me saying he'd replied to this thread with the following content, that I don't see in this thread: Hi Sushanth https://issues.apache.org/jira/browse/HIVE-7067 This bug is critical as it returns wrong results for min(), max(), and join queries that use date/timestamp columns from an ORC table. The reason for this issue is that for these datatypes ORC returns java objects, whereas for all other types ORC returns writables. When get() is performed on their corresponding object inspectors, writables return a new object, whereas java objects return a reference.
This will cause issues when any operator performs comparisons on date/timestamp values (references will be overwritten with the next values). More information is provided in the description of the jira. I think the severity of this bug is critical and it should be included as part of 0.13.1. Can you please include this patch in RC2?” I think this meets the bar for criticality (actual bug in core feature, no workaround) and severity (incorrect results, effectively data corruption when used as source for other data), and I'm willing to spin an RC2 for this, but I would still like to follow the process I set up for jira inclusion though, to make sure I'm not being biased about this, so I would request two other +1s to champion this bug's inclusion into the release. Also, another thought here is whether it makes sense for us to try to have a VOTE with a 72 hour deadline when the mailing list still seems iffy and is delaying mails by multiple hours. Any thoughts on how we should proceed? (In case this mail goes out much later than I send it out, I'm sending it out at 11:45AM PDT, Thu May 15 2014) On Thu, May 15, 2014 at 10:06 AM, Sushanth Sowmyan khorg...@gmail.com wrote: Eugene, do you know if these two failures happen on 0.13.0 as well? I would assume that TestHive_7 is an issue on 0.13.0 as well, given that the fix for it went into trunk. What is your sense for how important it is that we fix this? i.e., per my understanding, (a) It does not cause a crash or adversely affect the ability for webhcat to continue operating, and (b) It means that the feature does not work (at all, but in isolation), and that there is no workaround for it. This means I treat it as critical (valid bug without workaround) but not severe (breaks product, affects other features from being used). Thus, I'm willing to include HIVE-6521 in an RC2 if we have 2 more committers +1 an inclusion request for this. As for TestHeartbeat_1, that's an interesting failure.
Do you have logs on what command-line options org.apache.hive.hcatalog.templeton.LauncherDelegator sent along that caused it to break? Would that affect other job launches? On Tue, May 13, 2014 at 8:14 PM, Eugene Koifman ekoif...@hortonworks.com wrote: TestHive_7 is explained by https://issues.apache.org/jira/browse/HIVE-6521, which is in trunk but not 0.13.1. On Tue, May 13, 2014 at 6:50 PM, Eugene Koifman ekoif...@hortonworks.com wrote: I downloaded the src tar, built it, and ran the WebHCat e2e tests. I see 2 failures (which I don't see on trunk): TestHive_7 fails with got percentComplete map 100% reduce 0%, expected map 100% reduce 100%. TestHeartbeat_1 fails to even launch the job. This looks like the root cause: ERROR | 13 May 2014 18:24:00,394 | org.apache.hive.hcatalog.templeton.CatchallExceptionMapper | java.lang.NullPointerException at
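The failure mode Prasanth describes above — a reused mutable object returned by reference gets overwritten before an operator finishes comparing it against later rows, while a writable's get() hands back a fresh copy — can be illustrated with a minimal, Hive-free sketch. The class and method names below are invented for illustration; this is not ORC code:

```java
// Minimal sketch of the reference-reuse bug pattern (hypothetical names).
public class ReferenceReuseDemo {

    // Simulates a reader that reuses one mutable date holder for every row.
    static final class MutableDate {
        long days;
        MutableDate set(long d) { this.days = d; return this; }
        long get() { return days; }
    }

    // "Reference" style: get() hands back the shared live object, as a plain
    // Java object would. Retaining it across rows aliases the reader's buffer.
    static long minByReference(long[] rows) {
        MutableDate shared = new MutableDate();
        MutableDate best = null;
        for (long r : rows) {
            MutableDate cur = shared.set(r);   // same object every row
            if (best == null || cur.get() < best.get()) {
                best = cur;                    // best now aliases `shared`
            }
        }
        return best.get();                     // reflects the LAST row written
    }

    // "Copy" style: the value is snapshotted out, as get() on a writable does.
    static long minByCopy(long[] rows) {
        MutableDate shared = new MutableDate();
        Long best = null;
        for (long r : rows) {
            long snapshot = shared.set(r).get();  // value copied out per row
            if (best == null || snapshot < best) best = snapshot;
        }
        return best;
    }

    public static void main(String[] args) {
        long[] rows = {5, 2, 9};
        // The aliased version silently returns the last value, not the min.
        System.out.println("by reference: " + minByReference(rows)); // 9 (wrong)
        System.out.println("by copy:      " + minByCopy(rows));      // 2 (right)
    }
}
```

This is why the bug surfaces as wrong min()/max()/join results only for date/timestamp columns: those are the types for which a live reference, rather than a copy, escapes into the operator.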
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
On May 16, 2014, at 10:51 PM, Lefty Leverenz leftylever...@gmail.com wrote: Any thoughts on how we should proceed? 1. Is the mail archive accurate now? Perhaps it could be used for vote verification. 2. What if we voted in comments on a JIRA ticket? (Lately I'm checking comment order on JIRAs because my inbox receives messages out of order.) No, it has to use mail as the primary medium, I think. But the archives are accurate. Alan. The JIRA is connected to the mailing list, so it might comply with the vote-by-email rule. -- Lefty On Fri, May 16, 2014 at 2:53 PM, Alan Gates ga...@hortonworks.com wrote: So this isn’t a technical issue, just concern about the delays in the mailing list? Why not just extend the voting period then, until say Monday? Alan. 
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
The vote-by-mail requirement is an Apache one, which trumps any Hive bylaws. I really think Apache is going to frown on voting via JIRA. Alan. On May 17, 2014, at 9:15 PM, Lefty Leverenz leftylever...@gmail.com wrote: The Hive bylaws (https://cwiki.apache.org/confluence/display/Hive/Bylaws#Bylaws-Voting) say the mailing list is used for voting, but as I recall bylaws have some wiggle room. Decisions regarding the project are made by votes on the primary project development mailing list (u...@hive.apache.org u...@pig.apache.org). Where necessary, PMC voting may take place on the private Hive PMC mailing list. Votes are clearly indicated by a subject line starting with [VOTE]. Votes may contain multiple items for approval and these should be clearly separated. Voting is carried out by replying to the vote mail. (Hm, the text says primary project development mailing list but then user@hive is shown in parentheses -- is that a typo in the bylaws?) Would people be willing to vote simultaneously by mail and on a JIRA? It's inconvenient but shouldn't be necessary after this release. -- Lefty On Sat, May 17, 2014 at 7:30 PM, Sushanth Sowmyan khorg...@gmail.com wrote: There is a technical issue as well now, as raised by Prashant. But there is also the issue that people aren't reliably able to respond/object/approve, and don't know if/when it'll go through. I think I like Lefty's JIRA proposal - we could open a JIRA for it and address votes there; I think I'll do that for RC2. On Fri, May 16, 2014 at 2:53 PM, Alan Gates ga...@hortonworks.com wrote: So this isn’t a technical issue, just concern about the delays in the mailing list? Why not just extend the voting period then, until say Monday? Alan. 
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
Per the Apache infra tweet stream, mail delivery times should be back to normal as of today. I believe Sushanth decided to roll a new RC anyway. Once that is done we should be able to vote in the normal manner. Alan. Sent from my iPhone On May 19, 2014, at 17:59, Lefty Leverenz leftylever...@gmail.com wrote: Gotta side with Alan about voting by JIRA: although it's convenient for the moment, the indirect mail records wouldn't be labeled [VOTE]. Some future Hive historian could come to grief (and have to drop out of grad school). The community needs to see the votes as they come in, and shouldn't have to look in the archives or JIRA comments. What to do? When can we expect the mailing list to get back to normal? -- Lefty On Mon, May 19, 2014 at 8:00 AM, Alan Gates ga...@hortonworks.com wrote: The vote-by-mail requirement is an Apache one, which trumps any Hive bylaws. I really think Apache is going to frown on voting via JIRA. Alan. 
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 2
+1 (non-binding) - Built it, checked the signature and md5, and ran some basic tests. Alan. On May 23, 2014, at 1:45 AM, Sushanth Sowmyan khorg...@apache.org wrote: Apache Hive 0.13.1 Release Candidate 2 is available here: http://people.apache.org/~khorgath/releases/0.13.1_RC2/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1014 The source tag for RC2 is at: https://svn.apache.org/viewvc/hive/tags/release-0.13.1-rc2/ Hive PMC Members: Please test and vote. Thanks, -Sushanth
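The verification Alan mentions (check the signature and md5, then build) looks roughly like the sketch below. The artifact names and the `verify_md5` helper are hypothetical; substitute the actual RC files, and note that published `.md5` files vary in format, so the helper compares bare digests rather than relying on `md5sum -c`:

```shell
# Sketch of RC verification steps (hypothetical file names). Assumes
# gnupg and coreutils are installed.

verify_md5() {
  # Compare the digest published in "$1.md5" against a locally computed one.
  local file="$1"
  local published computed
  published=$(awk '{print $1}' "${file}.md5")
  computed=$(md5sum "${file}" | awk '{print $1}')
  [ "$published" = "$computed" ]
}

# Against a real RC one would then run (not executed here):
#   gpg --import KEYS
#   gpg --verify apache-hive-0.13.1-src.tar.gz.asc apache-hive-0.13.1-src.tar.gz
#   verify_md5 apache-hive-0.13.1-src.tar.gz
#   tar xzf apache-hive-0.13.1-src.tar.gz && cd apache-hive-0.13.1-src && build

# Demonstrate the digest check on a locally generated dummy artifact:
printf 'example payload' > demo-artifact.tar.gz
md5sum demo-artifact.tar.gz | awk '{print $1}' > demo-artifact.tar.gz.md5
if verify_md5 demo-artifact.tar.gz; then echo "md5 OK"; else echo "md5 MISMATCH"; fi
```

Comparing bare digests sidesteps the historical variation in Apache checksum file layouts (`hash`, `hash  file`, `hash *file`), any of which would break a naive `md5sum -c`.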
Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/#review47005 --- What will happen if a user tries to create a view over a temp table? I'm not sure whether the creation will fail (since there's no table in the database) or succeed but fail later when the user tries to use the view. Ideally it would give a nice error message, e.g. "views not supported over temp tables". ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java https://reviews.apache.org/r/22996/#comment82599 AFAICT there are no security checks here on who can create tables in which database. Not sure how we should handle this, as you'd like users to be able to create temp tables even when they don't own a database in which they could create them. But explicitly creating them in databases they don't have permission on is going to look like a security breach. ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java https://reviews.apache.org/r/22996/#comment82600 Same as the comment above on being able to create a temp table in any db: this allows moving a temp table into any db. - Alan Gates On June 28, 2014, 12:35 a.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/ --- (Updated June 28, 2014, 12:35 a.m.) Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani. Bugs: HIVE-7090 https://issues.apache.org/jira/browse/HIVE-7090 Repository: hive-git Description --- Temp tables are managed in memory by SessionState. SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore to access the temp tables saved in the SessionState when appropriate. 
Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 9fb7550 itests/qtest/testconfiguration.properties 1462ecd metastore/if/hive_metastore.thrift cc802c6 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 1270520 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 71471f4 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 91de8da ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java 20d08b3 ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_index.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_rename.q PRE-CREATION ql/src/test/queries/clientpositive/show_create_table_temp_table.q PRE-CREATION 
ql/src/test/queries/clientpositive/stats19.q 51514bd ql/src/test/queries/clientpositive/temp_table.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_external.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_gb1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_join1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_names.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_options1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_precedence.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_subquery1.q PRE-CREATION ql/src/test/queries/clientpositive/temp_table_windowing_expressions.q PRE-CREATION ql/src/test/results/clientnegative/temp_table_authorize_create_tbl.q.out PRE-CREATION ql/src/test/results
Version specific Hive docs
Recently a JIRA was opened in Hive to move some of the Hive documents from the wiki to version control ( https://issues.apache.org/jira/browse/HIVE-3039 ). Edward commented on the JIRA: This issue was opened and died a long time ago. After I moved the documentation from the wiki to the source everyone refused to update it and it was a dead issue. I would advise not going down this very frustrating path again. ... I marked pages on the wiki that I moved as 'DO NOT EDIT THIS PAGE THIS IS NOW IN XDOCS'. People ran into this and complained they explicitly did not want to use in-line documentation. You can find the history on the mailing list. The value of version-specific documentation is very clear to me (and apparently to others, since I cannot think of a single large project that does not do it), so I am trying to figure out whether Hive is really opposed to it or just wants to keep the wikis open. I've been going over the old JIRAs and mailing list archives as Edward suggested to understand what the decision at the time was. Here's what I have found. The initial approach was covered in the notes from the July 2010 Hive contributor meetup ( https://cwiki.apache.org/confluence/display/Hive/Development+ContributorsMeetings+HiveContributorsMinutes100706 ): • There was a discussion about the plan to move the documentation off of the wiki and into version control. • Several people voiced concerns that developers/users are less likely to update the documentation if doing so requires them to submit a patch. • The new proposal for documentation reached at the meeting is as follows: • The trunk version of the documentation will be maintained on the wiki. • As part of the release process the documentation will be copied off of the wiki and converted to xdoc, and then checked into svn. • HTML documentation generated from the xdoc will be posted to the Hive webpage when the new release is posted. 
• Carl is going to investigate the feasibility of writing a tool that converts documentation directly from !MoinMoin wiki markup to xdoc. There was some email discussion generated by these notes which did not change the general view: http://mail-archives.apache.org/mod_mbox/hive-dev/201007.mbox/%3CAANLkTin3EWUKWj65ZzDApGlrEEWYyg9sVrgGzDaFWO7T%40mail.gmail.com%3E In a later email thread Joydeep rather forcefully argued that the wiki pages should be left open even though there are xdocs: http://mail-archives.apache.org/mod_mbox/hive-dev/201008.mbox/ajax/%3CB4F4475C5A97594A87B283C91F9E873A017FB35F%40sc-mbx05.TheFacebook.com%3E Then in August it appears it was discussed again, with the outcome that docs should still be kept in source control ( https://cwiki.apache.org/confluence/display/Hive/Development+ContributorsMeetings+HiveContributorsMinutes100808 ): Discussed moving the documentation from the wiki to version control. • Probably not practical to maintain the trunk version of the docs on the wiki and roll over to version control at release time, so the trunk version of the docs will be maintained in vcs. • It was agreed that feature patches should include updates to the docs, but it is also acceptable to file a doc ticket if there is time pressure to commit. • Will maintain an errata page on the wiki for collecting updates/corrections from users. These notes will be rolled into the documentation in vcs on a monthly basis. Also relevant are the two older JIRA issues covering the abortive move: https://issues.apache.org/jira/browse/HIVE-1135 https://issues.apache.org/jira/browse/HIVE-1446 It appears to me that there was a lack of clarity about how to move information from the wiki into the version-controlled docs. There was also opposition expressed to locking off the wiki. As far as I can tell, no one was opposed to version control of the docs per se. So, I propose we let this, and similar patches that propose version-specific docs in version control, go forward. 
There's no need to close off the wiki. It will impose a tax on feature developers to add docs if they want users to know about their features, but that seems to me a good thing rather than a bad one. Thoughts? Alan.
[DISCUSS] HCatalog becoming a subproject of Hive
Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator. Given the heavy dependence of HCatalog on Hive the HCatalog community agreed it made sense to explore graduating from the Incubator to become a subproject of Hive (see http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201209.mbox/%3C08C40723-8D4D-48EB-942B-8EE4327DD84A%40hortonworks.com%3E and http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201210.mbox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gmail.com%3E ). To help both communities understand what HCatalog is and hopes to become we also developed a roadmap that summarizes HCatalog's current features, planned features, and other possible features under discussion: https://cwiki.apache.org/confluence/display/HCATALOG/HCatalog+Roadmap So we are now approaching you to see if there is agreement in the Hive community that HCatalog graduating into Hive would make sense. Alan.
Re: [DISCUSS] HCatalog becoming a subproject of Hive
On Nov 4, 2012, at 8:35 PM, Namit Jain wrote: I like the idea of HCatalog becoming a Hive sub-project. The enhancements/bug fixes in the serde/metastore areas can indirectly benefit the Hive community, and it will be easier for the fixes to be in one place. Having said that, I don't see serde/metastore moving out of Hive into a separate component. Things are tied too closely together. I am assuming that no new committers would be automatically added to Hive as part of this, and that both Hive and HCatalog will continue to have their own committers. One thing we'd like to discuss is the HCatalog committers having commit access to the metastore sections of the Hive code. That doesn't mean it has to move into HCatalog's code base. But more and more, the fixes and changes we're doing in HCatalog are really in Hive's metastore. So we believe it would make sense to give HCat committers access to that component as well as HCat. Alan. Thanks, -namit On 11/3/12 2:22 AM, Alan Gates ga...@hortonworks.com wrote: Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator.
Re: [DISCUSS] HCatalog becoming a subproject of Hive
I would suggest looking over the patch history of the HCat committers. I think most of them have already contributed a number of patches to the metastore. All are certainly aware of how to run Hive unit tests and have an understanding of how Hive works. So I don't think it's fair to say they would be unsafe with access to the metastore. And the Hive PMC is there to ensure problems do not arise; if there are issues, I am sure it can deal with them. Alan. On Nov 6, 2012, at 8:06 PM, Namit Jain wrote: Alan, that would not be a good idea. Metastore code is part of the Hive code, and it would be safer if only Hive committers had commit access to it. On 11/6/12 11:25 PM, Alan Gates ga...@hortonworks.com wrote: On Nov 4, 2012, at 8:35 PM, Namit Jain wrote: I like the idea of HCatalog becoming a Hive sub-project. The enhancements/bug fixes in the serde/metastore areas can indirectly benefit the Hive community, and it will be easier for the fixes to be in one place. Having said that, I don't see serde/metastore moving out of Hive into a separate component. Things are tied too closely together. I am assuming that no new committers would be automatically added to Hive as part of this, and that both Hive and HCatalog will continue to have their own committers. One thing we'd like to discuss is the HCatalog committers having commit access to the metastore sections of the Hive code. That doesn't mean it has to move into HCatalog's code base. But more and more, the fixes and changes we're doing in HCatalog are really in Hive's metastore. So we believe it would make sense to give HCat committers access to that component as well as HCat. Alan. Thanks, -namit On 11/3/12 2:22 AM, Alan Gates ga...@hortonworks.com wrote: Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator. 
Re: [DISCUSS] HCatalog becoming a subproject of Hive
I am not sure where we are on this discussion. So far those who have chimed in seem generally positive (Namit, Edward, Clark, Alexander). Namit and I have different visions for what the committership might look like, so I'd like to hear from other Hive PMC members what their view is on this. I have to say that from an HCatalog perspective the proposition is much less attractive without some commit rights. On a related note, people should be aware of these threads on the Incubator list: http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w%40mail.gmail.com%3E http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ%40mail.gmail.com%3E For those not inclined to read all the mails in the threads, I will summarize (though I urge all PMC members of Hive and PPMC members of HCat to read both mail threads, because this is highly relevant to what we are discussing). There are two salient points in these threads: 1) It is not wise to build a subproject that is distinct from the main project in the sense that it has separate community members interested in it. Bertrand, Arun, Chris Mattmann, and Greg Stein all spoke against this, and all are long-time Apache contributors with a lot of experience. They were all of the opinion that it was reasonable for one project to release separate products. 2) It is not wise to have committers who have access to some parts of a project but not others. Greg and Bertrand argued (and Arun seemed to imply) that splitting up committer lists by sections of the code did not work out well. These insights cause me to question what we mean by subproject. I had originally envisioned something that looked like Pig and Hive did when they were subprojects of Hadoop. But this violates both 1 and 2 above. 
Given this input from many of the wise old timers of Apache I think we should consider what we mean when we say subproject and how tightly we are willing to integrate these projects. Personally I think it makes sense to continue to pursue integration, as I think HCat is really a set of interfaces on top of Hive and it makes sense to coalesce those into one project. I guess this would mean HCat becomes just another set of jars that Hive releases when it releases, rather than a stand alone entity. But I'm curious to hear what others think. Alan. On Nov 14, 2012, at 10:22 PM, Namit Jain wrote: The same criteria should be applied to all Hive committers. Only a committer should be able to commit code. I don't think we should bend this rule. Metastore is not a separate project, but an integral part of hive. -namit On 11/12/12 10:32 PM, Alan Gates ga...@hortonworks.com wrote: I would suggest looking over the patch history of HCat committers. I think most of them have already contributed a number of patches to the metastore. All are certainly aware of how to run Hive unit tests and have an understanding of how Hive works. So I don't think it's fair to say they would be unsafe with access to the metastore. And the Hive PMC is there to assure this does not happen. If there are issues I am sure they can deal with them. Alan. On Nov 6, 2012, at 8:06 PM, Namit Jain wrote: Alan, that would not be a good idea. Metastore code is part of hive code, and it would be safer if only Hive committers had commit access to that. On 11/6/12 11:25 PM, Alan Gates ga...@hortonworks.com wrote: On Nov 4, 2012, at 8:35 PM, Namit Jain wrote: I like the idea of Hcatalog becoming a Hive sub-project. The enhancements/bugs in the serde/metastore areas can indirectly benefit the hive community, and it will be easier for the fix to be in one place. Having said that, I don't see serde/metastore moving out of hive into a separate component. Things are tied too closely together. 
I am assuming that no new committers would be automatically added to Hive as part of this, and both Hive and HCatalog will continue to have their own committers. One thing in this we'd like to discuss is the HCatalog committers having commit access to the metastore sections of Hive code. That doesn't mean it has to move into HCatalog's code base. But more and more the fixes and changes we're doing in HCatalog are really in Hive's metastore. So we believe it would make sense to give HCat committers access to that component as well as HCat. Alan. Thanks, -namit On 11/3/12 2:22 AM, Alan Gates ga...@hortonworks.com wrote: Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator. Given the heavy dependence of HCatalog on Hive the HCatalog community agreed it made sense to explore graduating from the Incubator to become a subproject of Hive (see http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201209.mbox/%3C08C40723-8D4D-48EB-942B
Re: [DISCUSS] HCatalog becoming a subproject of Hive
are only based on what hive needs. Which I believe is the wrong way to look at this situation. I thought to reply to this thread because I have been following this Jira: https://issues.apache.org/jira/browse/HIVE-3752 On a high level I do not like this duplication of effort and code. If hive is compatible with hcatalog I do not see why we put off merging the two at all. Hive users would get an immediate benefit if Hive used hcatalog with no apparent downside. Meanwhile we are putting this off and staying in this awkward transition phase. Personally, I do not have a problem being a hive committer and not having hcatalog commit. None of the hive work I have done has ever touched the metastore. Also, of the thousands of jiras and features we have added, only a small portion require metastore changes. As long as a couple active users have commit on hive and the suggested hcatalog subproject I do not think not having commit will be a roadblock in moving hive forward. On Mon, Dec 3, 2012 at 6:22 PM, Alan Gates ga...@hortonworks.com wrote: I am not sure where we are on this discussion. So far those who have chimed in seemed generally positive (Namit, Edward, Clark, Alexander). Namit and I have different visions for what the committership might look like, so I'd like to hear from other Hive PMC members what their view is on this. I have to say from an HCatalog perspective the proposition is much less attractive without some commit rights. 
On a related note, people should be aware of these threads in the Incubator list: http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w%40mail.gmail.com%3E http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ%40mail.gmail.com%3E For those not inclined to read all the mails in the threads I will summarize (though I urge all PMC members of Hive and PPMC members of HCat to read both mail threads because this is highly relevant to what we are discussing). There are two salient points in these threads: 1) It is not wise to build a subproject that is distinct from the main project in the sense that it has separate community members interested in it. Bertrand, Arun, Chris Mattman, and Greg Stein all spoke against this, and all are long time Apache contributors with a lot of experience. They were all of the opinion that it was reasonable for one project to release separate products. 2) It is not wise to have committers that have access to parts of a project but not others. Greg and Bertrand argued (and Arun seemed to imply) that splitting up committer lists by sections of the code did not work out well. These insights cause me to question what we mean by subproject. I had originally envisioned something that looked like Pig and Hive did when they were subprojects of Hadoop. But this violates both 1 and 2 above. Given this input from many of the wise old timers of Apache I think we should consider what we mean when we say subproject and how tightly we are willing to integrate these projects. Personally I think it makes sense to continue to pursue integration, as I think HCat is really a set of interfaces on top of Hive and it makes sense to coalesce those into one project. I guess this would mean HCat becomes just another set of jars that Hive releases when it releases, rather than a stand alone entity. 
But I'm curious to hear what others think. Alan. On Nov 14, 2012, at 10:22 PM, Namit Jain wrote: The same criteria should be applied to all Hive committers. Only a committer should be able to commit code. I don't think we should bend this rule. Metastore is not a separate project, but an integral part of hive. -namit On 11/12/12 10:32 PM, Alan Gates ga...@hortonworks.com wrote: I would suggest looking over the patch history of HCat committers. I think most of them have already contributed a number of patches to the metastore. All are certainly aware of how to run Hive unit tests and have an understanding of how Hive works. So I don't think it's fair to say they would be unsafe with access to the metastore. And the Hive PMC is there to assure this does not happen. If there are issues I am sure they can deal with them. Alan. On Nov 6, 2012, at 8:06 PM, Namit Jain wrote: Alan, that would not be a good idea. Metastore code is part of hive code, and it would be safer if only Hive committers had commit access to that. On 11/6/12 11:25 PM, Alan Gates ga...@hortonworks.com wrote: On Nov 4, 2012, at 8:35 PM, Namit Jain wrote: I like the idea of Hcatalog becoming a Hive sub-project. The enhancements/bugs in the serde/metastore areas can indirectly benefit the hive community, and it will be easier for the fix to be in one place. Having said that, I don't see
Re: [DISCUSS] HCatalog becoming a subproject of Hive
Namit, I was not proposing that promotion to full committership would be automatic. I assume it would still be done via a vote by the PMC. I agree that we cannot _guarantee_ committership for HCat committers in 6-9 months. But I am trying to lay out a clear path they can follow. If they don't follow the path then they won't be committers. I am also trying to make it non-preferential in that I am setting the criteria to be what I believe the Hive PMC would expect any prospective Hive committer to do. The only intended preferential part of the proposal is the Hive shepherds, which we have all agreed is a good idea. Alan. On Dec 19, 2012, at 8:23 PM, Namit Jain wrote: I don't agree with the proposal. It is impractical to have a Hcat committer with commit access to Hcat only portions of Hive. We cannot guarantee that a Hcat committer will become a Hive committer in 6-9 months, that depends on what they do in the next 6-9 months. The current Hcat committers should spend more time in reviewing patches, work on non-Hcat areas in Hive, and then gradually become a hive committer. They should not be given any preferential treatment, and the process should be the same as it would be for any other hive contributor currently. Given the expertise of the Hcat committers, they should be in line for becoming a hive committer if they continue to work in hive, but that cannot be guaranteed. I agree that some Hive committers should try and help the existing Hcat patches, and again that is voluntary and different committers cannot be assigned to different parts of the code. Thanks, -namit On 12/20/12 1:03 AM, Carl Steinbach cwsteinb...@gmail.com wrote: Alan's proposal sounds like a good idea to me. +1 On Dec 18, 2012 5:36 PM, Travis Crawford traviscrawf...@gmail.com wrote: Alan, I think your proposal sounds great. 
--travis On Tue, Dec 18, 2012 at 1:13 PM, Alan Gates ga...@hortonworks.com wrote: Carl, speaking just for myself and not as a representative of the HCat PPMC at this point, I am coming to agree with you that HCat integrating with Hive fully makes more sense. However, this makes the committer question even thornier. Travis and Namit, I think the shepherd proposal needs to lay out a clear and time bounded path to committership for HCat committers. Having HCat committers as second class Hive citizens for the long run will not be healthy. I propose the following as a starting point for discussion: All active HCat committers (those who have contributed or committed a patch in the last 6 months) will be made committers in the HCat portion only of Hive. In addition those committers will be assigned a particular shepherd who is a current Hive committer and who will be responsible for mentoring them towards full Hive committership. As a part of this mentorship the HCat committer will review patches of other contributors, contribute patches to Hive (both inside and outside of HCatalog), respond to user issues on the mailing lists, etc. It is intended that as a result of this mentorship program HCat committers can become full Hive committers in 6-9 months. No new HCat only committers will be elected in Hive after this. All Hive committers will automatically also have commit rights on HCatalog. Alan. On Dec 14, 2012, at 10:05 AM, Carl Steinbach wrote: On a functional level I don't think there is going to be much of a difference between the subproject option proposed by Travis and the other option where HCatalog becomes a TLP. In both cases HCatalog and Hive will have separate committers, separate code repositories, separate release cycles, and separate project roadmaps. 
Aside from ASF bureaucracy, I think the only major difference between the two options is that the subproject route will give the rest of the community the false impression that the two projects have coordinated roadmaps and a process to prevent overlapping functionality from appearing in both projects. Consequently, if these are the only two options then I would prefer that HCatalog become a TLP. On the other hand, I also agree with many of the sentiments that have already been expressed in this thread, namely that the two projects are closely related and that it would benefit the community at large if the two projects could be brought closer together. Up to this point the major source of pain for the HCatalog team has been the frequent necessity of making changes on both the Hive and HCatalog sides when implementing new features in HCatalog. This situation is compounded by the ASF requirement that release artifacts may not depend on snapshot artifacts from other ASF projects. Furthermore, if Hive adds a dependency on HCatalog then it will be subject to these same problems (in addition to the gross circular dependency!). I think the best way to avoid these problems is for HCatalog to become a Hive
Re: [VOTE] Apache Hive 0.10.0 Release Candidate 0
+1 (non-binding) Checked the check sums and key signatures. Installed it and ran a few queries. All looked good. As a note Hive should be offering a src only release and a convenience binary rather than two binaries, one with the source and one without. See the thread on general@incubator discussing this: http://mail-archives.apache.org/mod_mbox/incubator-general/201203.mbox/%3CCAOFYJNY%3DEjVHrWVvAedR3OKwCv-BkTaCbEu0ufp7OZR_gpCTiA%40mail.gmail.com%3E I think this can be solved later and need not block this release. Alan. On Dec 18, 2012, at 10:23 PM, Ashutosh Chauhan wrote: Apache Hive 0.10.0 Release Candidate 0 is available here: http://people.apache.org/~hashutosh/hive-0.10.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-049/org/apache/hive/ Release notes are available at: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12320745&styleName=Text&projectId=12310843&Create=Create&atl_token=A5KQ-2QAV-T4JA-FDED%7C70f39c6dd3cf337eaa0e3a0359687cf608903879%7Clin Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Ashutosh
Re: [DISCUSS] HCatalog becoming a subproject of Hive
sense then why don't we try to get rid of the procedural elements that would only slow down that transition? If there is angst about specific people on Hcat committers list on the Hive committers side (are there any?), then I think that should be addressed on a case by case basis but why enforce a general rule. In the same vein why have a rule saying in 6-9 months a Hcat committer becomes a Hive committer - how is that helpful? If they are changing the Hcat subproject in Hive are they not already Hive committers? And if they gain the expertise to review and commit code in the SemanticAnalyzer in a few months should they not be able to do that before 9 months are over? And if they don't get that expertise in 9 months would they really review and commit anything in the SemanticAnalyzer - I mean there are Hive committers who don't touch that piece of code today, no? Ashish On Wed, Dec 19, 2012 at 8:23 PM, Namit Jain nj...@fb.com wrote: I don't agree with the proposal. It is impractical to have a Hcat committer with commit access to Hcat only portions of Hive. We cannot guarantee that a Hcat committer will become a Hive committer in 6-9 months, that depends on what they do in the next 6-9 months. The current Hcat committers should spend more time in reviewing patches, work on non-Hcat areas in Hive, and then gradually become a hive committer. They should not be given any preferential treatment, and the process should be the same as it would be for any other hive contributor currently. Given the expertise of the Hcat committers, they should be in line for becoming a hive committer if they continue to work in hive, but that cannot be guaranteed. I agree that some Hive committers should try and help the existing Hcat patches, and again that is voluntary and different committers cannot be assigned to different parts of the code. Thanks, -namit On 12/20/12 1:03 AM, Carl Steinbach cwsteinb...@gmail.com wrote: Alan's proposal sounds like a good idea to me. 
+1 On Dec 18, 2012 5:36 PM, Travis Crawford traviscrawf...@gmail.com wrote: Alan, I think your proposal sounds great. --travis On Tue, Dec 18, 2012 at 1:13 PM, Alan Gates ga...@hortonworks.com wrote: Carl, speaking just for myself and not as a representative of the HCat PPMC at this point, I am coming to agree with you that HCat integrating with Hive fully makes more sense. However, this makes the committer question even thornier. Travis and Namit, I think the shepherd proposal needs to lay out a clear and time bounded path to committership for HCat committers. Having HCat committers as second class Hive citizens for the long run will not be healthy. I propose the following as a starting point for discussion: All active HCat committers (those who have contributed or committed a patch in the last 6 months) will be made committers in the HCat portion only of Hive. In addition those committers will be assigned a particular shepherd who is a current Hive committer and who will be responsible for mentoring them towards full Hive committership. As a part of this mentorship the HCat committer will review patches of other contributors, contribute patches to Hive (both inside and outside of HCatalog), respond to user issues on the mailing lists, etc. It is intended that as a result of this mentorship program HCat committers can become full Hive committers in 6-9 months. No new HCat only committers will be elected in Hive after this. All Hive committers will automatically also have commit rights on HCatalog. Alan. On Dec 14, 2012, at 10:05 AM, Carl Steinbach wrote: On a functional level I don't think there is going to be much of a difference between the subproject option proposed by Travis and the other option where HCatalog becomes a TLP. In both cases HCatalog and Hive will have separate committers, separate code repositories, separate release cycles, and separate project roadmaps. 
Aside from ASF bureaucracy, I think the only major difference between the two options is that the subproject route will give the rest of the community the false impression that the two projects have coordinated roadmaps and a process to prevent overlapping functionality from appearing in both projects. Consequently, If these are the only two options then I would prefer that HCatalog become a TLP. On the other hand, I also agree with many of the sentiments that have already been expressed in this thread, namely that the two projects are closely related and that it would benefit the community at large if the two projects
Re: [DISCUSS] HCatalog becoming a subproject of Hive
If you think that's the best path forward that's fine. I don't think I can call a vote, since I'm not part of the Hive PMC. But I'm happy to draft a resolution for you and then let you call the vote. Should I do that? Alan. On Jan 11, 2013, at 4:34 PM, Carl Steinbach wrote: Hi Alan, I agree that submitting this for a vote is the best option. If anyone has additional proposed modifications please make them. Otherwise I propose that the Hive PMC vote on this proposal. In order for the Hive PMC to be able to vote on these changes they need to be expressed in terms of one or more of the actions listed at the end of the Hive project bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws So I think we first need to amend the bylaws in order to define the rights and privileges of a submodule committer, and then separately vote the HCatalog committers in as Hive submodule committers. Does this make sense? Thanks. Carl
Re: [DISCUSS] HCatalog becoming a subproject of Hive
I've created a wiki page for my proposed changes at https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers Text to be removed is struck through. Text to be added is in italics. Any recommended changes before we vote? Alan. On Jan 17, 2013, at 2:08 PM, Carl Steinbach wrote: Sounds like a good plan to me. Since Ashutosh is a member of both the Hive and HCatalog PMCs it probably makes more sense for him to call the vote, but I'm willing to do it too. On Wed, Jan 16, 2013 at 8:24 AM, Alan Gates ga...@hortonworks.com wrote: If you think that's the best path forward that's fine. I don't think I can call a vote, since I'm not part of the Hive PMC. But I'm happy to draft a resolution for you and then let you call the vote. Should I do that? Alan. On Jan 11, 2013, at 4:34 PM, Carl Steinbach wrote: Hi Alan, I agree that submitting this for a vote is the best option. If anyone has additional proposed modifications please make them. Otherwise I propose that the Hive PMC vote on this proposal. In order for the Hive PMC to be able to vote on these changes they need to be expressed in terms of one or more of the actions listed at the end of the Hive project bylaws: https://cwiki.apache.org/confluence/display/Hive/Bylaws So I think we first need to amend the bylaws in order to define the rights and privileges of a submodule committer, and then separately vote the HCatalog committers in as Hive submodule committers. Does this make sense? Thanks. Carl
Re: Review Request 25616: HIVE-7790 Update privileges to check for update and delete
On Sept. 15, 2014, 7:24 a.m., Thejas Nair wrote: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java, line 272 https://reviews.apache.org/r/25616/diff/1/?file=688987#file688987line272 Wouldn't select permissions for using column j in where clause be needed ? In most databases, you get to know the number of rows getting updated. Using that information, with the query in the test, you could find number of columns where j = 3. I haven't verified what SQL spec says about this (privileges needed for including columns in where clause in update statement.) Postgres says it is needed : http://www.postgresql.org/docs/9.2/static/sql-update.html You must have the UPDATE privilege on the table, or at least on the column(s) that are listed to be updated. You must also have the SELECT privilege on any column whose values are read in the expressions or condition. I looked through the SQL spec and couldn't figure it out one way or another. I chose this route because it seemed odd to require SELECT privileges for UPDATE and DELETE. I see what you're saying about being able to tease out information such as how many rows match a where clause. If we want to require SELECT privileges for these operations that's ok. I'll just need to rework a few pieces of the patch and some of the tests. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/#review53316 --- On Sept. 14, 2014, 4:30 a.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/ --- (Updated Sept. 14, 2014, 4:30 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-7790 https://issues.apache.org/jira/browse/HIVE-7790 Repository: hive-git Description --- Adds update and delete as action and adds checks for authorization during update and delete. Also adds passing of updated columns in case authorizer wishes to check them. 
Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java 53d88b0 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 298f429 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b2f66e0 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 3aaa09c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 93df9f4 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 093b4fd ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 3236341 ql/src/test/queries/clientnegative/authorization_delete_nodeletepriv.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_update_noupdatepriv.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete_own_table.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update_own_table.q PRE-CREATION ql/src/test/results/clientnegative/authorization_delete_nodeletepriv.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_update_noupdatepriv.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete_own_table.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update_own_table.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25616/diff/ Testing --- Added tests, both positive and negative, for update and delete, including ability to update and delete tables created by user. Also added tests for passing correct update columns. Thanks, Alan Gates
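The privilege question discussed in the review above can be illustrated with a short sketch. It follows the PostgreSQL rule quoted in the review (UPDATE privilege on the columns being set, plus SELECT on any column read in SET expressions or the WHERE clause). The table name, user name, and column names here are invented for illustration, and the GRANT statements only loosely approximate Hive's SQL standard based authorization syntax.

```sql
-- Hypothetical table acid_tbl(i INT, j INT) and user user1.
GRANT UPDATE ON TABLE acid_tbl TO USER user1;

-- With only UPDATE granted, an unfiltered update reads no columns,
-- so under the Postgres-style rule no SELECT privilege is needed:
UPDATE acid_tbl SET j = NULL;

-- This update reads column j in the WHERE clause; the stricter rule
-- being debated would additionally require SELECT (at least on j):
UPDATE acid_tbl SET i = 5 WHERE j = 3;   -- would fail without SELECT

GRANT SELECT ON TABLE acid_tbl TO USER user1;
UPDATE acid_tbl SET i = 5 WHERE j = 3;   -- now permitted
```

The concern raised in the thread is that without the SELECT requirement, the row count reported by a filtered UPDATE leaks how many rows match the WHERE predicate, effectively revealing information about columns the user cannot read.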
Re: Review Request 25616: HIVE-7790 Update privileges to check for update and delete
On Sept. 16, 2014, 6:42 a.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 741 https://reviews.apache.org/r/25616/diff/2/?file=690379#file690379line741 should we skip it from ReadEntity if none of the columns are being used ? Though, that case is not going to be common. eg a query like 'update table set j=null;' should not require select privileges on the table, as there are no columns in the where clause or value expression. Note that this is a change that we can also make in future without breaking users. (making a change in future to require fewer privileges will not break users). Ie, it does not have to be addressed in this patch. I don't think that makes any sense. If I have delete permissions but not select permissions I can delete all rows from a table but not some rows? That definitely violates the law of least astonishment. On Sept. 16, 2014, 6:42 a.m., Thejas Nair wrote: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java, line 265 https://reviews.apache.org/r/25616/diff/2/?file=690378#file690378line265 can you also use a column in the i = expression to make sure that also gets included in the input list. eg (add another column l to table definition) update ... set i = 5 + l where j = 3; Done. - Alan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/#review53485 --- On Sept. 16, 2014, 3:35 a.m., Alan Gates wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/ --- (Updated Sept. 16, 2014, 3:35 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-7790 https://issues.apache.org/jira/browse/HIVE-7790 Repository: hive-git Description --- Adds update and delete as action and adds checks for authorization during update and delete. Also adds passing of updated columns in case authorizer wishes to check them. 
Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java 53d88b0 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 298f429 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b2f66e0 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessInfo.java a4df8b4 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 3aaa09c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 93df9f4 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 093b4fd ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 3236341 ql/src/test/queries/clientnegative/authorization_delete_nodeletepriv.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_update_noupdatepriv.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete_own_table.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update_own_table.q PRE-CREATION ql/src/test/results/clientnegative/authorization_delete_nodeletepriv.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_update_noupdatepriv.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete_own_table.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update_own_table.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25616/diff/ Testing --- Added tests, both positive and negative, for update and delete, including ability to update and delete tables created by user. Also added tests for passing correct update columns. Thanks, Alan Gates
Re: Review Request 25616: HIVE-7790 Update privileges to check for update and delete
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25616/ --- (Updated Sept. 16, 2014, 7:37 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-7790 https://issues.apache.org/jira/browse/HIVE-7790 Repository: hive-git Description --- Adds update and delete as action and adds checks for authorization during update and delete. Also adds passing of updated columns in case authorizer wishes to check them. Diffs (updated) - itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java 53d88b0 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 298f429 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b2f66e0 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessInfo.java a4df8b4 ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java 3aaa09c ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationUtils.java 93df9f4 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrivilegeObject.java 093b4fd ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 3236341 ql/src/test/queries/clientnegative/authorization_delete_nodeletepriv.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_update_noupdatepriv.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_delete_own_table.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_update_own_table.q PRE-CREATION ql/src/test/results/clientnegative/authorization_delete_nodeletepriv.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_update_noupdatepriv.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_delete_own_table.q.out 
PRE-CREATION ql/src/test/results/clientpositive/authorization_update.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_update_own_table.q.out PRE-CREATION Diff: https://reviews.apache.org/r/25616/diff/ Testing --- Added tests, both positive and negative, for update and delete, including ability to update and delete tables created by user. Also added tests for passing correct update columns. Thanks, Alan Gates
Re: Timeline for release of Hive 0.14
Are you wanting to track all JIRAs here, or only feature JIRAs? Once you branch, are you open to porting bug fix patches to the branch, or are you looking to lock it down and release it very quickly?

Regarding HIVE-7689, which is on the list, I have some concerns about that JIRA. See the dialogue on the JIRA.

Alan.

Vikram Dixit mailto:vik...@hortonworks.com September 23, 2014 at 12:16
Hi Folks,
I have created a wiki page for tracking the 0.14 release: https://cwiki.apache.org/confluence/display/Hive/Hive+0.14+release+status
Please take a look and let me know if I have to add any more jiras to the list. Given that the CBO branch is close to getting merged and the progress being made there, I will branch in a day or so, once that commit goes in.
Thanks
Vikram.

Navis류승우 mailto:navis@nexr.com September 12, 2014 at 0:15
Hi,
I'd really appreciate it if HIVE-5690 can be included, which is becoming harder and harder to rebase. The other 79 patches assigned to me can be held.
Thanks,
Navis

Vaibhav Gumashta mailto:vgumas...@hortonworks.com September 11, 2014 at 3:54
Hi Vikram,
Can we also add:
https://issues.apache.org/jira/browse/HIVE-6799
https://issues.apache.org/jira/browse/HIVE-7935
to the list.
Thanks,
--Vaibhav

Satish Mittal mailto:satish.mit...@inmobi.com September 10, 2014 at 0:18
Hi,
Can you please include HIVE-7892 (Thrift Set type not working with Hive) as well? It is under code review.
Regards,
Satish

Suma Shivaprasad mailto:sumasai.shivapra...@gmail.com September 9, 2014 at 1:40
Please include https://issues.apache.org/jira/browse/HIVE-7694 as well. It is currently under review by Amareshwari and should be done in the next couple of days.
Thanks
Suma

-- Sent with Postbox http://www.getpostbox.com

-- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Timeline for release of Hive 0.14
I'd like to add two to the list: HIVE-8203 and HIVE-8239. The fix version has been set to 0.14 on both.

Alan.

Vikram Dixit mailto:vik...@hortonworks.com September 23, 2014 at 13:39
Hi Folks,
I have added all the bugs to the list that had the affects version set to 0.14.0, similar to the status page for 0.13.0. This will help track and nail down the fixes we need to get the release going. I request all devs to go through their jiras and let me know if any of them are critical or blockers that need to be included in 0.14.0. Please mark the fix version as 0.14.0 for those.
Thanks
Vikram.

Jason Dere mailto:jd...@hortonworks.com September 23, 2014 at 12:53
Would like to see HIVE-8102 and HIVE-7971 in if possible.
Thanks,
Jason

Lars Francke mailto:lars.fran...@gmail.com September 23, 2014 at 12:33
Hi Vikram,
I'd like to add HIVE-7107[1] to the list. Ashutosh voted against including it, but that was based on the premise that HiveServer1 would be removed, which doesn't seem to be happening in this release. So I'd like to get this in, as it's a very annoying issue to debug in production. It is in need of a review. Any volunteers? I'll take a look at the patch and will rebase if needed.
Cheers,
Lars
[1] https://issues.apache.org/jira/browse/HIVE-7107
Re: Patches to release branches
So, combining Mithun's proposal with the input from Sergey and Gopal, I propose:

1) When a contributor provides a patch for a high priority bug (data corruption, wrong results, crashes), he or she should also provide a patch against the branch of the latest feature release. For example, once Hive 0.14 is released, that will mean providing patches for trunk and the 0.14 branch. I believe the test infrastructure already supports running the tests against alternate branches (is that correct, Brock?), so the patches can be tested against both trunk and the release branch.

2) The release manager of the feature release (e.g. Hive 0.14) will be responsible for maintaining the branch with these patch fixes. It is his or her call whether a given bug merits inclusion on the branch. If a contributor provides a patch for trunk which, in the release manager's opinion, should also be on the branch, the release manager can ask the contributor to also provide a patch for the branch. Since whoever manages the feature release may not want to, or be able to, continue managing the branch post release, these release manager duties are transferable. But the transfer should be clear and announced on the dev list.

3) In order to make these patch fixes available to Hive users, we should strive for frequent maintenance releases. The frequency will depend on the number of bug fixes going into the branch, but 6-8 weeks seems like a good goal.

Hive 0.14 could be the test run of this process, to see what works and what doesn't. Seem reasonable?

Alan.

Mithun Radhakrishnan mailto:mithun.radhakrish...@yahoo.com.INVALID September 15, 2014 at 11:16
Hey, Gopal. Thank you, that makes sense. I'll concede that delaying the initial commit till a patch is available for the recent-most release-branch won't always be viable. While I'd expect it to be easier to patch the release-branch early than late, if we (the community) would prefer a cloned JIRA in a separate queue, of course I'll go along.
Anything to make the release-branch usable out of the box, without further patching. Forgive my ignorance of the relevant protocol... Would this be a change in the release/patch process? Does this need codifying? I'm not sure if this needs voting on, or even who might call a vote on this.
Mithun

Gopal V mailto:gop...@apache.org September 11, 2014 at 15:15
This is a very sensible proposal. As a start, I think we need to have people open backport JIRAs for such issues - even if a direct merge might be hard to do with the same patch. Immediately cherry-picking the same patch should be done if it applies with very little modification - but reworking the patch for an older release is a significant overhead for the initial commit. At the very least, we need to get past the unknowns that currently surround the last point release against the bugs already fixed in trunk. Once we have a backport queue, I'm sure the RMs in charge of the branch can moderate the community on the complexity and risk factors involved.
Cheers,
Gopal
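As a rough illustration of the backport flow in points 1) and 2) above: the thread's trunk/branch terminology suggests Subversion, but the same flow expressed in git terms can be sketched in a throwaway repository (all branch, file, and JIRA names below are hypothetical):

```shell
# Sketch of the trunk-plus-release-branch patch flow, in a temp git repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email dev@example.com
git config user.name dev
echo base > file.txt
git add file.txt && git commit -qm "initial state on trunk"
git branch branch-0.14                 # release branch cut at this point
echo fix >> file.txt
git commit -qam "HIVE-XXXX: high priority fix (lands on trunk first)"
fix_rev=$(git rev-parse HEAD)
git checkout -q branch-0.14
git cherry-pick "$fix_rev"             # same fix applied to the release branch
grep fix file.txt                      # fix is now present on branch-0.14
```

In practice (point 2), the cherry-pick is done only when the release manager judges the bug to merit inclusion; if the patch does not apply cleanly, the contributor is asked for a branch-specific patch instead.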
Re: Review Request 25682: HIVE-6586 - Add new parameters to HiveConf.java after commit HIVE-6037
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25682/#review55256
---

trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/25682/#comment95645
Rather than saying "to turn on Hive transactions", this should read "as part of turning on Hive transactions". This change alone won't turn on transactions.

trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/25682/#comment95647
Same comment as above about turning on transactions.

- Alan Gates

On Oct. 1, 2014, 7:27 a.m., Lefty Leverenz wrote:
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25682/
---
(Updated Oct. 1, 2014, 7:27 a.m.)

Review request for hive, Carl Steinbach, Alan Gates, Navis Ryu, Prasad Mujumdar, and Sergey Shelukhin.

Bugs: HIVE-6586 https://issues.apache.org/jira/browse/HIVE-6586

Repository: hive

Description
---
HIVE-6586 kept track of new configuration parameters and changes to parameter descriptions when HIVE-6037 moved parameter descriptions into HiveConf.java from hive-default.xml.template. HIVE-6586.patch addresses all the fixes listed in the JIRA comments (except ones that had already been fixed), tidies up some line breaks, and makes minor edits to parameter descriptions. It also revises the descriptions of hive.txn.xxx, hive.compactor.xxx, hive.server2.async.exec.shutdown.timeout, and hive.security.authorization.createtable.owner.grants.

Diffs
---
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1628586

Diff: https://reviews.apache.org/r/25682/diff/

Testing
---
Generated hive-default.xml.template (attached to HIVE-6586) from the new HiveConf.java and reviewed the changed parameter descriptions.

File Attachments
---
Patch 2, rebased and fixed some issues: https://reviews.apache.org/media/uploaded/files/2014/10/01/8e4b539e-2590-4d8e-b3b5-45175a051f9d__HIVE-6586.2.patch

Thanks,
Lefty Leverenz
Re: Review Request 25682: HIVE-6586 - Add new parameters to HiveConf.java after commit HIVE-6037
On Oct. 2, 2014, 10:08 p.m., Alan Gates wrote:
trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1322
https://reviews.apache.org/r/25682/diff/2/?file=710033#file710033line1322
Rather than saying "to turn on Hive transactions", this should read "as part of turning on Hive transactions". This change alone won't turn on transactions.

Lefty Leverenz wrote:
Agreed. How about hive.txn.manager? (To turn on Hive transactions, set to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.) Should all 3 parameters list the others that are required? Oh ... make that 4 parameters, including hive.support.concurrency. For example: "Set this to true on one instance of the Thrift metastore service as part of turning on Hive transactions. The parameters hive.txn.manager, hive.compactor.worker.threads, and hive.support.concurrency must also be set appropriately to turn on Hive transactions." I don't know how verbose you want to get in HiveConf.java.

It might make sense to detail all the required parameters under hive.txn.manager and then have pointers to it from the compactor ones. I don't think you need pointers from hive.support.concurrency, since that has other uses beyond just turning on transactions.

- Alan

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25682/#review55256
---
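For reference, the group of settings under discussion could be sketched as a hive-site.xml fragment. This is only an illustration assembled from the thread, not an authoritative configuration: the initiator parameter name (hive.compactor.initiator.on) is inferred from the "set this to true on one instance of the Thrift metastore service" description, and values should be checked against the released hive-default.xml.template:

```xml
<!-- Illustrative sketch of turning on Hive transactions; values and the
     initiator parameter name are assumptions based on the thread above. -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <!-- assumed name; "true on one metastore instance only" per the thread -->
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

This mirrors Lefty's point: no single parameter turns transactions on, so each description should say "as part of turning on Hive transactions" and point to the full set.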
Re: [VOTE] officially stop supporting hadoop 0.20.x in hive 0.14 ?
+1.

Alan.

Thejas Nair mailto:the...@hortonworks.com October 7, 2014 at 15:53
I think it is time to revisit hive's support for hadoop 0.20. Trying to maintain support for it puts an additional burden on hive contributors.

The last hadoop 0.20.x version was released in Feb 2010. Hadoop 1.0 was released in Dec 2011. I believe most users have moved on to hadoop 2.x, or at least hadoop 1.x. Any users still on hadoop 0.20 probably don't tend to upgrade their hive versions either.

With the move to maven for builds in hive 0.13, we don't have the ability to compile against hadoop 0.20. (Nobody has complained about that, AFAIK.) I am not sure if hive 0.13 works well against hadoop 0.20, as it is not clear that combination is in use. Also, most commercial vendors seem to be focusing on testing against hadoop 2.x.

I think it is time to do away with the added burden of attempting to support hadoop 0.20.x versions. Here is my +1 for officially stopping support for hadoop 0.20.x in hive 0.14.

Thanks,
Thejas
Re: Build appears to be broken by http://www.datanucleus.org/
It appears that the jars we need are in maven central. I tried removing datanucleus completely from my maven cache, then commenting out the datanucleus repository in pom.xml, and the jars were properly fetched from maven central. Should I just put up a patch for this so we can get building again?

Alan.

Brock Noland mailto:br...@cloudera.com October 8, 2014 at 10:28
http://www.datanucleus.org/ is not accessible:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project hive-exec: Error resolving project artifact: Could not transfer artifact net.hydromatic:linq4j:pom:0.4 from/to datanucleus (http://www.datanucleus.org/downloads/maven2): Access denied to: http://www.datanucleus.org/downloads/maven2/net/hydromatic/linq4j/0.4/linq4j-0.4.pom, ReasonPhrase: Forbidden. for project net.hydromatic:linq4j:jar:0.4 - [Help 1]
Re: Build appears to be broken by http://www.datanucleus.org/
Yeah, I found what it was: javax.jms. I'm trying to see if there's another repo I can pick it up from.

Alan.

Brock Noland mailto:br...@cloudera.com October 8, 2014 at 11:25
That works for me. IIRC, when I did the maven build I learned we used the DN repo for something not DN related. Can you try the build with -Dmaven.repo.local=/tmp/maven and ensure it builds without any cache? If so, yes, let's get rid of that repo. Also, I found it works now with the -o flag.
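A minimal sketch of the pom.xml change discussed in this thread: the repository id and URL are taken from the error message above, but the exact shape of Hive's <repositories> section (and any other repositories it declares) is assumed. With the entry commented out, Maven falls back to Central for these artifacts; -Dmaven.repo.local=/tmp/maven can be used to verify the build resolves everything from a clean cache.

```xml
<!-- Sketch only: comment out the datanucleus repository so artifacts
     (including the non-DN ones like linq4j) resolve from Maven Central. -->
<repositories>
  <!--
  <repository>
    <id>datanucleus</id>
    <url>http://www.datanucleus.org/downloads/maven2</url>
  </repository>
  -->
</repositories>
```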