Re: Hive 0.7.0 Release Candidate 0

2011-02-18 Thread Ashutosh Chauhan
Wondering if https://issues.apache.org/jira/browse/HIVE-1995 should
also be considered for 0.7 ?

Ashutosh

On Thu, Feb 17, 2011 at 23:57, Carl Steinbach c...@cloudera.com wrote:
 http://people.apache.org/~cws/hive-0.7.0-candidate-0/

 Please vote.



Re: Hive 0.7.0 Release Candidate 0

2011-02-18 Thread Ashutosh Chauhan
Great. Thanks, Carl.

Ashutosh
On Fri, Feb 18, 2011 at 14:20, Carl Steinbach c...@cloudera.com wrote:
 Hi Ashutosh,

 I backported it just now. I'll cut another RC early next week to include
 this.

 Thanks.

 Carl

 On Fri, Feb 18, 2011 at 1:37 PM, Ashutosh Chauhan hashut...@apache.org wrote:

 Wondering if https://issues.apache.org/jira/browse/HIVE-1995 should
 also be considered for 0.7 ?

 Ashutosh

 On Thu, Feb 17, 2011 at 23:57, Carl Steinbach c...@cloudera.com wrote:
  http://people.apache.org/~cws/hive-0.7.0-candidate-0/
 
  Please vote.
 




hooks in metastore functions

2011-03-08 Thread Ashutosh Chauhan
Hi all,

I have a requirement that every time a change takes place in the
metastore, some logic needs to run. For example, if a new table is
created in the metastore, I want to send a message to a message bus.
The easiest way to do this is to add the logic in createTable(),
control it with a hiveConf param, and turn it off by default. The
alternative is to use hooks: put this extra logic in a hook, then load
and fire the hook if it's available. Does anyone have an opinion on
which of these two is preferable? The second one requires new hook
loading and execution logic. I am currently interested in four
functions: createTable(), dropTable(), addPartition(), and
dropPartition(). The current HiveMetaHook, which exists in
createTable(), doesn't quite fit the bill, since it is fired only when
the user asks for it in the create table statement (i.e., if a storage
handler is specified). Instead, I want this logic to run always.
If this is unclear, let me know and I can post code that demonstrates
my use case.

Ashutosh
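
For readers of the archive, the hook alternative described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the config key, the TableChangeHook interface, and the RecordingHook class are hypothetical names invented for the sketch and are not existing Hive APIs, and a plain Map stands in for HiveConf.

```java
import java.util.Map;

final class MetastoreHookSketch {
  /** Hypothetical config key; not an existing Hive property. */
  static final String HOOK_KEY = "example.metastore.hook.class";

  /** Hypothetical callback type; not an existing Hive interface. */
  interface TableChangeHook {
    void onCreateTable(String tableName);
  }

  /** Stand-in for "send a message to a message bus": it only remembers the table. */
  static final class RecordingHook implements TableChangeHook {
    String lastTable;

    @Override
    public void onCreateTable(String tableName) {
      lastTable = tableName;
    }
  }

  /** Returns null when the key is unset, so the feature is off by default. */
  static TableChangeHook loadHook(Map<String, String> conf) {
    String className = conf.get(HOOK_KEY);
    if (className == null || className.trim().isEmpty()) {
      return null;
    }
    try {
      // Instantiate the configured class through its no-argument constructor.
      return (TableChangeHook)
          Class.forName(className.trim()).getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot load hook " + className, e);
    }
  }

  /** Simplified createTable(): persist first (elided here), then fire the hook. */
  static void createTable(String tableName, TableChangeHook hook) {
    // ... persist the table in the metastore here ...
    if (hook != null) {
      hook.onCreateTable(tableName);
    }
  }
}
```

With the key unset, loadHook() returns null and createTable() behaves as before; setting the key to a class name makes the extra logic run on every call, independent of any storage handler.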


Re: hooks in metastore functions

2011-03-09 Thread Ashutosh Chauhan
It might be possible to extend and modify the HiveMetaHook interface.
But, I think keeping them separate is better because MetaHook and
MetaStoreListener are interfaces for two different functionalities.
MetaHook is for communicating with an external system if there is a
need for it. MetaStoreListener observes changes on the metastore and
runs some logic in response to those changes. What do you think?

Ashutosh

On Wed, Mar 9, 2011 at 13:36, John Sichi jsi...@fb.com wrote:
 Couldn't we reuse HiveMetaHook for this new purpose (with an instance loaded 
 via global config vs associated with the table handler)?

 JVS

 On Mar 8, 2011, at 2:12 PM, Ashutosh Chauhan wrote:

 Hi all,

 I have a requirement that every time a change takes place in the
 metastore, some logic needs to run. For example, if a new table is
 created in the metastore, I want to send a message to a message bus.
 The easiest way to do this is to add the logic in createTable(),
 control it with a hiveConf param, and turn it off by default. The
 alternative is to use hooks: put this extra logic in a hook, then load
 and fire the hook if it's available. Does anyone have an opinion on
 which of these two is preferable? The second one requires new hook
 loading and execution logic. I am currently interested in four
 functions: createTable(), dropTable(), addPartition(), and
 dropPartition(). The current HiveMetaHook, which exists in
 createTable(), doesn't quite fit the bill, since it is fired only when
 the user asks for it in the create table statement (i.e., if a storage
 handler is specified). Instead, I want this logic to run always.
 If this is unclear, let me know and I can post code that demonstrates
 my use case.

 Ashutosh




Re: Review Request: Review request for HIVE-2038

2011-04-12 Thread Ashutosh Chauhan


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java, 
  line 999
  https://reviews.apache.org/r/581/diff/1/?file=15625#file15625line999
 
  Unrelated bugfix?

Related bugfix, I would say :) Without it, when drop partition returns from 
the object store, the partition object doesn't contain the partition values.


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 180
  https://reviews.apache.org/r/581/diff/1/?file=15621#file15621line180
 
  Please add this property to hive-default.xml along with a description 
  of what it does.
 

Will add it in hive-default.xml.


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
   line 955
  https://reviews.apache.org/r/581/diff/1/?file=15622#file15622line955
 
  Please run checkstyle and correct any violations included in your patch.

Will run checkstyle to check for any style violations.


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreListener.java,
   line 27
  https://reviews.apache.org/r/581/diff/1/?file=15623#file15623line27
 
  Please add some javadoc explaining the intended use of this interface. 
  
  * Are the methods called before or after an action completes? What 
  happens if a metastore operation fails?
  
  * Are the methods allowed to block? Are they run in a separate thread?
  
  * Are the methods allowed to modify the catalog objects that are passed 
  in as parameters?

Will add this to the javadoc as well.

 * Methods are called after the action completes, and only if it succeeds. 
They are not called if the operation fails, since in that case nothing has 
actually changed in the metastore.

 * This is up to the implementation. Listeners can run in the same thread, or 
they can schedule their work in a separate thread and return immediately.

 * I don't see a reason to disallow modification of the passed-in parameter 
objects. But it's mostly irrelevant here, since the methods are called after 
the change has already been persisted in the metastore. So, modifying these 
objects can't change any state in the metastore.
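
The calling convention described in these answers (notify only after the operation has succeeded) can be sketched as below. This is a minimal illustration, not the code from the patch: FakeStore, Listener, and the String partition name are simplified stand-ins assumed for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class PostCommitNotifySketch {
  /** Simplified stand-in for a metastore listener. */
  interface Listener {
    void onDropPartition(String partitionName);
  }

  /** Simplified stand-in for the object store; it holds partition names only. */
  static final class FakeStore {
    final List<String> partitions = new ArrayList<>();

    /** Returns true only if the partition existed and was removed. */
    boolean dropPartition(String name) {
      return partitions.remove(name);
    }
  }

  /** Drops the partition, then notifies listeners only if the drop succeeded. */
  static boolean dropPartition(FakeStore store, List<Listener> listeners, String name) {
    boolean success = store.dropPartition(name);
    if (success) {
      // The change is already persisted, so listeners observe it but cannot veto it.
      for (Listener listener : listeners) {
        listener.onDropPartition(name);
      }
    }
    return success;
  }
}
```

In this sketch the listeners run in the caller's thread; a listener that must not block could hand its work to its own executor and return immediately, as the second answer suggests.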


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreListener.java,
   line 29
  https://reviews.apache.org/r/581/diff/1/?file=15623#file15623line29
 
  Instead of passing in raw Table/Partition/Database objects it may be 
  better to instead wrap these objects in containers, e.g. CreateTableEvent, 
  DropTableEvent, etc. Eventually this interface will probably include 
  onAlterTable() and onAlterPartition(), and programmers will probably want 
  to access both the before and after versions of a Table/Partition, etc.

What's the advantage of wrapper container objects?
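
One way to read the reviewer's suggestion is sketched below: an event object can grow new fields (for example the before and after versions of a table) without changing the listener method signatures. The class bodies are assumptions made for illustration, and tables are reduced to plain names; they are not the classes from the patch.

```java
final class EventWrapperSketch {
  /** Common base for data that every event carries. */
  static class ListenerEvent {
    private final boolean status;

    ListenerEvent(boolean status) {
      this.status = status;
    }

    /** True if the metastore operation succeeded. */
    boolean getStatus() {
      return status;
    }
  }

  /** Carries both versions of the table, which a single raw Table argument could not. */
  static final class AlterTableEvent extends ListenerEvent {
    private final String oldTable;
    private final String newTable;

    AlterTableEvent(String oldTable, String newTable, boolean status) {
      super(status);
      this.oldTable = oldTable;
      this.newTable = newTable;
    }

    String getOldTable() {
      return oldTable;
    }

    String getNewTable() {
      return newTable;
    }
  }

  /** The listener signature stays stable even if events gain fields later. */
  abstract static class Listener {
    void onAlterTable(AlterTableEvent event) {
      // No-op by default.
    }
  }

  /** Example listener that uses both the old and the new table name. */
  static final class RenameLogger extends Listener {
    String lastMessage;

    @Override
    void onAlterTable(AlterTableEvent event) {
      lastMessage = event.getOldTable() + " -> " + event.getNewTable();
    }
  }
}
```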


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
   line 1446
  https://reviews.apache.org/r/581/diff/1/?file=15622#file15622line1446
 
  No need to reference this, right?

Right. Though, I think using this in such cases improves code readability. 


 On 2011-04-12 03:07:54, Carl Steinbach wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreListener.java,
   line 26
  https://reviews.apache.org/r/581/diff/1/?file=15623#file15623line26
 
  What do you think about changing the name to MetaStoreEventListener or 
  CatalogEventListener?

MetaStoreEventListener is fine too.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/581/#review428
---


On 2011-04-12 01:29:41, Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/581/
 ---
 
 (Updated 2011-04-12 01:29:41)
 
 
 Review request for hive, Carl Steinbach, John Sichi, and Paul Yang.
 
 
 Summary
 ---
 
 Review request for HIVE-2038
 
 
 This addresses bug HIVE-2038.
 https://issues.apache.org/jira/browse/HIVE-2038
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1079575 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1079575 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreListener.java
  PRE-CREATION 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/NoOpListener.java 
 PRE-CREATION 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1079575 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
 PRE-CREATION 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore

Review Request: Removed finalizePartition() from the patch

2011-04-23 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/648/
---

Review request for hive and Carl Steinbach.


Summary
---

1. Removed finalizePartition(). Will file separate jira for it.
2. Added container objects for different event types.
3. Changed MetaStoreEventListener from interface to abstract class.
4. Modifications to allow a list of listeners instead of just one.
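
Items 3 and 4 above can be illustrated with a rough sketch, assuming the listeners are configured as a comma-separated list of class names. The EventListener base class, the two example listeners, and the loadListeners() helper are hypothetical names for this illustration, not the code in the diff.

```java
import java.util.ArrayList;
import java.util.List;

final class ListenerListSketch {
  /** Abstract class with no-op defaults, so subclasses override only what they need. */
  abstract static class EventListener {
    void onCreateTable(String table) {
      // No-op by default.
    }

    void onDropTable(String table) {
      // No-op by default.
    }
  }

  /** Example listener that only cares about table creation. */
  static final class AuditListener extends EventListener {
    int created;

    @Override
    void onCreateTable(String table) {
      created++;
    }
  }

  /** Example listener that only cares about table drops. */
  static final class CleanupListener extends EventListener {
    int dropped;

    @Override
    void onDropTable(String table) {
      dropped++;
    }
  }

  /** Instantiates every class named in a comma-separated config value. */
  static List<EventListener> loadListeners(String classNames) {
    List<EventListener> listeners = new ArrayList<>();
    if (classNames == null) {
      return listeners;
    }
    for (String name : classNames.split(",")) {
      String trimmed = name.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      try {
        listeners.add((EventListener)
            Class.forName(trimmed).getDeclaredConstructor().newInstance());
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Cannot load listener " + trimmed, e);
      }
    }
    return listeners;
  }
}
```

Using an abstract class rather than an interface means a new callback can be added later with an empty default body without breaking existing listener implementations.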


This addresses bug HIVE-2038.
https://issues.apache.org/jira/browse/HIVE-2038


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096112 
  trunk/conf/hive-default.xml 1096112 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1096112 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
 PRE-CREATION 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1096112 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1096112 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/AddPartitionEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/CreateDatabaseEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/CreateTableEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/DropDatabaseEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/DropPartitionEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/DropTableEvent.java
 PRE-CREATION 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/ListenerEvent.java
 PRE-CREATION 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
PRE-CREATION 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/648/diff


Testing
---


Thanks,

Ashutosh



Review Request: Get rid of System.exit

2011-04-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/668/
---

Review request for hive, Carl Steinbach, John Sichi, and Paul Yang.


Summary
---

See HIVE-2034 for details.


Diffs
-

  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1096871 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1096871 

Diff: https://reviews.apache.org/r/668/diff


Testing
---

Since this patch doesn't add/delete any functionality, no new tests are 
required. Passing of existing test cases will suffice.


Thanks,

Ashutosh



Review Request: Refactor HiveMetaStore to make it maintainable

2011-04-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/669/
---

Review request for hive, Carl Steinbach, John Sichi, and Paul Yang.


Summary
---

See HIVE-2135


This addresses bug HIVE-2035.
https://issues.apache.org/jira/browse/HIVE-2035


Diffs
-

  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1096976 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreCommand.java 
PRE-CREATION 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1096976 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/URLConnectionUpdater.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/669/diff


Testing
---

Since this is a refactoring patch, no new tests are required. Ran all the tests 
in metastore. All of them passed.


Thanks,

Ashutosh



Re: Review Request: Refactor HiveMetaStore to make it maintainable

2011-04-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/669/
---

(Updated 2011-04-27 18:24:42.458082)


Review request for hive, Carl Steinbach, John Sichi, and Paul Yang.


Changes
---

Mistyped jira number.


Summary
---

See HIVE-2135


This addresses bug HIVE-2135.
https://issues.apache.org/jira/browse/HIVE-2135


Diffs
-

  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1096976 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreCommand.java 
PRE-CREATION 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1096976 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/URLConnectionUpdater.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/669/diff


Testing
---

Since this is a refactoring patch, no new tests are required. Ran all the tests 
in metastore. All of them passed.


Thanks,

Ashutosh



Re: ANNOUNCE: New PMC Member Carl Steinbach

2011-04-28 Thread Ashutosh Chauhan
Congrats, Carl !

On Thu, Apr 28, 2011 at 05:39, Ashish Thusoo athu...@fb.com wrote:
 Congratulations Carl..

 Ashish
 On Apr 27, 2011, at 7:09 PM, John Sichi wrote:

 Hi all,

 The Hive Project Management Committee is happy to announce that Carl 
 Steinbach has been voted in as a new PMC member.  Carl is currently a very 
 active committer and has successfully managed two Hive releases (0.6 and 
 0.7).  His work on running Hive contributor meetups has helped foster an 
 ever-growing development community.

 Congratulations, Carl!

 JVS





Review Request: HIVE-2147 : Add api to send / receive message to metastore

2011-05-12 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/
---

Review request for hive and Carl Steinbach.


Summary
---

Updated patch to include missing ASF license and generated thrift code.


This addresses bug HIVE-2147.
https://issues.apache.org/jira/browse/HIVE-2147


Diffs
-

  trunk/metastore/if/hive_metastore.thrift 1102450 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
  
trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
1102450 
  
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1102450 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 
1102450 
  
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 
1102450 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
1102450 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1102450 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1102450 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1102450 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
 1102450 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java
 PRE-CREATION 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
1102450 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 1102450 

Diff: https://reviews.apache.org/r/738/diff


Testing
---

Updated TestMetaStoreEventListener to test new api.


Thanks,

Ashutosh



Re: Review Request: HIVE-2160 : Few code improvements in the metastore, hwi and ql packages.

2011-05-13 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/742/#review667
---

Ship it!


Thanks Chinna for the cleanup work. Looks good to me. 

- Ashutosh


On 2011-05-13 11:07:56, chinna wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/742/
 ---
 
 (Updated 2011-05-13 11:07:56)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 Few code improvements in the metastore, hwi and ql packages.
 1) Minor performance improvements
 2) Effective variable management.
 
 
 This addresses bug HIVE-2160.
 https://issues.apache.org/jira/browse/HIVE-2160
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/hwi/src/java/org/apache/hadoop/hive/hwi/HWISessionItem.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRUnion1.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcessor.java
  1101752 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java
  1101752 
 
 Diff: https://reviews.apache.org/r/742/diff
 
 
 Testing
 ---
 
 Ran all tests
 
 
 Thanks,
 
 chinna
 




Re: Review Request: HIVE-2147 : Add api to send / receive message to metastore

2011-05-25 Thread Ashutosh Chauhan
() call.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/738/#review713
---


On 2011-05-12 21:03:29, Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/738/
 ---
 
 (Updated 2011-05-12 21:03:29)
 
 
 Review request for hive and Carl Steinbach.
 
 
 Summary
 ---
 
 Updated patch to include missing ASF license and generated thrift code.
 
 
 This addresses bug HIVE-2147.
 https://issues.apache.org/jira/browse/HIVE-2147
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1102450 
   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1102450 
   trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1102450 
   
 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
  1102450 
   
 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
  1102450 
   
 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 
 1102450 
   
 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
  1102450 
   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
 1102450 
   trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1102450 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1102450 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
  1102450 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
  1102450 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
  1102450 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MessageEvent.java
  PRE-CREATION 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
 1102450 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
  1102450 
 
 Diff: https://reviews.apache.org/r/738/diff
 
 
 Testing
 ---
 
 Updated TestMetaStoreEventListener to test new api.
 
 
 Thanks,
 
 Ashutosh
 




Re: Review Request: HIVE-2188: Add a function to retrieve multiple tables on trip to the hive metastore

2011-06-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/831/#review753
---



trunk/metastore/if/hive_metastore.thrift
https://reviews.apache.org/r/831/#comment1571

How about calling it get_multi_table instead? multi_get_table sounds a little 
confusing to me.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/831/#comment1572

You can write this more concisely using commons-lang utility method as: 
StringUtils.join(tbls,',');
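
To show what this suggestion replaces, here is a small self-contained sketch. The code under review is not reproduced in this thread, so the hand-written loop is an assumption about its general shape. To keep the example free of external dependencies it uses the JDK's String.join (Java 8 and later) as a stand-in for the commons-lang call named above.

```java
import java.util.List;

final class JoinSketch {
  /** Hand-rolled comma join, roughly the kind of loop the comment refers to. */
  static String joinByHand(List<String> tbls) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < tbls.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append(tbls.get(i));
    }
    return sb.toString();
  }

  /** One-line equivalent using the JDK helper in place of StringUtils.join. */
  static String joinWithLibrary(List<String> tbls) {
    return String.join(",", tbls);
  }
}
```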



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/831/#comment1576

You can get rid of the tables.get(i) == null check; it will never be true.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/831/#comment1573

Instead of throwing RuntimeException, create MetaException and throw that.
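
The point of this comment is that a checked, metastore-specific exception lets callers handle the failure explicitly. A minimal sketch follows, assuming a nested stand-in class named MetaException (the real one is generated from the Thrift definition) and a hypothetical lookup method, since the reviewed code is not shown in this thread.

```java
import java.util.Map;

final class MetaExceptionSketch {
  /** Local stand-in for the metastore's checked exception type. */
  static final class MetaException extends Exception {
    MetaException(String message) {
      super(message);
    }
  }

  /** Hypothetical lookup that reports failure with MetaException, not RuntimeException. */
  static String getTableLocation(Map<String, String> locations, String tableName)
      throws MetaException {
    String location = locations.get(tableName);
    if (location == null) {
      // A checked exception forces the caller to decide how to handle the failure.
      throw new MetaException("Table " + tableName + " not found");
    }
    return location;
  }
}
```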



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java
https://reviews.apache.org/r/831/#comment1574

Please add javadocs for the new methods introduced in the interface. Also see 
my first comment about the name.



trunk/service/src/test/org/apache/hadoop/hive/service/TestHiveServer.java
https://reviews.apache.org/r/831/#comment1575

This test really belongs in TestMetastore, or some such class in the metastore 
dir, not in HiveServer.


- Ashutosh


On 2011-06-02 23:01:00, Sohan Jain wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/831/
 ---
 
 (Updated 2011-06-02 23:01:00)
 
 
 Review request for hive, Paul Yang and Ashutosh Chauhan.
 
 
 Summary
 ---
 
 Created a function multi_get_table that retrieves multiple tables on one 
 trip to the hive metastore, saving round trip time.
 
 
 This addresses bug HIVE-2188.
 https://issues.apache.org/jira/browse/HIVE-2188
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1130342 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1130342 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1130342 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
 1130342 
   trunk/service/src/test/org/apache/hadoop/hive/service/TestHiveServer.java 
 1130342 
 
 Diff: https://reviews.apache.org/r/831/diff
 
 
 Testing
 ---
 
 Added a test case to testMetastore() in TestHiveServer. Also tested for speed 
 improvements in a client session.
 
 
 Thanks,
 
 Sohan
 




Review Request: HIVE-2215

2011-06-10 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/883/
---

Review request for hive and John Sichi.


Summary
---

Follow-up for HIVE-2147.


This addresses bug HIVE-2215.
https://issues.apache.org/jira/browse/HIVE-2215


Diffs
-

  trunk/metastore/if/hive_metastore.thrift 1134443 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1134443 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1134443 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1134443 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
 1134443 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1134443 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1134443 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MarkPartitionEvent.java
 PRE-CREATION 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionEvent.java
 PRE-CREATION 
  trunk/metastore/src/model/package.jdo 1134443 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
1134443 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionSet.java
 PRE-CREATION 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 1134443 

Diff: https://reviews.apache.org/r/883/diff


Testing
---

Added test cases for new api.


Thanks,

Ashutosh



Re: Review Request: HIVE-2215

2011-06-13 Thread Ashutosh Chauhan


 On 2011-06-13 21:47:25, John Sichi wrote:
  trunk/metastore/src/model/package.jdo, line 670
  https://reviews.apache.org/r/883/diff/1/?file=20978#file20978line670
 
  Does indexing actually work on a LONGVARCHAR field across all DB's of 
  interest?

No, it doesn't. So, I reverted it back to VARCHAR.

If the rest of the patch looks alright, I will attach a new patch with this 
change.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/883/#review822
---


On 2011-06-10 21:24:13, Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/883/
 ---
 
 (Updated 2011-06-10 21:24:13)
 
 
 Review request for hive and John Sichi.
 
 
 Summary
 ---
 
 Follow-up for HIVE-2147.
 
 
 This addresses bug HIVE-2215.
 https://issues.apache.org/jira/browse/HIVE-2215
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1134443 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1134443 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
  1134443 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
  1134443 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
  1134443 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1134443 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
 1134443 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/MarkPartitionEvent.java
  PRE-CREATION 
   
 trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionEvent.java
  PRE-CREATION 
   trunk/metastore/src/model/package.jdo 1134443 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
 1134443 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionSet.java
  PRE-CREATION 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
  1134443 
 
 Diff: https://reviews.apache.org/r/883/diff
 
 
 Testing
 ---
 
 Added test cases for new api.
 
 
 Thanks,
 
 Ashutosh
 




Re: Review Request: HIVE-2215

2011-06-14 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/883/
---

(Updated 2011-06-14 20:51:53.179968)


Review request for hive and John Sichi.


Changes
---

Updated patch with Carl's comments.
Carl, can you take a look?


Summary
---

Follow-up for HIVE-2147.


This addresses bug HIVE-2215.
https://issues.apache.org/jira/browse/HIVE-2215


Diffs (updated)
-

  trunk/metastore/if/hive_metastore.thrift 1135779 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1135779 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1135779 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1135779 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
 1135779 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1135779 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1135779 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/LoadPartitionDoneEvent.java
 PRE-CREATION 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionEvent.java
 PRE-CREATION 
  trunk/metastore/src/model/package.jdo 1135779 
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
1135779 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartition.java
 PRE-CREATION 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionRemote.java
 PRE-CREATION 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 1135779 

Diff: https://reviews.apache.org/r/883/diff


Testing
---

Added test cases for new api.


Thanks,

Ashutosh



Re: Review Request: HIVE-2215

2011-06-14 Thread Ashutosh Chauhan
, Carl Steinbach wrote:
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionSet.java,
   line 36
  https://reviews.apache.org/r/883/diff/1/?file=20980#file20980line36
 
  Can you subclass this with a remote and embedded version?

Done.


 On 2011-06-14 01:02:20, Carl Steinbach wrote:
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java,
   line 80
  https://reviews.apache.org/r/883/diff/1/?file=20981#file20981line80
 
  Any reason in particular why you switched to always running this test 
  in local mode? If we can only test one scenario, then I think there's more 
  value in focusing on the standalone client/server setup.

I reverted those changes.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/883/#review824
---


On 2011-06-14 20:51:53, Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/883/
 ---
 
 (Updated 2011-06-14 20:51:53)
 
 
 Review request for hive and John Sichi.
 
 
 Summary
 ---
 
 Follow-up for HIVE-2147.
 
 
 This addresses bug HIVE-2215.
 https://issues.apache.org/jira/browse/HIVE-2215
 
 
 Diffs
 -
 
   trunk/metastore/if/hive_metastore.thrift 1135779 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1135779 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
  1135779 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
  1135779 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java
  1135779 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1135779 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
 1135779 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/LoadPartitionDoneEvent.java
  PRE-CREATION 
   
 trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionEvent.java
  PRE-CREATION 
   trunk/metastore/src/model/package.jdo 1135779 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java 
 1135779 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartition.java
  PRE-CREATION 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionRemote.java
  PRE-CREATION 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
  1135779 
 
 Diff: https://reviews.apache.org/r/883/diff
 
 
 Testing
 ---
 
 Added test cases for new api.
 
 
 Thanks,
 
 Ashutosh
 




Review Request: Review request for HIVE-2225

2011-06-21 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/940/
---

Review request for hive, Carl Steinbach and John Sichi.


Summary
---

This addresses HIVE-2225


This addresses bug HIVE-2225.
https://issues.apache.org/jira/browse/HIVE-2225


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1138099
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1138099
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1138099
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1138099
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/EventCleanerThread.java PRE-CREATION
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartition.java 1138099

Diff: https://reviews.apache.org/r/940/diff


Testing
---

Updated a test case that exercises this code path.


Thanks,

Ashutosh



Re: Review Request: Review request for HIVE-2225

2011-06-22 Thread Ashutosh Chauhan


 On 2011-06-22 23:07:05, John Sichi wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/EventCleanerThread.java,
   line 1
  https://reviews.apache.org/r/940/diff/1/?file=21415#file21415line1
 
  New files need Apache headers

Added.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/940/#review888
---






Re: Review Request: Review request for HIVE-2225

2011-06-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/940/
---

(Updated 2011-06-23 02:55:08.540561)


Review request for hive, Carl Steinbach and John Sichi.


Changes
---

Updated the patch per John's comments.


Summary
---

This addresses HIVE-2225


This addresses bug HIVE-2225.
https://issues.apache.org/jira/browse/HIVE-2225


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1138719
  trunk/conf/hive-default.xml 1138719
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1138719
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1138719
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1138719
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/EventCleanerTask.java PRE-CREATION
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartition.java 1138719

Diff: https://reviews.apache.org/r/940/diff


Testing
---

Updated a test case that exercises this code path.


Thanks,

Ashutosh



Re: Review Request: Review request for HIVE-2225

2011-06-22 Thread Ashutosh Chauhan


 On 2011-06-22 23:07:46, John Sichi wrote:
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 221
  https://reviews.apache.org/r/940/diff/1/?file=21411#file21411line221
 
  If you agree about making this disabled by default, we could use a 
  special value such as 0 for the frequency to indicate disabled.
 

Done. The timer is now created only if this property has a non-zero value.
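
For readers following the thread, the behaviour described above can be sketched
roughly as follows. This is only an illustration and not the code from the
patch: the class name EventCleanerScheduler, the start() helper and its
parameters are invented for the example, and the real property name and its
units are whatever HiveConf defines.

```java
import java.util.Optional;
import java.util.Timer;
import java.util.TimerTask;

final class EventCleanerScheduler {

    /**
     * Hypothetical helper: schedules the cleanup task at a fixed period, or
     * does nothing when the frequency is zero (the "disabled" value
     * suggested in the review).
     */
    static Optional<Timer> start(long frequencyMillis, Runnable cleanup) {
        if (frequencyMillis <= 0) {
            return Optional.empty(); // disabled: no Timer is created at all
        }
        Timer timer = new Timer("event-cleaner", true); // daemon thread
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                cleanup.run();
            }
        }, 0L, frequencyMillis);
        return Optional.of(timer);
    }

    public static void main(String[] args) throws InterruptedException {
        // A frequency of 0 leaves the cleaner switched off.
        System.out.println("timer created for 0: " + start(0L, () -> { }).isPresent());

        // A non-zero frequency creates the timer and runs the task periodically.
        Optional<Timer> timer = start(100L, () -> System.out.println("cleaning old events"));
        Thread.sleep(350L);
        timer.ifPresent(Timer::cancel);
    }
}
```

Returning an Optional keeps the "disabled" case explicit for the caller, who
only has a Timer to cancel when one was actually created.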


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/940/#review889
---






Re: Review Request: Review request for HIVE-2225

2011-06-22 Thread Ashutosh Chauhan


 On 2011-06-22 23:18:43, John Sichi wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
   line 259
  https://reviews.apache.org/r/940/diff/1/?file=21412#file21412line259
 
  Why is this using a Thread instead of a Timer?

Agreed, a Timer is better suited here than a Thread. Changed to a Timer.


 On 2011-06-22 23:18:43, John Sichi wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/events/EventCleanerThread.java,
   line 33
  https://reviews.apache.org/r/940/diff/1/?file=21415#file21415line33
 
  6 hrs is actually configurable, right?

Yup, it is.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/940/#review891
---






Re: Review Request: HIVE-1537 - Allow users to specify LOCATION in CREATE DATABASE statement

2011-06-23 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1938

This may not always succeed. Creating the directories can fail for a number 
of reasons, so this needs to be handled gracefully: in that case the 
transaction needs to roll back and the create database DDL needs to fail. For 
more info, look at Devaraj's first comment and his attached partial patch.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1941

As above, mkdirs() can fail, so handle it the same way as in createDatabase().



trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/949/#comment1942

Please also add a test for the case where create database fails because an FS 
operation fails. In such a case no metadata should get created. One way to 
simulate that is to make the location unwritable and then try to create a 
database at that location.
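
To make the request above concrete, here is a minimal sketch of the control
flow I have in mind. It is not the metastore code: ToyStore is a made-up
in-memory stand-in for the transactional store, java.nio.file is used in place
of the Hadoop FileSystem calls, and the method names are only illustrative. The
failure is simulated by placing the database location underneath a regular
file, which is an assumption chosen because it fails even when the test runs
as a user who can write anywhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

final class CreateDatabaseSketch {

    /** Toy in-memory stand-in for the metastore's transactional store. */
    static final class ToyStore {
        private final Set<String> committed = new HashSet<>();
        private String pending;

        void openTransaction() { pending = null; }
        void addDatabase(String name) { pending = name; }
        void commitTransaction() {
            if (pending != null) {
                committed.add(pending);
            }
            pending = null;
        }
        void rollbackTransaction() { pending = null; }
        boolean hasDatabase(String name) { return committed.contains(name); }
    }

    /** Creates the metadata and the directory together, or neither. */
    static void createDatabase(ToyStore store, String name, Path location) throws IOException {
        boolean success = false;
        store.openTransaction();
        try {
            store.addDatabase(name);
            Files.createDirectories(location); // throws IOException on failure
            store.commitTransaction();
            success = true;
        } finally {
            if (!success) {
                store.rollbackTransaction(); // no metadata survives a failed mkdir
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ToyStore store = new ToyStore();
        Path notADirectory = Files.createTempFile("not-a-dir", ".tmp");
        try {
            createDatabase(store, "testdb", notADirectory.resolve("testdb"));
        } catch (IOException expected) {
            System.out.println("create database failed: " + expected);
        } finally {
            Files.deleteIfExists(notADirectory);
        }
        System.out.println("metadata present: " + store.hasDatabase("testdb"));
    }
}
```

The point of the finally block is that the DDL fails loudly (the exception
propagates to the caller) while the store is left exactly as it was before the
call.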


- Ashutosh


On 2011-06-23 09:55:50, Thiruvel Thirumoolan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/949/
 ---
 
 (Updated 2011-06-23 09:55:50)
 
 
 Review request for hive, Ning Zhang and Amareshwari Sriramadasu.
 
 
 Summary
 ---
 
 Usage:
 
 create database location 'path1';
 alter database location 'path2';
 
 After 'alter', only newly created tables will be located under the new 
 location. Tables created before 'alter' will be under 'path1'.
 
 Notes:
 --
 1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it 
 private. There should be only one API to obtain the location of a database, 
 and it has to accept 'Database' as an argument; hence the new method 
 'getDatabasePath()' in Warehouse and, similarly, 'getTablePath()'. The usages 
 of the older API have also been changed. Hope that is fine.
 2. One could ask why we need getDatabasePath() when the location can be 
 obtained from db.getLocationUri(). I wanted to retain this method to do any 
 additional processing if necessary (getDns or whatever).
 
 
 This addresses bug HIVE-1537.
 https://issues.apache.org/jira/browse/HIVE-1537
 
 
 Diffs
 -
 
   
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1138011
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1138011
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1138011
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1138011
   trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1138011
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1138011
   trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1138011
   trunk/ql/src/test/queries/clientpositive/database_location.q PRE-CREATION
   trunk/ql/src/test/results/clientpositive/database_location.q.out PRE-CREATION
 
 Diff: https://reviews.apache.org/r/949/diff
 
 
 Testing
 ---
 
 1. Updated TestHiveMetaStore.java for testing the functionality - database 
 creation, alteration and table's locations as TestCliDriver outputs ignore 
 locations.
 2. Added database_location.q for testing the grammar primarily.
 
 Thanks,
 Thiruvel
 
 
 




Re: Review Request: HIVE-1537 - Allow users to specify LOCATION in CREATE DATABASE statement

2011-06-24 Thread Ashutosh Chauhan


 On 2011-06-23 16:49:59, Ashutosh Chauhan wrote:
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
   line 591
  https://reviews.apache.org/r/949/diff/1/?file=21560#file21560line591
 
  This may not be always successful. You may fail to create dirs for 
  number of reasons. So, this needs to be handled gracefully. Transaction 
  needs to rollback in such case and create database ddl needs to fail. For 
  more info, look the first comment of Devaraj and also his attached partial 
  patch.
 
 Thiruvel Thirumoolan wrote:
 I requested Devaraj offline to handle it in a separate JIRA. I am not 
 sure about other methods having the same issue. That said, I introduced the 
 same bug with alter_database. Will fix it for create and alter databases.

Actually, the problem exists in create database even now, without your patch, 
so you are not making it any worse. I am fine if you prefer to address it in a 
follow-up jira.

About alter database, I am not sure there is any real use case for it. Having 
a database spread across multiple locations is not regular semantics. The 
first concern is clean rollback semantics. Another is drop database in such 
scenarios: which directories are deleted when you drop a database, the current 
one, all of them, or the one you specify in the drop database DDL? You would 
potentially need to persist all the locations of the database in the object 
store for deletion or other purposes, which means a list of locationUri values 
instead of a single string. Given all this, you might want to defer alter 
database to a new jira. Apart from allowing a better understanding of the use 
cases and semantics for alter database, doing it in two different jiras will 
make this patch smaller and thus easier to get committed.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---






Re: Hive projects for Google Summer of code 2012 ?

2012-02-04 Thread Ashutosh Chauhan
Hey Bharath,

Great to see your enthusiasm for Hive! I would be happy to mentor you for
the project. To start, you can take a look at
https://cwiki.apache.org/confluence/display/Hive/Roadmap for a list of open
projects in Hive. The document is a bit dated, so some of those projects may
no longer be relevant. But it's a good place to start to see if any of
these projects excite you.

Hope it helps,
Ashutosh

On Sat, Feb 4, 2012 at 08:47, bharath vissapragada 
bharathvissapragada1...@gmail.com wrote:

 Hey list, devs,

 Google summer of code, 2012 's notification [1] has been released and
 mentoring organizations can submit their proposals to Google for opensource
 projects.

 Any of the devs interested in mentoring students on Hive projects ( any
 critical jiras etc.) ?  It would be great if any of the devs (dev list
 cc'ed) can do that on behalf of ASF .

 It would be a great opportunity for  many students to contribute patches
 to Hadoop and Hive and make their summer vacation fruitful.

 [1] http://google-melange.appspot.com/gsoc/events/google/gsoc2012

 Thanks and Regards,
 Bharath .V
 w:http://researchweb.iiit.ac.in/~bharath.v



Re: Hive projects for Google Summer of code 2012 ?

2012-02-04 Thread Ashutosh Chauhan
Hi Alexis,

Great to see your interest. Feel free to come up with a concrete proposal and
submit it to GSoC. It's certainly heartening to see folks interested in making
contributions to the Hive project.

Ashutosh
On Sat, Feb 4, 2012 at 10:48, Alexis De La Cruz Toledo
alexis...@gmail.com wrote:

 Hi Ashutosh, I'm interested in Hive.
 I'd like to improve the compilation process.
 I have seen that the query plan tree generated
 by Hive can be optimized, and I'd like
 to participate in Google Summer of Code 2012.
 What do you say?

 Regards.


 



 --
 Ing. Alexis de la Cruz Toledo.
 *Av. Instituto Politécnico Nacional No. 2508 Col. San Pedro Zacatenco.
 México,
 D.F, 07360 *
 *CINVESTAV, DF.*



Re: Hive build is back to green

2012-02-16 Thread Ashutosh Chauhan
Good idea! Created HIVE-2811 for this.

Thanks,
Ashutosh
On Thu, Feb 16, 2012 at 14:41, Carl Steinbach c...@cloudera.com wrote:

 Great news! Thanks for fixing this Ashutosh!


  comparisons. Fix was to do export LANG=en_US.UTF-8 in environment of
  build machine.


 Is this something that we can set in Hive's build.xml file?

 Thanks.

 Carl
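
The quoted message is cut off, so it is not clear from this thread which
comparisons were failing. One common way the LANG setting reaches Java code is
through the platform default charset, which on JVMs of that era was derived
from the locale. The sketch below assumes that mechanism purely for
illustration; CharsetDemo and its encode() helper are made-up names. It shows
how the same string encodes to different bytes under different charsets, and
why naming the charset explicitly makes the result independent of the
environment.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsetDemo {

    // "cafe" with an accented final letter, written with a Unicode escape so
    // the source file itself does not depend on any particular encoding.
    static final String SAMPLE = "caf\u00e9";

    /** Encodes with an explicitly named charset instead of the platform default. */
    static byte[] encode(String s, Charset charset) {
        return s.getBytes(charset);
    }

    public static void main(String[] args) {
        // What the JVM picked up from the environment.
        System.out.println("default charset: " + Charset.defaultCharset());

        // The accented character takes two bytes in UTF-8 ...
        System.out.println(Arrays.toString(encode(SAMPLE, StandardCharsets.UTF_8)));

        // ... and cannot be represented in US-ASCII, where it becomes '?'.
        System.out.println(Arrays.toString(encode(SAMPLE, StandardCharsets.US_ASCII)));
    }
}
```

If the failing tests did depend on the default charset, then setting LANG on
the build machine and passing an explicit charset in the code are two routes
to the same stable output.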



Re: 'arc diff' failing with Invalid or missing field 'Test Plan': You must provide a test plan.

2012-03-01 Thread Ashutosh Chauhan
Hi Carl,

Include the following line in your git commit message:
Test Plan: Include your test plan here.

arc looks for the string "Test Plan" in your commit message and fails if it
can't find one.
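
If you want to catch this before running arc, a check along the following
lines would do. To be clear, this is not how arc or Phabricator parses the
message; CommitMessageCheck and hasTestPlan() are hypothetical names, and the
only assumption is that the field appears as a line starting with
"Test Plan:".

```java
import java.util.regex.Pattern;

final class CommitMessageCheck {

    // Assumption for this sketch: the field is a line beginning with "Test Plan:".
    private static final Pattern TEST_PLAN = Pattern.compile("(?m)^Test Plan:");

    /** Returns true if the commit message contains a "Test Plan:" line. */
    static boolean hasTestPlan(String commitMessage) {
        return TEST_PLAN.matcher(commitMessage).find();
    }

    public static void main(String[] args) {
        String withPlan = "HIVE-2831 Fix something\n\nTest Plan: ran the existing unit tests\n";
        String withoutPlan = "HIVE-2831 Fix something\n";
        System.out.println(hasTestPlan(withPlan));
        System.out.println(hasTestPlan(withoutPlan));
    }
}
```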

Hope it helps,
Ashutosh

On Thu, Mar 1, 2012 at 14:45, Carl Steinbach c...@cloudera.com wrote:

 Hey,

 Today I started getting the following error when I try to create a
 phabricator review request using arc:

 % arc diff --jira HIVE-2831

 Exception:
 Invalid or missing field 'Test Plan': You must provide a test plan.
 (Run with --trace for a full exception trace.)


 Here's the complete trace:

 % arc --trace diff --jira HIVE-2831
 Loading phutil library 'arc_jira_lib' from
 '/Users/carl/Work/repos/hive4/.arc_jira_lib'...
  [0] conduit conduit.connect()
  [0] conduit 318,295 us
  [1] exec $ (cd '/Users/carl/Work/repos/hive4'; git rev-parse
 --show-cdup)
  [1] exec 14,662 us
  [2] exec $ (cd '/Users/carl/Work/repos/hive4/'; git rev-parse
 --verify HEAD^)
  [2] exec 16,343 us
  [3] exec $ (cd '/Users/carl/Work/repos/hive4/'; git log
 --first-parent --format=medium 'HEAD^'..HEAD)
  [3] exec 15,040 us
  [4] conduit differential.parsecommitmessage()
  [4] conduit 547,222 us

 Fatal error: Uncaught exception
 'ArcanistDifferentialCommitMessageParserException' with message 'Invalid or
 missing field 'Test Plan': You must provide a test plan.' in
 /Users/carl/.local/pkg/arcanist/src/differential/commitmessage/ArcanistDifferentialCommitMessage.php:88
 Stack trace:
 #0 /Users/carl/Work/repos/hive4/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php(88):
 ArcanistDifferentialCommitMessage->pullDataFromConduit(Object(ConduitClient))
 #1 /Users/carl/Work/repos/hive4/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php(364):
 ArcJIRAConfiguration->willRunDiffWorkflow()
 #2 /Users/carl/.local/pkg/arcanist/scripts/arcanist.php(264):
 ArcJIRAConfiguration->willRunWorkflow('diff', Object(ArcanistDiffWorkflow))
 #3 {main}
   thrown in
 /Users/carl/.local/pkg/arcanist/src/differential/commitmessage/ArcanistDifferentialCommitMessage.php
 on line 88


 Anyone know what's going on here?

 Thanks.

 Carl



Re: 'arc diff' failing with Invalid or missing field 'Test Plan': You must provide a test plan.

2012-03-01 Thread Ashutosh Chauhan
I don't know if there is a way to disable it. But then, I don't know much
about the Phabricator/arc infrastructure.

On Thu, Mar 1, 2012 at 15:15, Carl Steinbach c...@cloudera.com wrote:

 Thanks for the tip. Is there any way to disable this behavior?




Re: Automatic/parallel patch testing

2012-03-07 Thread Ashutosh Chauhan
I am very much in favor of
https://issues.apache.org/jira/browse/HIVE-1175, since
it publishes back to jira and so everyone is on the same page.
I think the project VP has access to Apache Hudson.

John / Namit,
Can you set this up for Hive?

Ashutosh

On Wed, Mar 7, 2012 at 13:35, Edward Capriolo edlinuxg...@gmail.com wrote:

 I am trying to get myself more involved in the patch review and
 committing process, but running ant tests takes multiple hours.

 Two ideas:

 https://issues.apache.org/jira/browse/HIVE-1175

 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html

 Can we get a farm of test servers to get testing done faster? What are
 other committers currently doing?
 I would not mind committing resources (servers/$) to the cause.

 Thanks,

 Edward



Re: Potential bug around hive merging of small files

2012-03-13 Thread Ashutosh Chauhan
This does look like a bug. Shrijeet, would you mind opening a jira and
attaching your patch there?

Thanks,
Ashutosh
On Mon, Mar 12, 2012 at 16:29, Shrijeet Paliwal shrij...@rocketfuel.com wrote:

 I had a typo in my last email. The settings are as follows:

 hive> set mapred.min.split.size.per.node=10;
 hive> set mapred.min.split.size.per.rack=10;
 hive> set mapred.max.split.size=10;
 hive> set hive.merge.size.per.task=10;
 hive> set hive.merge.smallfiles.avgsize=10;
 hive> set hive.merge.size.smallfiles.avgsize=10;
 hive> set hive.merge.mapfiles=true;
 hive> set hive.merge.mapredfiles=true;
 hive> set hive.mergejob.maponly=false;




 On Mon, Mar 12, 2012 at 4:27 PM, Shrijeet Paliwal
 shrij...@rocketfuel.com wrote:

  Hive Version: Hive 0.8 (last commit SHA
   b581a6192b8d4c544092679d05f45b2e50d42b45 )
 
  Hadoop version: cdh3u0
 
  I am trying to use the Hive merge-small-files feature by setting all the
  necessary params.
  I am disabling use of CombineHiveInputFormat since my input is compressed
  text.
 
  hive> set mapred.min.split.size.per.node=10;
  hive> set mapred.min.split.size.per.rack=10;
  hive> set mapred.max.split.size=10;
  hive> set hive.merge.size.per.task=10;
  hive> set hive.merge.smallfiles.avgsize=10;
  hive> set hive.merge.size.smallfiles.avgsize=10;
  hive> set hive.merge.mapfiles=false;
  hive> set hive.merge.mapredfiles=true;
 
 
  The plan decides to launch two MR jobs, but after the first job succeeds I
  get a runtime error:
 
  java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but
  reduce operator specified
 
  I think the problem can be fixed by using this patch I came up with:
  https://gist.github.com/2025303
 
  Of course my understanding, and hence this patch, can be totally wrong.
  Please provide feedback.
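
For anyone unfamiliar with the merge settings quoted above, here is a rough
sketch of the decision they control, as I understand the documented meaning of
the properties: a follow-up merge job is launched when merging is enabled for
the job type and the average output file size is below
hive.merge.smallfiles.avgsize. It is not Hive's implementation, and
MergeDecisionSketch and shouldMerge() are made-up names.

```java
final class MergeDecisionSketch {

    /**
     * Decides whether a follow-up merge job would be launched.
     *
     * @param fileSizes        sizes in bytes of the files the job wrote
     * @param avgSizeThreshold value of hive.merge.smallfiles.avgsize
     * @param mapOnlyJob       true if the job that wrote the files had no reducers
     * @param mergeMapFiles    value of hive.merge.mapfiles
     * @param mergeMapredFiles value of hive.merge.mapredfiles
     */
    static boolean shouldMerge(long[] fileSizes, long avgSizeThreshold,
                               boolean mapOnlyJob, boolean mergeMapFiles,
                               boolean mergeMapredFiles) {
        boolean enabled = mapOnlyJob ? mergeMapFiles : mergeMapredFiles;
        if (!enabled || fileSizes.length <= 1) {
            return false; // merging switched off, or nothing to merge
        }
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return total / fileSizes.length < avgSizeThreshold;
    }

    public static void main(String[] args) {
        long[] smallOutputs = {4L, 6L, 5L};
        // Map-reduce job with hive.merge.mapredfiles=true and a threshold of 10 bytes.
        System.out.println(shouldMerge(smallOutputs, 10L, false, false, true));
    }
}
```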
 



Hive 0.9 release

2012-04-02 Thread Ashutosh Chauhan
Hi all,

Branch for 0.8-r2 was created on Dec 7, almost four months ago. Between
then and now lots of cool stuff has landed in trunk, waiting to be released
and get to users. I think it's a good time to get the ball rolling for the
0.9 release. If this sounds good, I propose to cut a branch for 0.9
later this week. Then we can focus on stabilizing the branch and the
subsequent release from it. Thoughts?

Thanks,
Ashutosh


Re: Hive 0.9 release

2012-04-02 Thread Ashutosh Chauhan
Here is a list of jiras which I plan to get in 0.9.

HIVE-2084
HIVE-2822
HIVE-2764
HIVE-538

I will work with the authors of these patches to see whether these can get in.
Others, please feel free to add to this list.

Thanks,
Ashutosh

On Mon, Apr 2, 2012 at 18:39, Carl Steinbach c...@cloudera.com wrote:

 I'm +1 on doing an 0.9.0 release, but would also like to suggest that we
 put together a list of 0.9.0 blockers before cutting the release branch. In
 the past we have frequently underestimated the amount of work required to
 get trunk into a releasable state, with the consequence that we end up wasting
 time doing a lot of backports from trunk to the release branch. It would be
 great if we could avoid all of that this time around.

 Thanks.

 Carl




Re: Hive 0.9 release

2012-04-06 Thread Ashutosh Chauhan
Hi All,

Seems like we have an agreement. So, unless someone has other ideas, I will
cut the branch for 0.9 on 4/9. Below is the consolidated list of jiras that
people have requested for 0.9; as and when these get checked in to trunk, we
will merge them back into 0.9.

HIVE-2084
HIVE-2764
HIVE-538
HIVE-2646
HIVE-2777
HIVE-2585
HIVE-2883
HIVE-2926

Thanks,
Ashutosh

On Mon, Apr 2, 2012 at 19:02, Thomas Weise t...@yahoo-inc.com wrote:

 Would be great to get HIVE-2646 included.

 Thanks,
 Thomas


 




Re: Hive 0.9 release

2012-04-09 Thread Ashutosh Chauhan
As per the plan, I am going to create a branch now. Please hold off on any
commits until I send an all-clear email.

Thanks,
Ashutosh

On Fri, Apr 6, 2012 at 15:00, Owen O'Malley omal...@apache.org wrote:

 I think we also need to get the RAT report cleaned up. Apache projects
 aren't supposed to release while they have files without the Apache
 header. I've filed HIVE-2930 to fix all of the issues.

 While working on it, I found that one of the files was added by
 HIVE-2246. HIVE-2246 was contributed by Sohan Jain, who hasn't filed
 an ICLA, and doesn't have the jira box checked for contribution. Does
 someone know him and can ask him to state on the jira that he intended
 to contribute it? Failing that, I believe he was working at Facebook
 at the time, so someone else who is still there can upload the patch
 to the jira?

 All of this brings up a challenge in that Phabricator and the Apache
 review tool upload patches to jira without providing a way to check
 the "contribute to Apache" box. Without the checkbox we should only
 commit patches from people who have filed ICLAs. Is there a way to add
 an option to the arc command that will check the box? Even having it
 *always* check the box is better than having it not check the box.
 (Although it should warn users that it is doing so.)

 Thoughts?

 -- Owen



Re: Hive 0.9 release

2012-04-09 Thread Ashutosh Chauhan
All-clear. Trunk is now open for commits.

Since HIVE-2929 has resulted in intermittent test failures (see
HIVE-2937), I branched right before HIVE-2929. Additionally, I merged
HIVE-2764 into 0.9.

I also added version 0.10 on jira, so any commits on trunk now must have
0.10 as fix version.

Thanks,
Ashutosh

On Mon, Apr 9, 2012 at 17:01, Ashutosh Chauhan hashut...@apache.org wrote:

 As per the plan I am going to create a branch now. Please hold off any
 commits till I send an email for all-clear.

 Thanks,
 Ashutosh


 On Fri, Apr 6, 2012 at 15:00, Owen O'Malley omal...@apache.org wrote:

 I think we also need to get the RAT report cleaned up. Apache projects
 aren't supposed to release while they have files without the Apache
 header. I've filed HIVE-2930 to fix all of the issues.

 While working on it, I found that one of the files was added by
 HIVE-2246. HIVE-2246 was contributed by Sohan Jain, who hasn't filed
 an ICLA, and doesn't have the jira box checked for contribution. Does
 someone know him and can ask him to state on the jira that he intended
 to contribute it? Failing that, I believe he was working at Facebook
 at the time, so someone else who is still there can upload the patch
 to the jira?

 All of this brings up a challenge in that Phabricator and the Apache
 review tool upload patches to jira without providing a way to check
 the contribute to Apache box. Without the checkbox we should only
 commit patches from people who have filed ICLAs. Is there a way to add
 an option to the arc command that will check the box? Even having it
 *always* check the box is better than having it not check the box.
 (Although it should warn users that it is doing so.)

 Thoughts?

 -- Owen





Re: Looking at the columns table

2012-04-11 Thread Ashutosh Chauhan
Hey Ed,

Your thinking is correct and has been implemented in
https://issues.apache.org/jira/browse/HIVE-2246

Time to upgrade to 0.8 :)

Thanks,
Ashutosh

On Wed, Apr 11, 2012 at 07:53, Edward Capriolo edlinuxg...@gmail.com wrote:

 Hey all. Our metastore in mysql is fairly large over 12GB. All the
 storage here is the columns table. It seems that each column is stored
 for each partition/storage descriptor as a one-many relationship.

 In our case all the partitions have the same column definition. My
 thinking. Should the relationship from columns-partition/storage
 descriptor be a many-many? In this way we only store the column once
 and the current column table can reference the primary key of this
 column. This should bring the size of this table down really
 drastically.

 Since every other table in the metastore is so small this huge columns
 table looks like the only scalability choke point we have.

 Edward
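
The idea in the question above (store one column definition once and let every partition point at it) can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database; the table names `column_sets`, `columns` and `storage_descs` and the helper functions are hypothetical and are not the actual metastore schema or the HIVE-2246 implementation.

```python
import sqlite3

# Hypothetical, simplified schema: a storage descriptor references a shared
# column set instead of owning its own copy of every column row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE column_sets (cs_id INTEGER PRIMARY KEY, signature TEXT UNIQUE);
CREATE TABLE columns (cs_id INTEGER, name TEXT, type TEXT, position INTEGER);
CREATE TABLE storage_descs (sd_id INTEGER PRIMARY KEY, partition_name TEXT, cs_id INTEGER);
""")


def column_set_id(conn, cols):
    """Return the id of the column set matching cols, creating it if needed."""
    signature = ",".join(f"{name}:{ctype}" for name, ctype in cols)
    row = conn.execute(
        "SELECT cs_id FROM column_sets WHERE signature = ?", (signature,)
    ).fetchone()
    if row:
        return row[0]
    cs_id = conn.execute(
        "INSERT INTO column_sets (signature) VALUES (?)", (signature,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO columns VALUES (?, ?, ?, ?)",
        [(cs_id, name, ctype, pos) for pos, (name, ctype) in enumerate(cols)],
    )
    return cs_id


def add_partition(conn, partition_name, cols):
    """Register a partition whose storage descriptor points at a shared column set."""
    conn.execute(
        "INSERT INTO storage_descs (partition_name, cs_id) VALUES (?, ?)",
        (partition_name, column_set_id(conn, cols)),
    )


# 1000 partitions that all share the same three-column definition.
cols = [("user_id", "bigint"), ("url", "string"), ("ts", "string")]
for n in range(1000):
    add_partition(conn, f"part={n}", cols)

column_rows = conn.execute("SELECT COUNT(*) FROM columns").fetchone()[0]
partition_rows = conn.execute("SELECT COUNT(*) FROM storage_descs").fetchone()[0]
print(column_rows, partition_rows)
```

With a per-partition copy the columns table would hold one row per column per partition; in this sketch it holds one row per column per distinct definition.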



Re: Problems with Arc/Phabricator

2012-04-11 Thread Ashutosh Chauhan
+1 on moving away from arc/phabricator. It works great when it works, but
most of the time it doesn't work.

Ashutosh

On Wed, Apr 11, 2012 at 11:57, Owen O'Malley omal...@apache.org wrote:

 On Wed, Apr 11, 2012 at 11:48 AM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
  If we are going to switch from Phabricator we just might as well go
  back to not using anything. Review board was really clunky and
  confusing.

 I'm mostly +1 to that. If no one is supporting phabricator, then it
 won't work for long. Personally, I'd love it if we could move Hive to
 git completely. Has anyone used gerrit? The videos of it make it look
 better than sliced bread.

 -- Owen



Re: Problems with Arc/Phabricator

2012-04-11 Thread Ashutosh Chauhan
Is Mac the only supported OS? Arc doesn't work for me on Linux, which is
unfortunate since that's where I do all my testing.

Thanks,
Ashutosh
On Wed, Apr 11, 2012 at 17:37, John Sichi jsi...@gmail.com wrote:

 CC'ing David Recordon, who can probably help with a point of contact
 for coordinating future Phabricator upgrades.

 It looks like the test plan problem mentioned below (which affects
 git, but not svn) was introduced when the reviews.facebook.net
 Phabricator server was upgraded Feb 23.  I've committed a change to
 the arc-jira module which should deal with it:


 https://github.com/facebook/arc-jira/commit/b62b5976ec9a974ed102c2f55b530edde48cfaa5

 So if you run ant arc-setup in your Hive sandbox, you should be good to go.

 JVS

 On Wed, Apr 11, 2012 at 3:37 PM, Carl Steinbach c...@cloudera.com wrote:
  Hi John,
 
 
  Regarding the test plans:  Carl, could you be more specific about what
  is going wrong so I can attempt to reproduce the problem?
 
 
  At some point Arc started requiring that the commit message contain a Test
  Plan string, or maybe this has always been a requirement and it was just
  automatically added before? Anyway, right now you have to manually add this
  or you get the following error:
 
  % git log -1
  commit 2649ca167182bb02823b3fb00bbe7602f591717e
  Author: Carl Steinbach c...@cloudera.com
  Date:   Wed Apr 11 15:12:53 2012 -0700
 
  HIVE-2947. Test Phabricator
 
  % arc diff --trace --jira HIVE-2947
  Loading phutil library 'arc_jira_lib' from
  '/Users/carl/Work/repos/hive-test/.arc_jira_lib'...
  [0] conduit conduit.connect()
   [0] conduit 329,414 us
  [1] exec $ (cd '/Users/carl/Work/repos/hive-test'; git rev-parse
  --show-cdup)
   [1] exec 16,731 us
  [2] exec $ (cd '/Users/carl/Work/repos/hive-test/'; git rev-parse
  --verify HEAD^)
   [2] exec 20,879 us
  [3] exec $ (cd '/Users/carl/Work/repos/hive-test/'; git log
  --first-parent --format=medium 'HEAD^'..HEAD)
   [3] exec 17,852 us
  [4] conduit differential.parsecommitmessage()
   [4] conduit 558,248 us
 
  Fatal error: Uncaught exception
  'ArcanistDifferentialCommitMessageParserException' with message 'Invalid or
  missing field 'Test Plan': You must provide a test plan.' in
  /Users/carl/.local/pkg/arcanist/src/differential/commitmessage/ArcanistDifferentialCommitMessage.php:88
  Stack trace:
  #0 /Users/carl/Work/repos/hive-test/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php(88):
  ArcanistDifferentialCommitMessage->pullDataFromConduit(Object(ConduitClient))
  #1 /Users/carl/Work/repos/hive-test/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php(368):
  ArcJIRAConfiguration->willRunDiffWorkflow()
  #2 /Users/carl/.local/pkg/arcanist/scripts/arcanist.php(264):
  ArcJIRAConfiguration->willRunWorkflow('diff', Object(ArcanistDiffWorkflow))
  #3 {main}
    thrown in
  /Users/carl/.local/pkg/arcanist/src/differential/commitmessage/ArcanistDifferentialCommitMessage.php
  on line 88
 
  Thanks.
 
  Carl
 



[VOTE] Apache Hive 0.9.0 Release Candidate 0

2012-04-13 Thread Ashutosh Chauhan
Hey all,

Apache Hive 0.9.0-rc0 is out and available at
http://people.apache.org/~hashutosh/hive-0.9.0-rc0/

Release notes are available at:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

Please give it a try, let us know.

Hive PMC members: Please test and vote.

Thanks,
Ashutosh


Re: [VOTE] Apache Hive 0.9.0 Release Candidate 0

2012-04-13 Thread Ashutosh Chauhan
Couple more points:

Maven artifacts are available at
https://repository.apache.org/content/repositories/orgapachehive-043/ for
folks to try out.

Vote runs for 3 business days so will expire on Wednesday, 4/18.
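
As a quick check of the date arithmetic, here is a minimal sketch that counts three business days forward from the day the vote opened (Friday, 2012-04-13). The helper `add_business_days` is purely illustrative, and it only skips weekends; holidays are not considered.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Return the date that falls `days` business days (Mon-Fri) after `start`."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


vote_opened = date(2012, 4, 13)
vote_closes = add_business_days(vote_opened, 3)
print(vote_closes.strftime("%A, %m/%d"))
```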

Thanks,
Ashutosh

On Fri, Apr 13, 2012 at 11:50, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey all,

 Apache Hive 0.9.0-rc0 is out and available at
 http://people.apache.org/~hashutosh/hive-0.9.0-rc0/

 Release notes are available at:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

 Please give it a try, let us know.

 Hive PMC members: Please test and vote.

 Thanks,
 Ashutosh



Re: [VOTE] Apache Hive 0.9.0 Release Candidate 0

2012-04-18 Thread Ashutosh Chauhan
Hey Lars,

Thanks for taking a look. HIVE-1634 introduced a new storage type for HBase
tables, namely binary. Since the bug manifests itself only for the binary
storage type, it doesn't count as a regression: the functionality for binary
storage was itself added through HIVE-1634. Since this is not a regression
of existing functionality, it won't count as a blocker for the 0.9 release.

Nonetheless, other folks have found other problems in RC0, so I have to
respin. Thus, I will consider the HIVE-2958 fix for RC1.

Thanks,
Ashutosh

On Tue, Apr 17, 2012 at 23:46, Lars Francke lars.fran...@gmail.com wrote:

 Hey,

 thanks for putting up the RC. We tried it yesterday and we stumbled
 across HIVE-2958 which seems like a bug that should be fixed before
 release because it was introduced with HIVE-1634 which is new to 0.9
 too and breaks GROUP BY queries on HBase which were working before.

 -1 (non-binding)

 Thanks,
 Lars



Re: Hive 0.9 now broken on HBase 0.90 ?

2012-04-18 Thread Ashutosh Chauhan
Hi Tim,

Sorry that it broke your setup. Decision to move to hbase-0.92 was made in
https://issues.apache.org/jira/browse/HIVE-2748

Thanks,
Ashutosh

On Wed, Apr 18, 2012 at 11:42, Tim Robertson timrobertson...@gmail.com wrote:

 Hi all,

 This is my first post to hive-dev so please go easy on me...

 I built Hive from trunk (0.90) a couple of weeks ago and have been using it
 against HBase, and today patched it with the offering of HIVE-2958 and it
 all worked fine.

 I just tried an Oozie workflow, built using Maven and the Apache snapshot
 repository to get the 0.90 snapshot.  It fails with the following:

 java.lang.NoSuchMethodError:

 org.apache.hadoop.hbase.mapred.TableMapReduceUtil.initCredentials(Lorg/apache/hadoop/mapred/JobConf;)V
at
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:419)
at
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)


 I believe the source of the issue could be this commit which happened after
 I built from trunk a couple weeks ago:


 http://mail-archives.apache.org/mod_mbox/hive-commits/201204.mbox/%3c20120409202655.bdb5d2388...@eris.apache.org%3E

 Is there a decision to make hive 0.9  require HBase 0.92.0+ ?  It would be
 awesome if it still worked on 0.90.4 since CDH3 uses that.

 Hope this makes sense,
 Tim
 (suffering classpath hell)



Re: [VOTE] Apache Hive 0.9.0 Release Candidate 0

2012-04-19 Thread Ashutosh Chauhan
This vote stands cancelled because of various problems people have found in
RC0. Thanks to all who tried RC0.  I will respin RC1 shortly.

Thanks,
Ashutosh

On Fri, Apr 13, 2012 at 11:50, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey all,

 Apache Hive 0.9.0-rc0 is out and available at
 http://people.apache.org/~hashutosh/hive-0.9.0-rc0/

 Release notes are available at:
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

 Please give it a try, let us know.

 Hive PMC members: Please test and vote.

 Thanks,
 Ashutosh



hive 0.9.0 RC1

2012-04-19 Thread Ashutosh Chauhan
RC0 failed to pass for a variety of reasons. In the meantime, various
folks have requested inclusion of other fixes in 0.9. The lists follow.

The following list is for committers to review these patches and to get them
committed in the 0.9 branch. These are in Patch Available status.
HIVE-2958
HIVE-2777
HIVE-2646
HIVE-2883
HIVE-2585
HIVE-538
HIVE-2904

Following list is for contributors/committers to contribute patches. These
are in Open status.
HIVE-2961
HIVE-2965
HIVE-2966

Thanks,
Ashutosh


Re: Problems with Arc/Phabricator

2012-04-19 Thread Ashutosh Chauhan
Hit a new problem with arc today:

Fatal error: Uncaught exception 'Exception' with message 'Host returned
HTTP/200, but invalid JSON data in response to a Conduit method call:
<br />
<b>Warning</b>:  Unknown: POST Content-Length of 9079953 bytes exceeds the
limit of 8388608 bytes in <b>Unknown</b> on line <b>0</b><br />
for(;;);{"result":null,"error_code":"ERR-INVALID-SESSION","error_info":"Session
key is not present."}' in
/Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php:48
Stack trace:
#0 /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(62):
ConduitFuture->didReceiveResult(Array)
#1 /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(39):
FutureProxy->getResult()
#2 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitClient.php(52):
FutureProxy->resolve()
#3 /Users/ashutosh/work/hive/arcanist/src/workflow/diff/ArcanistDiffWorkflow.php(341):
ConduitClient->callMethodSynchronous('differential.cr...', Array)
#4 /Users/ashutosh/work/hive/arcanist/scripts/arcanist.php(266):
ArcanistDiffWo in
/Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php on
line 48


Any ideas how to solve this?

Thanks,
Ashutosh

On Wed, Apr 11, 2012 at 18:37, Edward Capriolo edlinuxg...@gmail.com wrote:

 I think the most practical solution is try and use arc/phab and then
 if there is a problem fall back to Jira and do it the old way.

 Edward

 On Wed, Apr 11, 2012 at 7:17 PM, Carl Steinbach c...@cloudera.com wrote:
  +1 to switching over to Git.
 
  As for the rest of the Phabricator/Gerrit/Reviewboard discussion, I think
  we should pick this up again at the contributor meeting on Wednesday.
 
  Thanks.
 
  Carl
 
  On Wed, Apr 11, 2012 at 12:19 PM, Ashutosh Chauhan hashut...@apache.org
 wrote:
 
  +1 on moving away from arc/phabricator. It works great when it works,
 but
  most of the time it doesnt work.
 
  Ashutosh
 
  On Wed, Apr 11, 2012 at 11:57, Owen O'Malley omal...@apache.org
 wrote:
 
   On Wed, Apr 11, 2012 at 11:48 AM, Edward Capriolo 
 edlinuxg...@gmail.com
  
   wrote:
If we are going to switch from fabricator we just might as well go
back to not using anything. Review board was really clunky and
confusing.
  
   I'm mostly +1 to that. If no one is supporting phabricator, then it
   won't work for long. Personally, I'd love it if we could move Hive to
   git completely. Has anyone used gerrit? The videos of it make it look
   better than sliced bread.
  
   -- Owen
  
 



Re: hive 0.9.0 RC1

2012-04-19 Thread Ashutosh Chauhan
Release is not going to be blocked on it. If the patch gets committed on
trunk by the time I roll RC1 I will merge it in. But, first step is to have
it in trunk.

Thanks,
Ashutosh

On Thu, Apr 19, 2012 at 17:32, Edward Capriolo edlinuxg...@gmail.com wrote:

 Am I missing something about?

 https://issues.apache.org/jira/browse/HIVE-2777

 I see no way to access this feature from the hive QL language. Should
 we delay releases on features not usable?

 Edward


 On Thu, Apr 19, 2012 at 1:28 PM, Ashutosh Chauhan hashut...@apache.org
 wrote:
  RC0 failed to pass because of variety of reasons. In the meantime,
 various
  folks have requested for inclusion of other fixes in 0.9. Following is
 the
  list.
 
  Following list is for committers to review these patches and to get them
  committed in 0.9 branch. These are in Patch Available status.
  HIVE-2958
  HIVE-2777
  HIVE-2646
  HIVE-2883
  HIVE-2585
  HIVE-538
  HIVE-2904
 
  Following list is for contributors/committers to contribute patches.
 These
  are in Open status.
  HIVE-2961
  HIVE-2965
  HIVE-2966
 
  Thanks,
  Ashutosh



Re: POST limit

2012-04-20 Thread Ashutosh Chauhan
Hey John,

Yeah, this is an exceptionally large patch.
https://issues.apache.org/jira/browse/HIVE-2965
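
For reference, the two byte counts in the warning quoted below work out as follows. This is just arithmetic on the numbers from the error message; the constant names are made up for the illustration, and nothing here says how the server is actually configured.

```python
LIMIT_BYTES = 8388608    # limit reported in the warning
REQUEST_BYTES = 9079953  # POST Content-Length reported in the warning

MIB = 1024 * 1024

limit_mib = LIMIT_BYTES / MIB            # the limit expressed in MiB
request_mib = REQUEST_BYTES / MIB        # the request size expressed in MiB
overshoot = REQUEST_BYTES - LIMIT_BYTES  # bytes over the limit

print(f"limit   = {limit_mib:.2f} MiB")
print(f"request = {request_mib:.2f} MiB")
print(f"over by = {overshoot} bytes")
```

So the request is only a little over an 8 MiB limit, which is consistent with a single very large patch being the cause.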

Thanks,
Ashutosh

On Thu, Apr 19, 2012 at 23:19, John Sichi jsi...@gmail.com wrote:

 Ashutosh, are you submitting an exceptionally large patch of some kind?


 http://stackoverflow.com/questions/6279897/php-post-content-length-of-11933650-bytes-exceeds-the-limit-of-8388608-bytes

 We could try bumping up that limit on the server side, but first it
 would be good to find out whether that is really the problem (and if
 so what is contributing to such a big size).

 JVS

 On Thu, Apr 19, 2012 at 7:35 PM, Ashutosh Chauhan hashut...@apache.org
 wrote:
  Hit a new problem with arc today:
 
  Fatal error: Uncaught exception 'Exception' with message 'Host returned
  HTTP/200, but invalid JSON data in response to a Conduit method call:
  br /
  bWarning/b:  Unknown: POST Content-Length of 9079953 bytes exceeds
 the
  limit of 8388608 bytes in bUnknown/b on line b0/bbr /
 
 for(;;);{result:null,error_code:ERR-INVALID-SESSION,error_info:Session
  key is not present.}' in
 
 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php:48
  Stack trace:
  #0
  /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(62):
  ConduitFuture-didReceiveResult(Array)
  #1
  /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(39):
  FutureProxy-getResult()
  #2
 
 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitClient.php(52):
  FutureProxy-resolve()
  #3
 
 /Users/ashutosh/work/hive/arcanist/src/workflow/diff/ArcanistDiffWorkflow.php(341):
  ConduitClient-callMethodSynchronous('differential.cr...', Array)
  #4 /Users/ashutosh/work/hive/arcanist/scripts/arcanist.php(266):
  ArcanistDiffWo in
  /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php
 on
  line 48
 
 
  Any ideas how to solve this?
 
  Thanks,
  Ashutosh



[VOTE] Apache Hive 0.9.0 Release Candidate 1

2012-04-23 Thread Ashutosh Chauhan
Hey all,

Apache Hive 0.9.0 Release Candidate 1 is available here:
http://people.apache.org/~hashutosh/hive-0.9.0-rc1/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-084/

Change List is available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

Voting will conclude in 72 hours.
Hive PMC Members: Please test and vote.

Thanks,
Ashutosh


FixVersion in jira

2012-04-23 Thread Ashutosh Chauhan
In case you are wondering what's up with the deluge of emails from jira, here
is what's going on.

Committers,
FixVersion is meant to be set by committers, at the time of commit. In most
cases the commit goes only to trunk, in which case the next release version
number should be picked. If the commit is also made on an already released
branch, then the next release version number of that branch should be added
as well. For example, if you make a commit on trunk only now, fixVersion is
0.10.0. If you make a commit on trunk and the 0.9 branch, then fixVersion is
0.9.1 and 0.10.0. So, the request to committers is to please mark the
fixVersion while committing patches.
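
The rule for committers can be written down as a small lookup. This is only a sketch of the convention described above: the branch labels `trunk` and `branch-0.9` and the `NEXT_RELEASE` table are hypothetical names chosen for the example, and the version numbers are the ones that apply at the time of this email.

```python
# Hypothetical table: the next unreleased version on each line of development.
NEXT_RELEASE = {
    "trunk": "0.10.0",
    "branch-0.9": "0.9.1",
}


def version_key(version):
    """Sort key that orders versions numerically, so 0.9.1 comes before 0.10.0."""
    return tuple(int(part) for part in version.split("."))


def fix_versions(committed_to):
    """Return the fixVersion values for a commit made on the given branches."""
    return sorted((NEXT_RELEASE[branch] for branch in committed_to), key=version_key)


trunk_only = fix_versions(["trunk"])
trunk_and_branch = fix_versions(["trunk", "branch-0.9"])
print(trunk_only)
print(trunk_and_branch)
```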

Contributors,
Please don't set the fixVersion. While submitting a bug report, please use
Affects Version to indicate which version you have tested for your bug.
Leave the fixVersion empty. Setting it creates a lot of confusion while
generating release notes as to what's in the release and what's not.

Thanks,
Ashutosh


Re: [VOTE] Apache Hive 0.9.0 Release Candidate 1

2012-04-24 Thread Ashutosh Chauhan
Unfortunately, there is a small problem in RC1: the version string in
build.properties is 0.9.0-SNAPSHOT, instead of 0.9.0. So, I have to rescind
this vote. I will spin RC2 shortly.
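
A check along these lines could catch the problem before a candidate is cut. It is a minimal sketch and not part of the Hive build: the function name is made up, and it assumes the properties file has a plain `version=...` line.

```python
def release_version_problems(properties_text):
    """Return a list of problems found in the version line of a properties file."""
    problems = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "version" and value.endswith("-SNAPSHOT"):
            problems.append(f"version is still a snapshot: {value}")
    return problems


rc1_problems = release_version_problems("# build settings\nversion=0.9.0-SNAPSHOT\n")
fixed_problems = release_version_problems("# build settings\nversion=0.9.0\n")
print(rc1_problems)
print(fixed_problems)
```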

Thanks,
Ashutosh

On Mon, Apr 23, 2012 at 10:47, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey all,

 Apache Hive 0.9.0 Release Candidate 1 is available here:
 http://people.apache.org/~hashutosh/hive-0.9.0-rc1/

 Maven artifacts are available here:
 https://repository.apache.org/content/repositories/orgapachehive-084/

 Change List is available here:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

 Voting will conclude in 72 hours.
 Hive PMC Members: Please test and vote.

 Thanks,
 Ashutosh



[VOTE] Apache Hive 0.9.0 Release Candidate 2

2012-04-24 Thread Ashutosh Chauhan
Hey all,

Apache Hive 0.9.0 Release Candidate 2 is available here:
http://people.apache.org/~hashutosh/hive-0.9.0-rc2/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-094/

Change List is available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

Voting will conclude in 72 hours.
Hive PMC Members: Please test and vote.

Thanks,
Ashutosh


Re: [VOTE] Apache Hive 0.9.0 Release Candidate 2

2012-04-24 Thread Ashutosh Chauhan
Downloaded the bits. Installed on 5 node cluster.
Did create table. Ran basic queries. Ran unit tests. All looks good.

+1

Thanks,
Ashutosh

On Tue, Apr 24, 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey all,

 Apache Hive 0.9.0 Release Candidate 2 is available here:
 http://people.apache.org/~hashutosh/hive-0.9.0-rc2/

 Maven artifacts are available here:
 https://repository.apache.org/content/repositories/orgapachehive-094/

 Change List is available here:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

 Voting will conclude in 72 hours.
 Hive PMC Members: Please test and vote.

 Thanks,
 Ashutosh




Re: [VOTE] Apache Hive 0.9.0 Release Candidate 2

2012-04-27 Thread Ashutosh Chauhan
The RC2 vote is closed now. I am excited to report that the vote has passed.
I will send out a note about the availability of the 0.9 bits once I finish
publishing them.

Thanks to all those who tested and voted on the release.

Ashutosh

On Tue, Apr 24, 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey all,

 Apache Hive 0.9.0 Release Candidate 2 is available here:
 http://people.apache.org/~hashutosh/hive-0.9.0-rc2/

 Maven artifacts are available here:
 https://repository.apache.org/content/repositories/orgapachehive-094/

 Change List is available here:

 https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

 Voting will conclude in 72 hours.
 Hive PMC Members: Please test and vote.

 Thanks,
 Ashutosh




[ANNOUNCE] Apache Hive 0.9.0 Released

2012-04-30 Thread Ashutosh Chauhan
The Apache Hive team is proud to announce the release of Apache
Hive version 0.9.0.

The Apache Hive (TM) data warehouse software facilitates querying and
managing large datasets residing in distributed storage. Built on top
of Apache Hadoop (TM), it provides:

* Tools to enable easy data extract/transform/load (ETL)

* A mechanism to impose structure on a variety of data formats

* Access to files stored either directly in Apache HDFS (TM) or in other
  data storage systems such as Apache HBase (TM)

* Query execution via MapReduce

For Hive release details and downloads, please visit:
http://hive.apache.org/releases.html

Hive 0.9.0 Release Notes are available here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310843&version=12317742

We would like to thank the many contributors who made this release possible.

Regards,
The Apache Hive Team


Re: Branch 0.9 maven snapshots?

2012-04-30 Thread Ashutosh Chauhan
Hey Carl,

Build is reported to be failing, though all tests are passing. Any ideas?

Thanks,
Ashutosh

On Wed, Apr 25, 2012 at 13:40, Thomas Weise t...@yahoo-inc.com wrote:

 Thanks Carl!


 On 4/25/12 1:12 PM, Carl Steinbach c...@cloudera.com wrote:

  Hi Thomas,
 
  I created the job:
  https://builds.apache.org/job/Hive-0.9.0-SNAPSHOT-h0.21/
 
  Thanks.
 
  Carl
 
  On Tue, Apr 24, 2012 at 6:48 PM, Thomas Weise t...@yahoo-inc.com wrote:
 
  Looks like there is no maven snapshots build setup for the 0.9 branch
 yet.
 
  Can we have this setup similar to what is available for the 0.8 branch?
 Or
  switch that to build 0.9 instead?
 
  Thanks,
  Thomas
 




Re: Problems with Arc/Phabricator

2012-05-08 Thread Ashutosh Chauhan
Made some progress on using arc/phab on ubuntu. epriestley helped a ton
over at #phabricator irc channel. Thanks, Evan!
Now, able to make arc work on ubuntu, but seems like jira integration is
broken. Hit the following problem:

$ arc diff --jira HIVE-3008

PHP Fatal error:  Class 'ArcanistDifferentialRevisionRef' not found in
/home/ashutosh/workspace/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php on
line 201

Fatal error: Class 'ArcanistDifferentialRevisionRef' not found in
/home/ashutosh/workspace/.arc_jira_lib/arcanist/ArcJIRAConfiguration.php on
line 201

Even with this error diff did get generated but it was not posted back on
jira. Evan is working on a patch to fix this.

He is also discussing with Facebook folks on how to tackle these issues in
long term. Discussion is going on at https://secure.phabricator.com/T1206

I will request people who are actively working on Hive to follow the
discussion on this ticket.


Thanks,

Ashutosh



On Thu, Apr 19, 2012 at 5:24 PM, Ashutosh Chauhan hashut...@apache.org wrote:

 Problem while using arc on ubuntu

 $ arc patch D2871
 ARC: Cannot mix P and A
 UNIX: No such file or directory

 Any ideas whats up there.

 Thanks,
 Ashutosh

 On Thu, Apr 19, 2012 at 17:19, Edward Capriolo edlinuxg...@gmail.com wrote:

 Just throwing this out there. The phabricator IRC has more people and
 is usually more active than Hive IRC.

 #JustSaying...

 On Thu, Apr 19, 2012 at 7:35 PM, Ashutosh Chauhan hashut...@apache.org
 wrote:
  Hit a new problem with arc today:
 
  Fatal error: Uncaught exception 'Exception' with message 'Host returned
  HTTP/200, but invalid JSON data in response to a Conduit method call:
  br /
  bWarning/b:  Unknown: POST Content-Length of 9079953 bytes exceeds
 the
  limit of 8388608 bytes in bUnknown/b on line b0/bbr /
 
 for(;;);{result:null,error_code:ERR-INVALID-SESSION,error_info:Session
  key is not present.}' in
 
 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php:48
  Stack trace:
  #0
 
 /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(62):
  ConduitFuture-didReceiveResult(Array)
  #1
 
 /Users/ashutosh/work/hive/libphutil/src/future/proxy/FutureProxy.php(39):
  FutureProxy-getResult()
  #2
 
 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitClient.php(52):
  FutureProxy-resolve()
  #3
 
 /Users/ashutosh/work/hive/arcanist/src/workflow/diff/ArcanistDiffWorkflow.php(341):
  ConduitClient-callMethodSynchronous('differential.cr...', Array)
  #4 /Users/ashutosh/work/hive/arcanist/scripts/arcanist.php(266):
  ArcanistDiffWo in
 
 /Users/ashutosh/work/hive/libphutil/src/conduit/client/ConduitFuture.php on
  line 48
 
 
  Any ideas how to solve this?
 
  Thanks,
  Ashutosh
 
  On Wed, Apr 11, 2012 at 18:37, Edward Capriolo edlinuxg...@gmail.com
 wrote:
 
  I think the most practical solution is try and use arc/phab and then
  if there is a problem fall back to Jira and do it the old way.
 
  Edward
 
  On Wed, Apr 11, 2012 at 7:17 PM, Carl Steinbach c...@cloudera.com
 wrote:
   +1 to switching over to Git.
  
   As for the rest of the Phabricator/Gerrit/Reviewboard discussion, I
 think
   we should pick this up again at the contributor meeting on Wednesday.
  
   Thanks.
  
   Carl
  
   On Wed, Apr 11, 2012 at 12:19 PM, Ashutosh Chauhan 
 hashut...@apache.org
  wrote:
  
   +1 on moving away from arc/phabricator. It works great when it
 works,
  but
   most of the time it doesnt work.
  
   Ashutosh
  
   On Wed, Apr 11, 2012 at 11:57, Owen O'Malley omal...@apache.org
  wrote:
  
On Wed, Apr 11, 2012 at 11:48 AM, Edward Capriolo 
  edlinuxg...@gmail.com
   
wrote:
 If we are going to switch from fabricator we just might as well
 go
 back to not using anything. Review board was really clunky and
 confusing.
   
I'm mostly +1 to that. If no one is supporting phabricator, then
 it
won't work for long. Personally, I'd love it if we could move
 Hive to
git completely. Has anyone used gerrit? The videos of it make it
 look
better than sliced bread.
   
-- Owen
   
  
 





Re: Problems with Arc/Phabricator

2012-05-09 Thread Ashutosh Chauhan
Doesn't work for me either. I see error message.

Ashutosh

On Wed, May 9, 2012 at 12:24 AM, Carl Steinbach c...@cloudera.com wrote:

 Actually, I take that back. After logging in I'm now back to the original
 error message.

 On Wed, May 9, 2012 at 12:22 AM, Carl Steinbach c...@cloudera.com wrote:

  Hi John,
 
  Thanks for checking. I got the page to load again after clearing my
  browser's cache.
 
  Carl
 
 
  On Wed, May 9, 2012 at 12:17 AM, John Sichi jsi...@gmail.com wrote:
 
  Regarding the reviews.facebook.net website, I tried just now and it
  seems to be working for me; here's a screenshot of what I get for
  https://reviews.facebook.net/D3075:
 
  http://i.imgur.com/umHlB.png
 
  JVS
 
 
 



Re: new feature in hive: links

2012-05-22 Thread Ashutosh Chauhan
To kickstart the review, I did a quick pass over the doc. A few questions
came up, which I asked. Sambavi was kind enough to come back with replies
to them. I am continuing to look into it and encourage other folks to look
into it as well.


Thanks,

Ashutosh


Begin Forward Message


Hi Ashutosh


Thanks for looking through the design and providing your feedback!


Responses below:

* What exactly is contained in tracking capacity usage. One is disk space.
That I presume you are going to track via summing size under database
directory. Are you also thinking of tracking resource usage in terms of
CPU/memory/network utilization for different teams? 

Right now the capacity usage in Hive we will track is the disk space
(managed tables that belong to the namespace + imported tables). We will
track the mappers and reducers that the namespace utilizes directly from
Hadoop.


* Each namespace (ns) will have exactly one database. If so, then users are
not allowed to create/use databases in such deployment? Not necessarily a
problem, just trying to understand design.

Yes, you are correct – this is a limitation of the design. Introducing a
new concept seemed heavyweight, so you can instead think of this as
“self-contained” databases. But it means that a given namespace cannot have
sub-databases in it.


* How are you going to keep metadata consistent across two ns? If metadata
gets updated in the remote ns, will it get automatically updated in the user's
local ns? If yes, how will this be implemented? If no, then every time a user
needs to use data from the remote ns, she has to bring the metadata up to date
in her ns. How will she do it?

Metadata will be kept in sync for linked tables. We will make alter table
on the remote table (source of the link) cause an update to the target of
the link. Note that from a Hive perspective, the metadata for the source
and target of a link is in the same metastore.

** **

* Is it even required that the metadata of two linked tables be consistent?
It seems the user has to run alter link add partition herself for each
partition. She can choose to add only a few partitions. In this case, the
tables in the two ns have different numbers of partitions and thus different
data.

What you say above is true for static links. For dynamic links, add and
drop partition on the source of the link will cause the target to get those
partitions as well (we trap alter table add/drop partition to provide this
behavior).
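
The difference between the two kinds of link can be illustrated with a toy in-memory model. This is only a sketch of the behavior described in the answer above, not the proposed implementation: the `Table` and `Link` classes are hypothetical, and the choice that a dynamic link starts out with all of the source's current partitions is an assumption made for the example.

```python
class Table:
    """Toy source table that notifies dynamic links of partition changes."""

    def __init__(self, name):
        self.name = name
        self.partitions = set()
        self.dynamic_links = []

    def add_partition(self, spec):
        self.partitions.add(spec)
        # "Trap" the add so every dynamic link sees the new partition.
        for link in self.dynamic_links:
            link.partitions.add(spec)

    def drop_partition(self, spec):
        self.partitions.discard(spec)
        # "Trap" the drop so every dynamic link loses the partition.
        for link in self.dynamic_links:
            link.partitions.discard(spec)


class Link:
    """Toy link to a source table, either static or dynamic."""

    def __init__(self, source, dynamic):
        self.source = source
        self.dynamic = dynamic
        # Assumption: a dynamic link starts with the source's current partitions.
        self.partitions = set(source.partitions) if dynamic else set()
        if dynamic:
            source.dynamic_links.append(self)

    def add_partition(self, spec):
        """Explicitly add one source partition to the link (the static-link path)."""
        if spec not in self.source.partitions:
            raise KeyError(f"{spec} is not a partition of {self.source.name}")
        self.partitions.add(spec)


source = Table("T")
source.add_partition("ds=2012-05-01")

static_link = Link(source, dynamic=False)
dynamic_link = Link(source, dynamic=True)

source.add_partition("ds=2012-05-02")      # reaches only the dynamic link
static_link.add_partition("ds=2012-05-01")  # the user adds this one by hand

print(sorted(static_link.partitions))
print(sorted(dynamic_link.partitions))
```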


* Who is allowed to create links?

Any user on the database who has create/all privileges on the database. We
could potentially create a new privilege for this, but I think create
privilege should suffice. We can similarly map alter, drop privileges to
the appropriate operations.


* Once user creates a link, who can use it? If everyone is allowed to
access, then I don't see how is it different from the problem that you are
outlining in first alternative design option, wherein user having an access
to two ns via roles has access to data on both ns.

The link creates metadata in the target database. So you can only access
data that has been linked into this database (access is via the T@Y or Y.T
syntax depending on the chosen design option). Note that this is different
than having a role that a user maps to since in that case, there is no
local metadata in the target database specifying if the imported data is
accessible from this database.


* If links are first class concepts, then authorization model also needs to
understand them? I don't see any mention of that.

Yes, you are correct. We need to account for the authorization model.

* I see there is an hdfs jira for implementing hard links of files in the hdfs
layer, so that takes care of linking physical data on hdfs. What about
tables whose data is stored in external systems, for example hbase? Does
hbase also need to implement hard-linking of its tables for hive
to make use of this feature? What about other storage handlers like
cassandra, mongodb, etc.?

The link does not create a link on HDFS. It just points to the source
table/partitions. You can think of it as a Hive-level link so there is no
need for any changes/features from the other storage handlers.

* Migration will involve a two-step process of distcp'ing data from one
cluster to another and then replicating one mysql instance to another. Are
there any other steps? Do you plan to (later) build tools to automate this
process of migration?

Yes, we will be building tools to enable migration of a namespace.
Migration will involve replicating the metadata and the data as you mention
above.

* When migrating ns from one datacenter to another, will links be dropped
or they are also preserved? 

We will preserve them – by copying the data for the links to the other
datacenter.

Hope that helps. Please ask any more questions that come up as 

Re: non-string partition columns

2012-05-25 Thread Ashutosh Chauhan
Some discussion of this has happened on
https://issues.apache.org/jira/browse/HIVE-2702. Is the underlying problem
the same as the one I described on that JIRA?

Thanks,
Ashutosh

On Thu, May 24, 2012 at 10:59 PM, Namit Jain nj...@fb.com wrote:

 Should we disallow non-string partition columns completely ?
 Does anyone depend on that ?


 On 5/24/12 6:49 PM, Namit Jain nj...@fb.com wrote:

 
 http://svn.apache.org/viewvc?view=revision&revision=1308427
 
 The patch above broke drop partitions if the partition happens to be
 non-string.
 This is due to a JDO issue with non-string columns.
 
 Is anyone using non-string partition columns ?
 Should we force the partition columns to be only of type string ?
 The documentation probably does not specify anything clearly.
 
 If someone is dependent on non-string partition column, we need to revert
 this patch, or make a
 special case for string partition columns.
 
 Thanks,
 -namit
 
 




Re: non-string partition columns

2012-05-29 Thread Ashutosh Chauhan
FWIW, HCatalog only allows partition columns of type string, precisely
because type information is not recorded in the backend datastore. In my
opinion, partition columns should be restricted to type string until we fix
this problem; otherwise it gives unexpected behavior to end users and/or
generates bug reports. One possibility is to introduce a config variable
hive.partition.column.type and have its value set to "string" by default.
This ensures that new users get the expected behavior of string-only partition
columns. Users who already use other types can reset this config value to
"all" in their deployment when they upgrade to a newer version of Hive
(assuming the new version comes out without a proper fix). This extra step of
resetting the default config will help them understand the risk they are
taking by changing the default value.
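
To make the proposal concrete, here is a minimal sketch of the check such a setting could drive. Everything in it is an assumption for illustration: the class and method names are hypothetical, the config is modeled as a plain Map rather than HiveConf, and only the key name and the two values ("string" and "all") come from the suggestion above.

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionTypePolicy {
    // Proposed (not existing) config key from the discussion above.
    static final String KEY = "hive.partition.column.type";

    // Hypothetical check: with the default policy "string" only string
    // partition columns pass; with "all" every column type passes.
    static boolean isAllowed(Map<String, String> conf, String columnType) {
        String policy = conf.containsKey(KEY) ? conf.get(KEY) : "string";
        return policy.equalsIgnoreCase("all") || columnType.equalsIgnoreCase("string");
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<String, String>();
        Map<String, String> optedIn = new HashMap<String, String>();
        optedIn.put(KEY, "all");

        System.out.println(isAllowed(defaults, "string")); // default policy, string column
        System.out.println(isAllowed(defaults, "int"));    // default policy, int column
        System.out.println(isAllowed(optedIn, "int"));     // user explicitly opted in
    }
}
```

The point of the default is that an existing deployment with int partition columns has to set the value explicitly before it keeps working.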

Thanks,
Ashutosh

On Tue, May 29, 2012 at 10:02 AM, Namit Jain nj...@fb.com wrote:

 OK, I will keep the support.
 Add special casing for string columns in DDLTask

 On 5/29/12 9:27 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 We use them too; we store our dates as integers like 20120130. This
 allows us to do partition pruning with ranges.
 
 On Tue, May 29, 2012 at 4:10 AM, Aniket Mokashi aniket...@gmail.com
 wrote:
  We are using non-string partition columns in production as well.
 
  Thanks,
  Aniket
 
  On Sat, May 26, 2012 at 1:20 AM, Philip Tromans
  philip.j.trom...@gmail.comwrote:
 
  We're using non-string partition columns in production. I think non
 string
  partition columns are a good thing to have - it allows you to do all
 sorts
  of date range calculations etc. AFAIK, MySQL's partition columns can
 be of
  any type.
 
  Phil.
  On May 26, 2012 7:55 AM, Namit Jain nj...@fb.com wrote:
 
   Should I go ahead and file a jira to disallow non-string partition
  columns
   ?
   Or, someone depends on that functionality.
  
  
   On 5/25/12 10:01 AM, Namit Jain nj...@fb.com wrote:
  
   Yes, but the meta-question is:
   
   Is anyone dependent on non-string partition columns ? Should we
 drop the
   support for non-string
   partition columns ?
   
   
   Thanks,
   -namit
   
   On 5/24/12 11:21 PM, Ashutosh Chauhan hashut...@apache.org
 wrote:
   
   Some discussion for this has happened on
   https://issues.apache.org/jira/browse/HIVE-2702 Is the underlying
   problem
   same as the one which I described on that jira ?
   
   Thanks,
   Ashutosh
   
   On Thu, May 24, 2012 at 10:59 PM, Namit Jain nj...@fb.com wrote:
   
Should we disallow non-string partition columns completely ?
Does anyone depend on that ?
   
   
On 5/24/12 6:49 PM, Namit Jain nj...@fb.com wrote:
   

http://svn.apache.org/viewvc?view=revision&revision=1308427

The patch above broke drop partitions if the partition happens
 to be
non-string.
This is due to a JDO issue with non-string columns.

Is anyone using non-string partition columns ?
Should we force the partition columns to be only of type string
 ?
The documentation probably does not specify anything clearly.

If someone is dependent on non-string partition column, we need
 to
   revert
this patch, or make a
special case for string partition columns.

Thanks,
-namit


   
   
   
  
  
 
 
 
 
  --
  ...:::Aniket:::... Quetzalco@tl




Re: Behavior of Hive 2837: insert into external tables should not be allowed

2012-06-01 Thread Ashutosh Chauhan
Hi Mark,

I understand your concern w.r.t. backward compatibility. But as Ed pointed
out, there is a config variable and by default the semantics are unchanged, so you
can continue to insert into your external table.
I have a question though. Why are you creating all your tables as
external tables? Why not regular tables?

Thanks,
Ashutosh

On Thu, May 31, 2012 at 9:35 PM, Mark Grover grover.markgro...@gmail.com wrote:

 Hi folks,
 I have a question regarding HIVE 2837(
 https://issues.apache.org/jira/browse/HIVE-2837) that deals with
 disallowing external table from using insert into queries.

 From looking at the JIRA, it seems like it applies to external tables on
 HDFS as well. Technically, insert into should be ok for external tables on
 HDFS (and S3 as well). Seems like a storage file system level thing to
 specify whether insert into is applied and implement it.

 Historically, there hasn't been any real difference between creating an
 external table on HDFS vs creating a managed one. However, if we disallow
 insert into on external tables, that would mean that folks with external
 tables on HDFS wouldn't be able to make use of insert into functionality
 even though they should be able to. Do we want to allow insert into on HDFS
 tables regardless of whether they are external or not?

 Mark



Re: test errors

2012-06-11 Thread Ashutosh Chauhan
Works for me.

$ svn up
svn At revision 1348932.
$ svn st
$ ant clean package test -Dtestcase=TestZooKeeperTokenStore

 [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.783 sec

Any logs to look at?

Ashutosh
On Mon, Jun 11, 2012 at 5:33 AM, Namit Jain nj...@fb.com wrote:

 I am seeing the following errors on a fresh hive trunk ?

[junit] Running org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
 FAILED (crashed)


 Is anyone else getting the same error ?

 Thanks,
 -namit




Re: Review Request: Allow to download resources from any external File Systems to local machine.

2012-07-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5687/#review8776
---



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
https://reviews.apache.org/r/5687/#comment18552

Instead of regex, it might be better to use URI to parse the string.
String scheme = new Path(value).toUri().getScheme();
return (scheme != null) && !scheme.equalsIgnoreCase("file");
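
A self-contained sketch of the same idea, for trying it outside Hive. It assumes java.net.URI in place of Hadoop's Path so that it compiles without Hadoop on the classpath; the class and method names are hypothetical. Note that java.net.URI does not special-case Windows drive letters, so this is an illustration of the scheme test only.

```java
import java.net.URI;

public class SchemeCheck {
    // Hypothetical helper: true when the resource string names a file system
    // other than the local one (for example hdfs:// or s3n://).
    static boolean isRemoteResource(String value) {
        String scheme = URI.create(value).getScheme();
        return scheme != null && !scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        System.out.println(isRemoteResource("hdfs://namenode:8020/tmp/udf.jar")); // true
        System.out.println(isRemoteResource("file:///tmp/udf.jar"));              // false
        System.out.println(isRemoteResource("/tmp/udf.jar"));                     // false
    }
}
```

Unlike a regex listing s3, s3n and hdfs, this accepts any scheme without further changes.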


- Ashutosh Chauhan


On June 30, 2012, 6:15 p.m., Kanna Karanam wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/5687/
 ---
 
 (Updated June 30, 2012, 6:15 p.m.)
 
 
 Review request for hive, Carl Steinbach, Edward  Capriolo, and Ashutosh 
 Chauhan.
 
 
 Description
 ---
 
 Instead of restricting resources download to s3, s3n, hdfs make it open 
 for any external file systems.
 
 
 This addresses bug HIVE-3146.
 https://issues.apache.org/jira/browse/HIVE-3146
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  1355510 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
  1355510 
 
 Diff: https://reviews.apache.org/r/5687/diff/
 
 
 Testing
 ---
 
 Yes. All unit tests passed.
 
 
 Thanks,
 
 Kanna Karanam
 




Re: Review Request: Resource Leak: Fix the File handle leak in EximUtil.java

2012-07-10 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5777/#review9021
---

Ship it!


Ship It!


http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java
https://reviews.apache.org/r/5777/#comment19184

The path here is used to construct a URI object later. The URI constructor javadoc 
says that the path should either begin with '/' or be empty, irrespective of 
OS. So it looks like a check for path.startsWith("/") makes the most sense here. As 
Kanna noted, his changes turn the path D:/hive/etc into /D:/hive/etc, thus making 
the URI constructor happy.
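
The constructor behavior can be checked in isolation with the sketch below. It assumes the five-argument java.net.URI constructor with a "file" scheme; the helper names are hypothetical and this is not the code in EximUtil.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriPathDemo {
    // Hypothetical helper: prefix the path with '/' when it does not already
    // start with one, so a Windows path like D:/hive/etc becomes /D:/hive/etc.
    static String normalize(String path) {
        return path.startsWith("/") ? path : "/" + path;
    }

    // True when the URI constructor accepts the path together with a scheme.
    static boolean accepted(String path) {
        try {
            new URI("file", null, path, null, null);
            return true;
        } catch (URISyntaxException e) {
            return false; // "Relative path in absolute URI"
        }
    }

    public static void main(String[] args) {
        System.out.println(accepted("D:/hive/etc"));            // false
        System.out.println(accepted(normalize("D:/hive/etc"))); // true
    }
}
```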


- Ashutosh Chauhan


On July 5, 2012, 7:50 p.m., Kanna Karanam wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/5777/
 ---
 
 (Updated July 5, 2012, 7:50 p.m.)
 
 
 Review request for hive, Carl Steinbach, Edward  Capriolo, and Ashutosh 
 Chauhan.
 
 
 Description
 ---
 
 1) Not closing the file handle EximUtil after reading the metadata from the 
 file.
 2) Nit: Get the path from URI to handle the Windows paths.
 
 
 This addresses bug HIVE-3232.
 https://issues.apache.org/jira/browse/HIVE-3232
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java
  1357818 
 
 Diff: https://reviews.apache.org/r/5777/diff/
 
 
 Testing
 ---
 
 Yes
 
 
 Thanks,
 
 Kanna Karanam
 




Re: Review Request: Remove the Unix specific absolute path of “Cat” utility in several .q files to make them run on Windows with CygWin in path.

2012-08-05 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6313/#review9874
---

Ship it!


Ship It!

- Ashutosh Chauhan


On Aug. 2, 2012, 4:51 a.m., Kanna Karanam wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/6313/
 ---
 
 (Updated Aug. 2, 2012, 4:51 a.m.)
 
 
 Review request for hive, Carl Steinbach, Edward  Capriolo, and Ashutosh 
 Chauhan.
 
 
 Description
 ---
 
 Several .q files have Unix absolute paths for Cat utility so all of them are 
 failing on Windows even with CygWin support. 
 
 
 This addresses bug HIVE-3327.
 https://issues.apache.org/jira/browse/HIVE-3327
 
 
 Diffs
 -
 
   trunk/contrib/src/test/queries/clientpositive/serde_typedbytes.q 1368192 
   trunk/contrib/src/test/queries/clientpositive/serde_typedbytes2.q 1368192 
   trunk/contrib/src/test/queries/clientpositive/serde_typedbytes3.q 1368192 
   trunk/contrib/src/test/queries/clientpositive/serde_typedbytes4.q 1368192 
   trunk/contrib/src/test/results/clientpositive/serde_typedbytes.q.out 
 1368192 
   trunk/contrib/src/test/results/clientpositive/serde_typedbytes2.q.out 
 1368192 
   trunk/contrib/src/test/results/clientpositive/serde_typedbytes3.q.out 
 1368192 
   trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 
 1368192 
   trunk/ql/src/test/queries/clientnegative/clusterbydistributeby.q 1368192 
   trunk/ql/src/test/queries/clientnegative/clusterbyorderby.q 1368192 
   trunk/ql/src/test/queries/clientnegative/clusterbysortby.q 1368192 
   trunk/ql/src/test/queries/clientnegative/orderbysortby.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input14.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input14_limit.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input17.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input18.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input34.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input35.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input36.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input38.q 1368192 
   trunk/ql/src/test/queries/clientpositive/input5.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce1.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce2.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce3.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce4.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce7.q 1368192 
   trunk/ql/src/test/queries/clientpositive/mapreduce8.q 1368192 
   trunk/ql/src/test/queries/clientpositive/newline.q 1368192 
   trunk/ql/src/test/queries/clientpositive/nullscript.q 1368192 
   trunk/ql/src/test/queries/clientpositive/partcols1.q 1368192 
   trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1368192 
   trunk/ql/src/test/queries/clientpositive/query_with_semi.q 1368192 
   trunk/ql/src/test/queries/clientpositive/regexp_extract.q 1368192 
   trunk/ql/src/test/queries/clientpositive/select_transform_hint.q 1368192 
   trunk/ql/src/test/queries/clientpositive/transform_ppr1.q 1368192 
   trunk/ql/src/test/queries/clientpositive/transform_ppr2.q 1368192 
   trunk/ql/src/test/results/clientpositive/input14.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input14_limit.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input17.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input18.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input34.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input35.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input36.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input38.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/input5.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce1.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce2.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce3.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce4.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce7.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/mapreduce8.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/newline.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/nullscript.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/partcols1.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/ppd_transform.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/query_with_semi.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/regexp_extract.q.out 1368192 
   trunk/ql/src/test/results/clientpositive/select_transform_hint.q.out 
 1368192 
   trunk

Re: Review Request: This function overloads the current DateDiff(expr1, expr2) by adding another parameter to specify the units.

2012-08-06 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6027/#review9924
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21120

Instead of these enums, can we use the ints listed at 
http://docs.oracle.com/javase/6/docs/api/constant-values.html#java.text.DateFormat ?

Also, I don't think microseconds make sense; we don't have that precision 
in any case.



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21121

Let's get rid of the formatter variable, add the default format (yyyy-MM-dd) as the 
first format in dateFormats, and use formatLong() for all formats.



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21126

Instead of doing instanceOf later on, use toDate() / toTimeStamp() 
depending on unit here itself. Then, have  evalutateObj(Date, Date).



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21122

Avoid unnecessary object creation. Do, Date date1 = resolveDate(dateObj1, 
unit) which is more appropriate. Similarly for date2.



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21118

Looks like this function is not used anywhere. Please remove it.



trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java
https://reviews.apache.org/r/6027/#comment21119

Looks like this function is not used anywhere. Please, remove it.


- Ashutosh Chauhan


On July 18, 2012, 12:56 a.m., Shefali Vohra wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/6027/
 ---
 
 (Updated July 18, 2012, 12:56 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 Parameters
  This function overloads the current DateDiff(expr1, expr2) by adding another 
 parameter to specify the units. It takes 3 parameters. The first two are 
 timestamps, and the formats accepted are:
  yyyy-MM-dd
  yyyy-MM-dd HH:mm:ss
  yyyy-MM-dd HH:mm:ss.milli
 
 These are the formats accepted by the current DateDiff(expr1, expr2) function 
 and allow for that consistency. The accepted data types for the timestamp 
 will be Text, TimestampWritable, Date, and String, just as with the already 
 existing function.
 
 The third parameter is the units the user wants the response to be in. 
 Acceptable units are:
  Microsecond
  Millisecond
  Second
  Minute
  Hour
  Day
  Week
  Month
  Quarter
  Year
 
 When calculating the difference, the full timestamp is used when the 
 specified unit is hour or smaller (microsecond, millisecond, second, minute, 
 hour), and only the date part is used if the unit is day or larger (day, 
 week, month, quarter, year). If for the smaller units the time is not 
 specified and the format yyyy-MM-dd is used, the time 00:00:00.0 is used. 
 Leap years are accounted for by the Calendar class in Java, which inherently 
 addresses the issue.
 
 The assumption is made that all these time parameters are in the same time 
 zone.
 
 Return Value
  The function returns expr1 - expr2 expressed as an int in the units 
 specified.
 
 Hive vs. SQL
  SQL also has a DateDiff() function with some more acceptable units. The 
 order of parameters is different between SQL and Hive. The reason for this is 
 that Hive already has a DateDiff() function with the same first two 
 parameters, and having this order here allows for that consistency within 
 Hive.
 
 Example Query
  hive> DATEDIFF(DATE_FIELD, '2012-06-01', 'day'); 
 
 Diagnostic Error Messages
  Invalid table alias or column name reference
  Table not found
 
 
 This addresses bug HIVE-3216.
 https://issues.apache.org/jira/browse/HIVE-3216
 
 
 Diffs
 -
 
   trunk/data/files/datetable.txt PRE-CREATION 
   trunk/data/files/timestamptable.txt PRE-CREATION 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDateDiff.java 1362724 
   trunk/ql/src/test/queries/clientnegative/udf_datediff.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/udf_datediff.q 1362724 
   trunk/ql/src/test/results/clientnegative/udf_datediff.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/udf_datediff.q.out 1362724 
 
 Diff: https://reviews.apache.org/r/6027/diff/
 
 
 Testing
 ---
 
 positive and negative test cases included
 
 
 Thanks,
 
 Shefali Vohra
 




Re: [ANNOUNCE] New Hive Committer - Navis Ryu

2012-08-10 Thread Ashutosh Chauhan
Congrats, Navis! Well deserved. Welcome aboard!

Ashutosh
On Fri, Aug 10, 2012 at 11:10 AM, Carl Steinbach c...@cloudera.com wrote:

 Congratulations Navis! This is very well deserved. Looking forward to many
 more patches from you.

 On Fri, Aug 10, 2012 at 8:10 AM, Bejoy KS bejoy...@yahoo.com wrote:

  Congrats Navis.. :)
 
  Regards
  Bejoy KS
 
  Sent from handheld, please excuse typos.
 
  -Original Message-
  From: alo alt wget.n...@gmail.com
  Date: Fri, 10 Aug 2012 17:08:07
  To: u...@hive.apache.org
  Reply-To: u...@hive.apache.org
  Cc: dev@hive.apache.org; navis@nexr.com
  Subject: Re: [ANNOUNCE] New Hive Committer - Navis Ryu
 
  Congratulations! Well done :)
 
  cheers,
   ALex
 
  On Aug 10, 2012, at 11:58 AM, John Sichi jsi...@gmail.com wrote:
 
   The Apache Hive PMC has passed a vote to make Navis Ryu a new
   committer on the project.
  
   JIRA is currently down, so I can't send out a link with his
   contribution list at the moment, but if you have an account at
   reviews.facebook.net, you can see his activity here:
  
   https://reviews.facebook.net/p/navis/
  
   Navis, please submit your CLA to the Apache Software Foundation as
   described here:
  
   http://www.apache.org/licenses/#clas
  
   Congratulations!
   JVS
 
 
  --
  Alexander Alten-Lorenz
  http://mapredit.blogspot.com
  German Hadoop LinkedIn Group: http://goo.gl/N8pCF
 
 



Re: Review Request: HIVE-3409. Increase test.timeout value

2012-08-27 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6780/#review10775
---

Ship it!


Ship It!

- Ashutosh Chauhan


On Aug. 27, 2012, 10:09 a.m., Carl Steinbach wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/6780/
 ---
 
 (Updated Aug. 27, 2012, 10:09 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 commit bfb69f3cb607a2daadcff07dd87b8924ae19ae2b
 Author: Carl Steinbach c...@cloudera.com
 Date:   Mon Aug 27 03:05:19 2012 -0700
 
 Add test.junit.timeout to build.properties, and double value
 
  build-common.xml | 3 +--
  build.properties | 6 ++
  common/build.xml | 2 +-
  3 files changed, 8 insertions(+), 3 deletions(-)
 
 
 This addresses bug HIVE-3409.
 https://issues.apache.org/jira/browse/HIVE-3409
 
 
 Diffs
 -
 
   build-common.xml f2697e1 
   build.properties ff9eba9 
   common/build.xml 2712c03 
 
 Diff: https://reviews.apache.org/r/6780/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Carl Steinbach
 




Re: Review Request: HIVE-3409. Increase test.timeout value

2012-08-27 Thread Ashutosh Chauhan


 On Aug. 27, 2012, 3:07 p.m., Ashutosh Chauhan wrote:
  Ship It!

+1 Please commit.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6780/#review10775
---


On Aug. 27, 2012, 10:09 a.m., Carl Steinbach wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/6780/
 ---
 
 (Updated Aug. 27, 2012, 10:09 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 commit bfb69f3cb607a2daadcff07dd87b8924ae19ae2b
 Author: Carl Steinbach c...@cloudera.com
 Date:   Mon Aug 27 03:05:19 2012 -0700
 
 Add test.junit.timeout to build.properties, and double value
 
  build-common.xml | 3 +--
  build.properties | 6 ++
  common/build.xml | 2 +-
  3 files changed, 8 insertions(+), 3 deletions(-)
 
 
 This addresses bug HIVE-3409.
 https://issues.apache.org/jira/browse/HIVE-3409
 
 
 Diffs
 -
 
   build-common.xml f2697e1 
   build.properties ff9eba9 
   common/build.xml 2712c03 
 
 Diff: https://reviews.apache.org/r/6780/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Carl Steinbach
 




Re: [VOTE] Apache Hive 0.11.0 Release Candidate 2

2013-05-15 Thread Ashutosh Chauhan
+1

Built from sources, ran a few unit tests and some simple queries against
a 1-node cluster.

Thanks,
Ashutosh


On Wed, May 15, 2013 at 5:30 PM, Navis류승우 navis@nexr.com wrote:

 +1

 - built from source, passed all tests (without assertion fail or
 conversion to backup task)
 - working good with queries from running-site
 - some complaints on missing HIVE-4172 (void type for JDBC2)

 Thanks

 2013/5/12 Owen O'Malley omal...@apache.org:
  Based on feedback from everyone, I have respun release candidate, RC2.
  Please take a look. We've fixed 7 problems with the previous RC:
  * Release notes were incorrect
   * HIVE-4018 - MapJoin failing with Distributed Cache error
   * HIVE-4421 - Improve memory usage by ORC dictionaries
   * HIVE-4500 - Ensure that HiveServer 2 closes log files.
   * HIVE-4494 - ORC map columns get class cast exception in some contexts
   * HIVE-4498 - Fix TestBeeLineWithArgs failure
   * HIVE-4505 - Hive can't load transforms with remote scripts
   * HIVE-4527 - Fix the eclipse template
 
  Source tag for RC2 is at:
 
  https://svn.apache.org/repos/asf/hive/tags/release-0.11.0rc2
 
 
  Source tar ball and convenience binary artifacts can be found
  at: http://people.apache.org/~omalley/hive-0.11.0rc2/
 
  This release has many goodies including HiveServer2, integrated
  hcatalog, windowing and analytical functions, decimal data type,
  better query planning, performance enhancements and various bug fixes.
  In total, we resolved more than 350 issues. Full list of fixed issues
  can be found at:  http://s.apache.org/8Fr
 
 
  Voting will conclude in 72 hours.
 
  Hive PMC Members: Please test and vote.
 
  Thanks,
 
  Owen



Re: Review Request: HIVE-4489: beeline always return the same error message twice

2013-06-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10917/#review21291
---

Ship it!


+1

- Ashutosh Chauhan


On May 3, 2013, 11:01 p.m., Chaoyu Tang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/10917/
 ---
 
 (Updated May 3, 2013, 11:01 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 Beeline always returns the same error message twice -- because the error is 
 logged out both in an exception catch block and its outer re-catch block.
 
 
 This addresses bug HIVE-4489.
 https://issues.apache.org/jira/browse/HIVE-4489
 
 
 Diffs
 -
 
   beeline/src/java/org/apache/hive/beeline/Commands.java 8e2a52f 
 
 Diff: https://reviews.apache.org/r/10917/diff/
 
 
 Testing
 ---
 
 Have done the tests.
 
 
 Thanks,
 
 Chaoyu Tang
 




Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/#review21293
---


Request for comments.


http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
https://reviews.apache.org/r/11172/#comment44172

Can you add a comment about why we need to set it to the MAX value instead of 0, 
since it's not apparent?



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
https://reviews.apache.org/r/11172/#comment44173

Similarly, can you add a comment on why we need to set max to negative infinity and 
not 0, since it's counterintuitive?
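
For readers following along, the reason is easy to show with a small sketch. This is not the code in GenericUDAFComputeStats; the class and method names are hypothetical, and the choice of Double.MAX_VALUE and Double.NEGATIVE_INFINITY as starting points is an assumption based on the wording of the two comments above.

```java
public class MinMaxInit {
    // Correct initialisation: the first observed value always replaces both
    // starting points. Returns {min, max}.
    static double[] range(double[] values) {
        double min = Double.MAX_VALUE;         // larger than any finite input
        double max = Double.NEGATIVE_INFINITY; // smaller than any input
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    // Buggy initialisation: starting both at 0 pins min to 0 when all values
    // are positive, and max to 0 when all values are negative.
    static double[] rangeZeroInit(double[] values) {
        double min = 0.0;
        double max = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    public static void main(String[] args) {
        double[] positives = {1.5, 2.5, 4.0};
        System.out.println(range(positives)[0]);         // 1.5
        System.out.println(rangeZeroInit(positives)[0]); // 0.0
    }
}
```

Note that Double.MIN_VALUE would be the wrong starting point for max, since it is the smallest positive double rather than the most negative one.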


- Ashutosh Chauhan


On May 15, 2013, 7:11 a.m., Zhuoluo Yang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11172/
 ---
 
 (Updated May 15, 2013, 7:11 a.m.)
 
 
 Review request for hive, Carl Steinbach, Carl Steinbach, and fangkun cao.
 
 
 Description
 ---
 
 An initialization error.
 Make double and long initialize correctly.
 Would you review that and assign the issue to me?
 
 
 This addresses bug HIVE-4561.
 https://issues.apache.org/jira/browse/HIVE-4561
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
  1482697 
 
 Diff: https://reviews.apache.org/r/11172/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Zhuoluo Yang
 




Re: Review Request: HIVE-4513 - disable hivehistory logs by default

2013-06-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review21352
---



data/conf/hive-site.xml
https://reviews.apache.org/r/11029/#comment44263

Is there a reason for this to be set to true for tests? Unless there is, we 
should set config in tests to the default values, since we should test default 
configs.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44264

doesn't read right. I guess you wanted 
... statistics into a file.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44266

This is an existing comment which doesn't read right. But since we are doing 
major surgery on HiveHistory, it would be good to update it to make it more 
sensible.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44268

I think the word "job" is not required in this comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44269

I think "query" is a better word than "job" here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44270

Better worded as
Called at the end of query.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44271

Again, use of the word "job" is confusing; we should use "query" here as well.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44272

Incorrect comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44274

The function name is IdtoTable, but the comment says table to id. One of these needs 
to be corrected.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44275

Similar comment as in HiveHistory.java



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44277

Should this be hive.ql.exec.HiveHistoryImpl to avoid confusion?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44278

"and" instead of "an"?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44280

In case of incorrect config, should this throw an exception instead of a 
silent return? Otherwise there will be errors later when something tries to 
write to the history file.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44281

Same comment as above.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44283

This should be a static class variable, otherwise nextInt() will return the same 
value for each invocation.
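
A minimal sketch of the suggestion, with hypothetical names. The repeat-value problem is shown with an explicit seed, since that makes it deterministic; whether the patch under review seeded its generator this way is an assumption, not something visible in this thread.

```java
import java.util.Random;

public class SharedRandom {
    // One generator for the whole class: successive calls advance the same
    // sequence instead of restarting it.
    private static final Random RANDOM = new Random();

    // Hypothetical helper producing a non-negative suffix for a file name.
    static int nextSuffix() {
        return RANDOM.nextInt(Integer.MAX_VALUE);
    }

    // The failure mode: a generator re-created with the same seed always
    // yields the same first value.
    static int firstValueWithSeed(long seed) {
        return new Random(seed).nextInt();
    }

    public static void main(String[] args) {
        System.out.println(firstValueWithSeed(42L) == firstValueWithSeed(42L)); // true
        System.out.println(nextSuffix() >= 0);                                  // true
    }
}
```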



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44284

Instead of "/" we should use File.separator.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44287

Consider using File.createNewFile here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44288

Use System.getProperty("line.separator") instead of \n
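
A short sketch of writing records with the platform line separator. LineSeparatorSketch and joinLines are hypothetical names used only for this example.

```java
class LineSeparatorSketch {
    /** Terminates every line with the platform line separator instead of a literal \n. */
    static String joinLines(String... lines) {
        String sep = System.getProperty("line.separator");
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(sep);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines("QueryStart", "QueryEnd"));
    }
}
```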



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44289

start of query ?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java
https://reviews.apache.org/r/11029/#comment44291

Missing apache header



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java
https://reviews.apache.org/r/11029/#comment44292

HiveHistoryViewer.class


- Ashutosh Chauhan


On May 13, 2013, 10:12 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11029/
 ---
 
 (Updated May 13, 2013, 10:12 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 hive query such as query string, plan , counters and MR job progress 
 information.
 
 There is no mechanism to delete these files and as a result they get 
 accumulated over time, using up lot of disk space. 
 I don't think this is used by most people, so I think it would better to turn 
 this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as history logs

Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/#review21366
---


Patch looks good, apart from one comment.


ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
https://reviews.apache.org/r/11335/#comment44301

Apart from this change, all other changes are contained within the
if(isLocal) block. Because of this, it seems possible that it might be
triggered for non-local paths as well. Can you test it for an hdfs:// path
which has spaces? If it's easy, it would be good to add it to the test;
otherwise a manual test is fine as well.


- Ashutosh Chauhan


On June 3, 2013, 10:18 p.m., Xuefu Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11335/
 ---
 
 (Updated June 3, 2013, 10:18 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 Patch includes fix and new test case.
 
 
 This addresses bug HIVE-4554.
 https://issues.apache.org/jira/browse/HIVE-4554
 
 
 Diffs
 -
 
   data/files/person PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
 bd8d252 
   ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/11335/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Xuefu Zhang
 




Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-04 Thread Ashutosh Chauhan


 On June 3, 2013, 11:15 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java, line 
  273
  https://reviews.apache.org/r/11335/diff/3/?file=300140#file300140line273
 
  Apart from this change, all other changes are contained within 
  if(isLocal) block. Because of this it seems its possible it might be 
  triggered for non-local paths as well. Can you test it for hdfs:// path 
  which has spaces. If its easy, it will be good to add it in test, else 
  manual test is fine as well.
 
 Xuefu Zhang wrote:
 I tried to add a testcase loading file at HDFS into a table without a 
 success. Doing this requires an HDFS accessible from the test machine. Please 
 let me know if you think there is mechanism. However, I did manually test the 
 case, and it works fine for me. (It fails w/o the patch.)

Glad that it's working. You can add this test case for MinimrCliDriver. Just
write a regular .q test file and then include it in the minimr.query.files
parameter in build-common.xml. Those test cases run against a minicluster, so
you can access hdfs:// there.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/#review21366
---






Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-05 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/#review21480
---

Ship it!


+1

- Ashutosh Chauhan


On June 5, 2013, 2:06 p.m., Zhuoluo Yang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11172/
 ---
 
 (Updated June 5, 2013, 2:06 p.m.)
 
 
 Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, 
 Shreepadma Venugopalan, and fangkun cao.
 
 
 Description
 ---
 
 An initialization error.
 Make double and long initialize correctly.
 Would you review that and assign the issue to me?
 
 
 This addresses bug HIVE-4561.
 https://issues.apache.org/jira/browse/HIVE-4561
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
  1489292 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_empty_table.q.out
  1489292 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out
  1489292 
 
 Diff: https://reviews.apache.org/r/11172/diff/
 
 
 Testing
 ---
 
 ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q
 ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q
 
 done.
 
 
 Thanks,
 
 Zhuoluo Yang
 




Re: Review Request: HIVE-4712: Fix TestCliDriver.truncate_* on 0.23

2013-06-11 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11801/#review21708
---

Ship it!


+1

- Ashutosh Chauhan


On June 11, 2013, 7:18 a.m., Brock Noland wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11801/
 ---
 
 (Updated June 11, 2013, 7:18 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 Queries just needed an order by to be deterministic.
 
 
 This addresses bug HIVE-4712.
 https://issues.apache.org/jira/browse/HIVE-4712
 
 
 Diffs
 -
 
   ql/src/test/queries/clientpositive/truncate_column.q f172bae 
   ql/src/test/queries/clientpositive/truncate_column_merge.q 20ef643 
   ql/src/test/results/clientpositive/truncate_column.q.out f8af6d0 
   ql/src/test/results/clientpositive/truncate_column_merge.q.out 64a917b 
 
 Diff: https://reviews.apache.org/r/11801/diff/
 
 
 Testing
 ---
 
 Passes with both 0.20S and 0.23.
 
 
 Thanks,
 
 Brock Noland
 




Re: Branch for HIVE-4660

2013-06-13 Thread Ashutosh Chauhan
Makes sense. I will create the branch soon.

Thanks,
Ashutosh


On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner 
ghagleit...@hortonworks.com wrote:

 Hi,

 I am starting to work on integrating Tez into Hive (see HIVE-4660, design
 doc has already been uploaded - any feedback will be much appreciated).
 This will be a fair amount of work that will take time to stabilize/test.
 I'd like to propose creating a branch in order to be able to do this
 incrementally and collaboratively. In order to progress rapidly with this,
 I would also like to go commit-then-review.

 Thanks,
 Gunther.



Re: Review Request: Initialize object inspectors with union of table properties and partition properties

2013-06-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11632/#review22081
---

Ship it!


+1

- Ashutosh Chauhan


On June 4, 2013, 6:01 p.m., Mark Wagner wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11632/
 ---
 
 (Updated June 4, 2013, 6:01 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 Change the initialization of object inspectors and deserializers to use the 
 union of partition properties and table properties for partitioned tables. 
 There is no change for unpartitioned tables.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 9422bf7 
   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java f0b16e4 
   ql/src/test/queries/clientpositive/avro_partitioned.q PRE-CREATION 
   ql/src/test/results/clientpositive/avro_partitioned.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/11632/diff/
 
 
 Testing
 ---
 
 I've done manual end-to-end testing with various queries/tables and have 
 created a .q test for reading partitioned Avro tables.
 
 
 Thanks,
 
 Mark Wagner
 




Re: Branch for HIVE-4660

2013-06-19 Thread Ashutosh Chauhan
Created the branch from tip of trunk. Check it out
https://svn.apache.org/repos/asf/hive/branches/tez/

Thanks,
Ashutosh


On Thu, Jun 13, 2013 at 5:43 PM, Ashutosh Chauhan hashut...@apache.org wrote:

 Makes sense. I will create the branch soon.

 Thanks,
 Ashutosh


 On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner 
 ghagleit...@hortonworks.com wrote:

 Hi,

 I am starting to work on integrating Tez into Hive (see HIVE-4660, design
 doc has already been uploaded - any feedback will be much appreciated).
 This will be a fair amount of work that will take time to stabilize/test.
 I'd like to propose creating a branch in order to be able to do this
 incrementally and collaboratively. In order to progress rapidly with this,
 I would also like to go commit-then-review.

 Thanks,
 Gunther.





Re: Review Request 11925: Hive-3159 Update AvroSerde to determine schema of new tables

2013-06-29 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11925/#review22571
---


Can you also run all new tests with ant test -Dhadoop.mr.rev=23 to make sure
we are getting the right results? Otherwise, you might need to add more
columns to the order-by clause.


serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/11925/#comment46291

I think determining the schema from the table definition should be the
default. There are multiple ways of determining the schema. I think the order
should be:
a) Try table definition.
b) Try schema literal in properties.
c) Try from hdfs.
d) Try from url. 
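
The ordering above amounts to a fallback chain: try each source in turn and stop at the first one that yields a schema. A minimal sketch of that idea follows. It does not use the AvroSerdeUtils API; the class SchemaResolutionSketch, the method resolve, and the placeholder schema strings are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

class SchemaResolutionSketch {
    /** Tries each source in order and returns the first schema found, if any. */
    static Optional<String> resolve(List<Supplier<Optional<String>>> sources) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> schema = source.get();
            if (schema.isPresent()) {
                return schema;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder sources standing in for a) to d); the values are made up.
        Supplier<Optional<String>> tableDefinition = () -> Optional.<String>empty();
        Supplier<Optional<String>> schemaLiteral = () -> Optional.of("schema-from-literal");
        Supplier<Optional<String>> hdfsFile = () -> Optional.of("schema-from-hdfs");
        Supplier<Optional<String>> url = () -> Optional.of("schema-from-url");

        Optional<String> schema =
                resolve(Arrays.asList(tableDefinition, schemaLiteral, hdfsFile, url));
        System.out.println(schema.orElse("no schema found"));
    }
}
```

Later sources are only consulted when earlier ones return nothing, so a remote lookup (hdfs or url) is skipped whenever the table definition or literal is enough.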



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/11925/#comment46292

Any particular reason you made this synchronized ?



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/11925/#comment46293

Have you tested this for both default db as well as non-default db?



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/11925/#comment46294

Instead of \n, can you use File.separator?



serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/11925/#comment46296

Is this meant to be Array[tinyint] = bytes?



serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/11925/#comment46295

Let's take care of this TODO. Should be straightforward.


- Ashutosh Chauhan


On June 18, 2013, 3:26 a.m., Mohammad Islam wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11925/
 ---
 
 (Updated June 18, 2013, 3:26 a.m.)
 
 
 Review request for hive, Ashutosh Chauhan and Jakob Homan.
 
 
 Bugs: HIVE-3159
 https://issues.apache.org/jira/browse/HIVE-3159
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Problem:
 Hive doesn't support to create a Avro-based table using HQL create table 
 command. It currently requires to specify Avro schema literal or schema file 
 name.
 For multiple cases, it is very inconvenient for user.
 Some of the un-supported use cases:
 1. Create table ... Avro-SERDE etc. as SELECT ... from NON-AVRO FILE
 2. Create table ... Avro-SERDE etc. as SELECT from AVRO TABLE
 3. Create  table  without specifying Avro schema.
 
 
 Diffs
 -
 
   ql/src/test/queries/clientpositive/avro_create_as_select.q PRE-CREATION 
   ql/src/test/queries/clientpositive/avro_create_as_select2.q PRE-CREATION 
   ql/src/test/queries/clientpositive/avro_no_schema_test.q PRE-CREATION 
   ql/src/test/queries/clientpositive/avro_without_schema.q PRE-CREATION 
   ql/src/test/results/clientpositive/avro_create_as_select.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/avro_create_as_select2.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/avro_no_schema_test.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/avro_without_schema.q.out PRE-CREATION 
   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
 13848b6 
   serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java 
 PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
 010f614 
   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/11925/diff/
 
 
 Testing
 ---
 
 Wrote a new java Test class for a new Java class. Added a new test case into 
 existing java test class. In addition, there are 4 .q file for testing 
 multiple use-cases.
 
 
 Thanks,
 
 Mohammad Islam
 




Re: Review Request 12050: HIVE-3756 (LOAD DATA does not honor permission inheritance)

2013-07-02 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12050/#review22698
---



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
https://reviews.apache.org/r/12050/#comment46400

I had quite a discussion on this with Rohini on HIVE-2936 on how to do 
these fs ops in a performant and robust way. Feel free to follow that.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
https://reviews.apache.org/r/12050/#comment46401

Why not use the API instead of FsShell? Doesn't FileSystem offer an API for
recursively doing chmod and chgrp?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
https://reviews.apache.org/r/12050/#comment46398

I think the following way of writing this is more robust:
if (inheritPerms) {
  fs.mkdirs(destfp, fs.getFileStatus(destfp.getParent()).getPermission());
} else {
  fs.mkdirs(destfp);
}



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
https://reviews.apache.org/r/12050/#comment46399

same as previous comment.


Couple of comments on api usage.

- Ashutosh Chauhan


On July 2, 2013, 4:39 p.m., Chaoyu Tang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12050/
 ---
 
 (Updated July 2, 2013, 4:39 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE- and HIVE-3756
 https://issues.apache.org/jira/browse/HIVE-
 https://issues.apache.org/jira/browse/HIVE-3756
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Problems:
 1. When doing load data or insert overwrite to a table, the data files under 
 database/table directory could not inherit their parent's permissions (i.e. 
 group) as described in HIVE-3756.
 2. Beside the group issue, the read/write permission mode is also not 
 inherited
 3. Same problem affects the partition files (see HIVE-3094)
 
 Cause:
 The task results (from load data or insert overwrite) are initially stored in 
 scratchdir and then loaded under warehouse table directory. FileSystem.rename 
 is used in this step (e.g. LoadTable/LoadPartition) to move the dirs/files 
 but it preserves their permissions (including group and mode) which are 
 determined by scratchdir permission or umask. If the scratchdir has different 
 permissions from those of warehouse table directories, the problem occurs.
 
 Solution:
 After the FileSystem.rename is called, changing all renamed (moved) 
 files/dirs to their destination parents' permissions if needed (say if 
 hive.warehouse.subdir.inherit.perms is true). Here I introduced a new method 
 renameFile doing both rename and permission. It replaces the 
 FileSystem.rename used in LoadTable/LoadPartition. I do not replace rename 
 used to move files/dirs under same scratchdir in the middle of task 
 processing. It looks to me not necessary since they are temp files and also 
 probably access protected by top scratchdir mode 700 (HIVE-4487).
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 17daaa1 
 
 Diff: https://reviews.apache.org/r/12050/diff/
 
 
 Testing
 ---
 
 The following cases tested that all created subdirs/files inherit their 
 parents' permission mode and group in : 1). create database; 2). create 
 table; 3). load data; 4) insert overwrite; 5) partitions.
 {code}
 hive dfs -ls -d /user/tester1/hive;  
  
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:20 
 /user/tester1/hive
 
 hive create database tester1 COMMENT 'Database for user tester1' LOCATION 
 '/user/tester1/hive/tester1.db';
 hive dfs -ls -R /user/tester1/hive;  
   
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:21 
 /user/tester1/hive/tester1.db
 
 hive use tester1;
 hive  create table tester1.tst1(col1 int, col2 string) ROW FORMAT DELIMITED 
 FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
 hive dfs -ls -R /user/tester1/hive;  
   
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:22 
 /user/tester1/hive/tester1.db
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:22 
 /user/tester1/hive/tester1.db/tst1
 
 hive  load data local inpath '/home/tester1/tst1.input' into table tst1; 
   
 hive dfs -ls -R /user/tester1/hive; 
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:22 
 /user/tester1/hive/tester1.db
 drwxrwx---   - tester1 testgroup123  0 2013-06-22 13:23 
 /user/tester1/hive/tester1.db/tst1
 -rw-rw   3 tester1 testgroup123168 2013-06-22 13:23 
 /user/tester1/hive/tester1.db/tst1/tst1

Re: Hive Jenkins Builds

2013-07-03 Thread Ashutosh Chauhan
I think we should disable the 0.9 and 0.10 builds. But we should create a 0.11
build, since that is our latest release.

Ashutosh


On Wed, Jul 3, 2013 at 11:39 AM, Brock Noland br...@cloudera.com wrote:

 Hive has four builds that currently run:

 Hive-trunk-h0.21  (trunk on Hadoop 1.X)
 Hive-trunk-hadoop2
 Hive-0.9.1-SNAPSHOT-h0.21
 Hive-0.10.0-SNAPSHOT-h0.20.1

 See https://builds.apache.org/user/brock/my-views/view/hive/

 AFAIK there isn't active work on the 0.9.X and 0.10.X branches. Does anyone
 have an issue with disabling:

 Hive-0.9.1-SNAPSHOT-h0.21
 Hive-0.10.0-SNAPSHOT-h0.20.1

 Brock



Re: Tez branch and tez based patches

2013-07-17 Thread Ashutosh Chauhan
On Wed, Jul 17, 2013 at 1:41 PM, Edward Capriolo edlinuxg...@gmail.com wrote:


 In my opinion we should limit the amount of tez related optimizations to
 and trunk Refactoring that cleans up code is good, but as you have pointed
 out there wont be a tez release until sometime this fall, and this branch
 will be open for an extended period of time. Thus code cleanups and other
 tez related refactoring does not need to be disruptive to trunk.


I agree that Tez-specific changes need not go into trunk. But general
refactoring and code cleanup need to happen on trunk as and when someone
is willing to work on them. We have to continually improve our code
quality. Code maintainability and readability are a priority. Without them,
code quality suffers and new contributors are discouraged from contributing
because the code is unnecessarily complicated. SemanticAnalyzer is an 11K-line
class. We need to simplify it. A patch like HIVE-4811, which tackles this, is
a welcome change. The exec package is convoluted: it mixes up runtime
operators and drivers for the runtime. That's a welcome patch because it makes
that piece of code much easier to read and reason about. HIVE-4825 is another
example which improves the modularity of the code. For contributors who are
exposed to Hive for the first time, it will be easier to follow the code.

Rather than being disruptive to trunk, they are constructive for trunk, and I
am glad people are choosing to work on them. Tez or no Tez, Hive is better off
with these patches.

Thanks,
Ashutosh



  On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates ga...@hortonworks.com
 wrote:

  Answers to some of your questions inlined.
 
  Alan.
 
  On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote:
 
   There are some points I want to bring up. First, I am on the PMC. Here
 is
   something I find relevant:
  
   http://www.apache.org/foundation/how-it-works.html
  
   --
  
   The role of the PMC from a Foundation perspective is oversight. The
 main
   role of the PMC is not code and not coding - but to ensure that all
 legal
   issues are addressed, that procedure is followed, and that each and
 every
   release is the product of the community as a whole. That is key to our
   litigation protection mechanisms.
  
   Secondly the role of the PMC is to further the long term development
 and
   health of the community as a whole, and to ensure that balanced and
 wide
   scale peer review and collaboration does happen. Within the ASF we
 worry
   about any community which centers around a few individuals who are
  working
   virtually uncontested. We believe that this is detrimental to quality,
   stability, and robustness of both code and long term social structures.
  
   
  
  
 
 https://blogs.apache.org/comdev/entry/what_makes_apache_projects_different
  
   -
  
   All other decisions happen on the dev list, discussions on the private
  list
   are kept to a minimum.
  
   If it didn't happen on the dev list, it didn't happen - which leads
 to:
  
   a) Elections of committers and PMC members are published on the dev
 list
   once finalized.
  
   b) Out-of-band discussions (IRC etc.) are summarized on the dev list as
   soon as they have impact on the project, code or community.
   -
  
   https://issues.apache.org/jira/browse/HIVE-4660 ironically titled Let
   their be Tez has not be +1 ed by any committer. It was never discussed
  on
   the dev or the user list (as far as I can tell).
 
  As all JIRA creations and updates are sent to dev@hive, creating a JIRA
  is de facto posting to the list.
 
  
   As a PMC member I feel we need more discussion on Tez on the dev list
  along
   with a wiki-fied design document. Topics of discussion should include:
 
  I talked with Gunther and he's working on posting a design doc on the
  wiki.  He has a PDF on the JIRA but he doesn't have write permissions yet
  on the wiki.
 
  
   1) What is tez?
  In Hadoop 2.0, YARN opens up the ability to have multiple execution
  frameworks in Hadoop.  Hadoop apps are no longer tied to MapReduce as the
  only execution option.  Tez is an effort to build an execution engine
 that
  is optimized for relational data processing, such as Hive and Pig.
 
  The biggest change here is to move away from only Map and Reduce as
  processing options and to allow alternate combinations of processing,
 such
  as map - reduce - reduce or tasks that take multiple inputs or shuffles
  that avoid sorting when it isn't needed.
 
  For a good intro to Tez, see Arun's presentation on it at the recent
  Hadoop summit (video http://www.youtube.com/watch?v=9ZLLzlsz7h8 slides
  http://www.slideshare.net/Hadoop_Summit/murhty-saha-june26255pmroom212)
  
   2) How is tez different from oozie, http://code.google.com/p/hop/,
   http://cs.brown.edu/~backman/cmr.html , and other DAG and or streaming
  map
   reduce tools/frameworks? Why should we use this and 

Re: Review Request 12767: [HIVE-4877] In ExecReducer, remove tag from the row which will be passed to the first Operator at the Reduce-side

2013-07-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12767/#review23531
---



ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java
https://reviews.apache.org/r/12767/#comment47421

Should we also add following in comment?
.. and directly call process on children in process() method.


One more : )

- Ashutosh Chauhan


On July 19, 2013, 5 p.m., Yin Huai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12767/
 ---
 
 (Updated July 19, 2013, 5 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-4877
 https://issues.apache.org/jira/browse/HIVE-4877
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-4877
 
 
 Diffs
 -
 
   data/files/kv1kv2.cogroup.txt 6d36e22 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java 9898495 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java d4be3d9 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java ee76917 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java cbda70b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java d12a53c 
   ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 6a74ae4 
 
 Diff: https://reviews.apache.org/r/12767/diff/
 
 
 Testing
 ---
 
 Tests  Failures  Errors  Success rate  Time
 2688   2         0       99.93%        43249.945
 
 Two failures are hbase_stats_empty_partition.q and ppd_key_ranges.q in 
 TestHBaseCliDriver.
 
 I manually tested these two in my mac and tests passed. 
 
 
 Thanks,
 
 Yin Huai
 




Re: Review Request 12767: [HIVE-4877] In ExecReducer, remove tag from the row which will be passed to the first Operator at the Reduce-side

2013-07-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12767/#review23541
---

Ship it!


Ship It!

- Ashutosh Chauhan


On July 19, 2013, 7:04 p.m., Yin Huai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12767/
 ---
 
 (Updated July 19, 2013, 7:04 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-4877
 https://issues.apache.org/jira/browse/HIVE-4877
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-4877
 
 
 Diffs
 -
 
   data/files/kv1kv2.cogroup.txt 6d36e22 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java 9898495 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java d4be3d9 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java ee76917 
   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java cbda70b 
   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java d12a53c 
   ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 6a74ae4 
 
 Diff: https://reviews.apache.org/r/12767/diff/
 
 
 Testing
 ---
 
 Tests  Failures  Errors  Success rate  Time
 2688   2         0       99.93%        43249.945
 
 Two failures are hbase_stats_empty_partition.q and ppd_key_ranges.q in 
 TestHBaseCliDriver.
 
 I manually tested these two in my mac and tests passed. 
 
 
 Thanks,
 
 Yin Huai
 




Re: Review Request 12767: [HIVE-4877] In ExecReducer, remove tag from the row which will be passed to the first Operator at the Reduce-side

2013-07-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12767/#review23527
---


Good work, Yin! some minor comments.


ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java
https://reviews.apache.org/r/12767/#comment47408

I didn't get why there is an if check here. Can you add a comment
explaining in which case we need not update this childOIs map?



ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java
https://reviews.apache.org/r/12767/#comment47409

I think we should remove this if branch, since it's in the inner loop of
processing. We should put this check in the initialization of the Demux
operator. Even if we cannot put it there, this will result in a runtime
exception, which I think is fine.
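
A minimal sketch of moving a validity check out of the per-row path and into initialization. What the check in the patch actually tests is not shown in this thread, so the index-range check below is an assumption, and InitTimeCheckSketch and childFor are hypothetical names, not DemuxOperator code.

```java
class InitTimeCheckSketch {
    private final int[] tagToChildIndex;

    /** Validates the whole mapping once, at construction (initialization) time. */
    InitTimeCheckSketch(int[] tagToChildIndex, int numChildren) {
        for (int childIndex : tagToChildIndex) {
            if (childIndex < 0 || childIndex >= numChildren) {
                throw new IllegalArgumentException("Invalid child index: " + childIndex);
            }
        }
        this.tagToChildIndex = tagToChildIndex.clone();
    }

    /** Per-row path: no validation branch, only the lookup. */
    int childFor(int tag) {
        return tagToChildIndex[tag];
    }

    public static void main(String[] args) {
        InitTimeCheckSketch demux = new InitTimeCheckSketch(new int[] {1, 0}, 2);
        System.out.println(demux.childFor(0));
    }
}
```

A tag outside the array still fails at run time with ArrayIndexOutOfBoundsException, which matches the point that a runtime exception is acceptable there.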



ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java
https://reviews.apache.org/r/12767/#comment47410

Is this better wording for this comment:
// Demux operator forwards a row to exactly one child in its children list,
// based on the tag and newTagToChildIndex in the process() method, so we need
// not do anything here.



ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java
https://reviews.apache.org/r/12767/#comment47414

Can you also add a line to the comment saying that this key-val-tag structure
is used by JoinOperator and Groupby operators to function correctly?




ql/src/java/org/apache/hadoop/hive/ql/exec/MuxOperator.java
https://reviews.apache.org/r/12767/#comment47411

Same comment as in Demux operator.



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java
https://reviews.apache.org/r/12767/#comment47417

I think this should read as:
// remove the tag from key coming out of reducer and store it in separate 
variable.


- Ashutosh Chauhan






Re: Review Request 12690: HIVE-4870: Explain Extended to show partition info for Fetch Task

2013-07-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12690/#review23654
---

Ship it!


Ship It!

- Ashutosh Chauhan


On July 17, 2013, 5:14 p.m., John Pullokkaran wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12690/
 ---
 
 (Updated July 17, 2013, 5:14 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Explain extended does not include partition information for Fetch Task 
 (FetchWork). Map Reduce Task (MapredWork)already does this.
 
 Patch adds Partition Description info to Fetch Task.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java 65c39d6 
   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 0e8f96b 
   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 42e25fa 
   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 47a8635 
   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c39d057 
   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out bd7381f 
   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 6121722 
   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e0cd848 
   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 924fbad 
   ql/src/test/results/clientpositive/bucketcontext_1.q.out 62910fb 
   ql/src/test/results/clientpositive/bucketcontext_2.q.out 0857c9d 
   ql/src/test/results/clientpositive/bucketcontext_3.q.out 69dc2b2 
   ql/src/test/results/clientpositive/bucketcontext_4.q.out 0d79901 
   ql/src/test/results/clientpositive/bucketcontext_7.q.out 19ea4fa 
   ql/src/test/results/clientpositive/bucketcontext_8.q.out 9a7aaa0 
   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 9f8552a 
   ql/src/test/results/clientpositive/bucketmapjoin10.q.out 1a6bc06 
   ql/src/test/results/clientpositive/bucketmapjoin11.q.out bd9b1fe 
   ql/src/test/results/clientpositive/bucketmapjoin12.q.out fc161a9 
   ql/src/test/results/clientpositive/bucketmapjoin13.q.out 30d8925 
   ql/src/test/results/clientpositive/bucketmapjoin2.q.out 7f3fb3e 
   ql/src/test/results/clientpositive/bucketmapjoin3.q.out 913e925 
   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 8105ba4 
   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 92c74a9 
   ql/src/test/results/clientpositive/bucketmapjoin9.q.out b7aec66 
   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1dd45d2 
   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 37f4a48 
   ql/src/test/results/clientpositive/join32.q.out 92d81b9 
   ql/src/test/results/clientpositive/join32_lessSize.q.out 82b3e4a 
   ql/src/test/results/clientpositive/join33.q.out 92d81b9 
   ql/src/test/results/clientpositive/sort_merge_join_desc_6.q.out f6aae06 
   ql/src/test/results/clientpositive/sort_merge_join_desc_7.q.out dbce51a 
   ql/src/test/results/clientpositive/stats11.q.out 57d2f9a 
   ql/src/test/results/clientpositive/union22.q.out bec39f4 
 
 Diff: https://reviews.apache.org/r/12690/diff/
 
 
 Testing
 ---
 
 All the hive unit tests passed.
 
 
 Thanks,
 
 John Pullokkaran
 




Re: Review Request 12705: HIVE-4878: With Dynamic partitioning, some queries would scan default partition even if query is not using it.

2013-07-22 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12705/#review23657
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
https://reviews.apache.org/r/12705/#comment47555

Why are we restricting this to strict mode? We should skip the default 
partition in all cases unless the user explicitly requests it. The assumption is 
that the default partition contains rows that were malformed in some way at load 
time and will be excluded from all further query processing.


- Ashutosh Chauhan


On July 17, 2013, 10:19 p.m., John Pullokkaran wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12705/
 ---
 
 (Updated July 17, 2013, 10:19 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 With dynamic partitioning, Hive would scan default partitions in some cases 
 even if the query excludes them. As part of partition pruning, the predicate is 
 narrowed down to the pieces that involve partition columns only. This 
 predicate is then evaluated against partition values to determine whether the 
 scan should include those partitions.
 But in some cases (like when comparing _HIVE_DEFAULT_PARTITION_ to numeric 
 data types) expression evaluation fails and returns NULL instead of 
 true/false. In such cases the partition is added to the unknown partitions, 
 which are subsequently scanned.
 
 This fix avoids scanning the default partition if all of the following are true:
 a) Hive dynamic partition mode is strict 
 (hive.exec.dynamic.partition.mode=strict).
 b) The partition pruning expression failed to evaluate for a given partition.
 c) At least one of the partition's columns has the default partition value.
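
 A minimal sketch of the three conditions listed above, assuming a single
 numeric predicate on a hypothetical partition column "year". This is not the
 PartitionPruner code: the class, method names, and sample data are all
 illustrative, and the predicate evaluator is reduced to one comparison that
 returns null when it cannot be evaluated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative only: models three-valued predicate evaluation during pruning.
final class DefaultPartitionPruningSketch {
    // Default value of hive.exec.default.partition.name.
    static final String DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    // Returns TRUE/FALSE, or null when the value cannot be compared numerically.
    static Boolean evalYearGreaterThan(Map<String, String> partSpec, int bound) {
        try {
            return Integer.parseInt(partSpec.get("year")) > bound;
        } catch (NumberFormatException e) {
            return null; // evaluation failed: result is "unknown"
        }
    }

    // Condition (c): some partition column holds the default partition value.
    static boolean hasDefaultPartitionValue(Map<String, String> partSpec) {
        return partSpec.containsValue(DEFAULT_PARTITION);
    }

    // Keeps partitions that match, plus "unknown" ones unless the skip rule applies.
    static List<Map<String, String>> prune(List<Map<String, String>> parts,
                                           int bound, boolean strictMode) {
        List<Map<String, String>> toScan = new ArrayList<>();
        for (Map<String, String> p : parts) {
            Boolean result = evalYearGreaterThan(p, bound);
            if (result == null) {
                // Conditions (a) + (b) + (c): skip the default partition.
                if (strictMode && hasDefaultPartitionValue(p)) {
                    continue;
                }
                toScan.add(p); // other unknown partitions are still scanned
            } else if (result) {
                toScan.add(p);
            }
        }
        return toScan;
    }

    // Hypothetical sample: two regular partitions and one default partition.
    static List<Map<String, String>> sample() {
        return Arrays.asList(
            Collections.singletonMap("year", "2013"),
            Collections.singletonMap("year", "2011"),
            Collections.singletonMap("year", DEFAULT_PARTITION));
    }

    public static void main(String[] args) {
        System.out.println(prune(sample(), 2012, true).size());
        System.out.println(prune(sample(), 2012, false).size());
    }
}
```

 The strictMode flag mirrors condition (a). Dropping it from the check would
 give the behavior the reviewer asks for, where the default partition is
 skipped regardless of mode.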
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
 6a4a360 
   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/12705/diff/
 
 
 Testing
 ---
 
 Hive Unit Tests Passed.
 
 
 Thanks,
 
 John Pullokkaran
 




Re: Review Request 12050: HIVE-3756 (LOAD DATA does not honor permission inheritance)

2013-07-25 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12050/#review23858
---

Ship it!


Ship It!

- Ashutosh Chauhan


On July 19, 2013, 6:55 p.m., Chaoyu Tang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/12050/
 ---
 
 (Updated July 19, 2013, 6:55 p.m.)
 
 
 Review request for hive, Ashutosh Chauhan and Sushanth Sowmyan.
 
 
 Bugs: HIVE- and HIVE-3756
 https://issues.apache.org/jira/browse/HIVE-
 https://issues.apache.org/jira/browse/HIVE-3756
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Problems:
 1. When doing load data or insert overwrite to a table, the data files under 
 the database/table directory do not inherit their parent's permissions (i.e. 
 group), as described in HIVE-3756.
 2. Besides the group issue, the read/write permission mode is also not 
 inherited.
 3. The same problem affects the partition files (see HIVE-3094).
 
 Cause:
 The task results (from load data or insert overwrite) are initially stored in 
 scratchdir and then loaded under warehouse table directory. FileSystem.rename 
 is used in this step (e.g. LoadTable/LoadPartition) to move the dirs/files 
 but it preserves their permissions (including group and mode) which are 
 determined by scratchdir permission or umask. If the scratchdir has different 
 permissions from those of warehouse table directories, the problem occurs.
 
 Solution:
 After FileSystem.rename is called, change all renamed (moved) 
 files/dirs to their destination parents' permissions if needed (say, if 
 hive.warehouse.subdir.inherit.perms is true). Here I introduced a new method, 
 renameFile, that does both the rename and the permission change. It replaces 
 the FileSystem.rename used in LoadTable/LoadPartition. I do not replace the 
 rename used to move files/dirs under the same scratchdir in the middle of task 
 processing. That does not look necessary to me, since they are temp files and 
 are also probably access-protected by the top scratchdir mode 700 (HIVE-4487).
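
 The rename-then-fix-permissions idea can be sketched as follows. This is an
 analogy on a local POSIX file system using java.nio.file, not the Hadoop
 FileSystem API and not the patch's renameFile. The class name, the demo paths,
 and the choice to drop execute bits on regular files are assumptions made for
 illustration; group inheritance and recursion into moved directories are left
 out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

// Illustrative only: a move preserves the source mode, so fix it up afterwards.
final class RenameWithInheritedPermsSketch {

    // Moves src to dst (dst must have a parent directory) and, if requested,
    // applies the destination parent's permission mode to the moved entry.
    static Path renameFile(Path src, Path dst, boolean inheritPerms) throws IOException {
        Path moved = Files.move(src, dst); // mode from the scratch dir is kept here
        if (inheritPerms) {
            Set<PosixFilePermission> perms = EnumSet.noneOf(PosixFilePermission.class);
            perms.addAll(Files.getPosixFilePermissions(dst.getParent()));
            if (Files.isRegularFile(moved)) {
                // Assumption: data files should not become executable.
                perms.remove(PosixFilePermission.OWNER_EXECUTE);
                perms.remove(PosixFilePermission.GROUP_EXECUTE);
                perms.remove(PosixFilePermission.OTHERS_EXECUTE);
            }
            Files.setPosixFilePermissions(moved, perms);
        }
        return moved;
    }

    // Builds a throwaway scratch/table layout and returns the moved file's mode.
    static String demo() {
        try {
            Path base = Files.createTempDirectory("inherit-sketch");
            Path scratch = Files.createDirectory(base.resolve("scratch"));
            Path table = Files.createDirectory(base.resolve("table"));
            Files.setPosixFilePermissions(table, PosixFilePermissions.fromString("rwxrwx---"));
            Path src = Files.write(scratch.resolve("part-0"), new byte[] {1});
            Files.setPosixFilePermissions(src, PosixFilePermissions.fromString("rw-------"));
            Path moved = renameFile(src, table.resolve("part-0"), true);
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(moved));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

 The inheritPerms flag plays the role that hive.warehouse.subdir.inherit.perms
 plays in the description above.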
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 87a584d 
 
 Diff: https://reviews.apache.org/r/12050/diff/
 
 
 Testing
 ---
 
 The following cases tested that all created subdirs/files inherit their 
 parents' permission mode and group in: 1) create database; 2) create 
 table; 3) load data; 4) insert overwrite; 5) partitions.
 {code}
 hive> dfs -ls -d /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:20 /user/tester1/hive

 hive> create database tester1 COMMENT 'Database for user tester1' LOCATION '/user/tester1/hive/tester1.db';
 hive> dfs -ls -R /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:21 /user/tester1/hive/tester1.db

 hive> use tester1;
 hive> create table tester1.tst1(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
 hive> dfs -ls -R /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db/tst1

 hive> load data local inpath '/home/tester1/tst1.input' into table tst1;
 hive> dfs -ls -R /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
 -rw-rw       3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input

 hive> create table tester1.tst2(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE;
 hive> dfs -ls -R /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
 -rw-rw       3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db/tst2

 hive> insert overwrite table tst2 select * from tst1;
 hive> dfs -ls -R /user/tester1/hive;
 drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db

Re: Adding to the hive contributor list

2013-08-14 Thread Ashutosh Chauhan
Done. Welcome Hari to the project.

Thanks,
Ashutosh


On Wed, Aug 14, 2013 at 10:32 AM, Hari Subramaniyan 
hsubramani...@hortonworks.com wrote:

 Hi,
 I would like to get added to contributor list.

 Thanks
 Hari

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.



Re: Adding WebHCat sub component to Hive project in ASF Jira

2013-08-16 Thread Ashutosh Chauhan
Done. Looking forward to contributions in that area!

Thanks,
Ashutosh


On Fri, Aug 16, 2013 at 11:44 AM, Eugene Koifman
ekoif...@hortonworks.com wrote:

 Hi,
 could somebody who has permissions to do so create WebHCat component under
 Hive?
 It will help track things.

 Thanks,
 Eugene




Re: Last time request for cwiki update privileges

2013-08-20 Thread Ashutosh Chauhan
Hi Sanjay,

Really sorry for that. I apologize for the delay. You are added now. Feel
free to make changes to make Hive even better!

Thanks,
Ashutosh


On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian 
sanjay.subraman...@wizecommerce.com wrote:

  Hey guys

  I can only think of two reasons why my request has not yet been accepted:

  1. The admins don't want to give me access

  2. The admins have not seen my mail yet.

  This is the fourth and the LAST time I am requesting permission to edit
 wiki docs…Nobody likes being ignored and that includes me.

  Meanwhile to show my thankfulness to the Hive community I shall continue
 to answer questions. There will be no change in that behavior.

  Regards

  sanjay




   From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Wednesday, August 14, 2013 3:52 PM
 To: u...@hive.apache.org u...@hive.apache.org
 Cc: dev@hive.apache.org dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

   Once again, I am down on my knees humbly calling upon the Hive Jedi
 Masters to please provide this padawan with cwiki update privileges

  May the Force be with u

  Thanks

  sanjay

   From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Reply-To: u...@hive.apache.org u...@hive.apache.org
 Date: Wednesday, July 31, 2013 9:38 AM
 To: u...@hive.apache.org u...@hive.apache.org
 Cc: dev@hive.apache.org dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

   Hi guys

  Any chance I could get cwiki update privileges today ?

  Thanks

  sanjay

   From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Tuesday, July 30, 2013 4:26 PM
 To: u...@hive.apache.org u...@hive.apache.org
 Cc: dev@hive.apache.org dev@hive.apache.org
 Subject: Review Request (wikidoc): LZO Compression in Hive

Hi

  Met with Lefty this afternoon and she was kind to spend time to add my
 documentation to the site - since I still don't have editing privileges :-)

  Please review the new wikidoc about LZO compression in the Hive language
 manual.  If anything is unclear or needs more information, you can email
 suggestions to this list or edit the wiki yourself (if you have editing
 privileges).  Here are the links:

 1. Language Manual
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual (new
 bullet under File Formats)
 2. LZO Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
 3. CREATE TABLE
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
 (near end of section, pasted in here:)

 Use STORED AS TEXTFILE if the data needs to be stored as plain text
 files. Use STORED AS SEQUENCEFILE if the data needs to be compressed.
 Please read more about CompressedStorage
 https://cwiki.apache.org/confluence/display/Hive/CompressedStorage if
 you are planning to keep data compressed in your Hive tables. Use
 INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding
 InputFormat and OutputFormat class as a string literal, e.g.,
 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
 For LZO compression, the values to use are 'INPUTFORMAT
 com.hadoop.mapred.DeprecatedLzoTextInputFormat OUTPUTFORMAT
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' (see LZO
 Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO).



   My cwiki id is
 https://cwiki.apache.org/confluence/display/~sanjaysubraman...@yahoo.com
 It will be great if I could get edit privileges

  Thanks
 sanjay

 CONFIDENTIALITY NOTICE
 ==
 This email message and any attachments are for the exclusive use of the
 intended recipient(s) and may contain confidential and privileged
 information. Any unauthorized review, use, disclosure or distribution is
 prohibited. If you are not the intended recipient, please contact the
 sender by reply email and destroy all copies of the original message along
 with any attachments, from your computer system. If you are the intended
 recipient, please be advised that the content of this message is subject to
 access, review and disclosure by the sender's Email System Administrator.




Re: Last time request for cwiki update privileges

2013-08-21 Thread Ashutosh Chauhan
Hey Mikhail,

Sure. What's your cwiki ID?

Thanks,
Ashutosh


On Wed, Aug 21, 2013 at 1:58 PM, Mikhail Antonov olorinb...@gmail.com wrote:

 Can I also get the edit privilege for wiki please?

 I'd like to add some details about LDAP authentication..

 Mikhail


 2013/8/21 Stephen Sprague sprag...@gmail.com

 Sanjay gets some love after all! :)


 On Tue, Aug 20, 2013 at 4:00 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.com wrote:

 Thanks Ashutosh

 From: Ashutosh Chauhan hashut...@apache.org
 Reply-To: u...@hive.apache.org
 Date: Tuesday, August 20, 2013 3:13 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Last time request for cwiki update privileges

 Hi Sanjay,

 Really sorry for that. I apologize for the delay. You are added now.
 Feel free to make changes to make Hive even better!

 Thanks,
 Ashutosh


 On Tue, Aug 20, 2013 at 2:39 PM, Sanjay Subramanian 
 sanjay.subraman...@wizecommerce.com wrote:
 Hey guys

 I can only think of two reasons why my request has not yet been accepted:

 1. The admins don't want to give me access

 2. The admins have not seen my mail yet.

 This is the fourth and the LAST time I am requesting permission to edit
 wiki docs…Nobody likes being ignored and that includes me.

 Meanwhile to show my thankfulness to the Hive community I shall continue
 to answer questions. There will be no change in that behavior.

 Regards

 sanjay




 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Wednesday, August 14, 2013 3:52 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Once again, I am down on my knees humbly calling upon the Hive Jedi
 Masters to please provide this padawan with cwiki update privileges

 May the Force be with u

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Reply-To: u...@hive.apache.org
 Date: Wednesday, July 31, 2013 9:38 AM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Re: Review Request (wikidoc): LZO Compression in Hive

 Hi guys

 Any chance I could get cwiki update privileges today ?

 Thanks

 sanjay

 From: Sanjay Subramanian sanjay.subraman...@wizecommerce.com
 Date: Tuesday, July 30, 2013 4:26 PM
 To: u...@hive.apache.org
 Cc: dev@hive.apache.org
 Subject: Review Request (wikidoc): LZO Compression in Hive

 Hi

 Met with Lefty this afternoon and she was kind to spend time to add my
 documentation to the site - since I still don't have editing privileges :-)

 Please review the new wikidoc about LZO compression in the Hive language
 manual.  If anything is unclear or needs more information, you can email
 suggestions to this list or edit the wiki yourself (if you have editing
 privileges).  Here are the links:

   1.  Language Manual
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual (new
 bullet under File Formats)
   2.  LZO Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
   3.  CREATE TABLE
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
 (near end of section, pasted in here:)
 Use STORED AS TEXTFILE if the data needs to be stored as plain text
 files. Use STORED AS SEQUENCEFILE if the data needs to be compressed.
 Please read more about CompressedStorage
 https://cwiki.apache.org/confluence/display/Hive/CompressedStorage if
 you are planning to keep data compressed in your Hive tables. Use
 INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding
 InputFormat and OutputFormat class as a string literal, e.g.,
 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.
 For LZO compression, the values to use are 'INPUTFORMAT
 com.hadoop.mapred.DeprecatedLzoTextInputFormat OUTPUTFORMAT
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' (see LZO
 Compression
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO).


 My cwiki id is
 https://cwiki.apache.org/confluence/display/~sanjaysubraman

Re: Last time request for cwiki update privileges

2013-08-21 Thread Ashutosh Chauhan
I am not able to find this ID in cwiki. Did you create an account on
cwiki.apache.org?

On Wed, Aug 21, 2013 at 2:59 PM, Mikhail Antonov olorinb...@gmail.com wrote:

 mantonov


Re: LIKE filter pushdown for tables and partitions

2013-08-26 Thread Ashutosh Chauhan
A couple of questions:

1. What about the LIKE operator for Hive itself? Will that continue to work
(presumably because there is an alternative path for that)?
2. This will nonetheless break other direct consumers of the metastore client
API (like HCatalog).

I see your point that we have a buggy implementation, so what's out there is
not safe to use. The question then really is: shall we remove this code, thereby
breaking people for whom the current buggy implementation is good enough (or,
you could say, saving them from breaking in the future)? Or shall we try to fix
it now?
My take is that if there are no users of this anyway, then there is no point
fixing it for non-existent users, but if there are, we probably have to. I
suggest you send an email to users@hive to ask if there are users
of this.

Thanks,
Ashutosh



On Mon, Aug 26, 2013 at 2:08 PM, Sergey Shelukhin ser...@hortonworks.com wrote:

 Since there's no response I am assuming nobody cares about this code...
 Jira is HIVE-5134, I will attach a patch with removal this week.

 On Wed, Aug 21, 2013 at 2:28 PM, Sergey Shelukhin ser...@hortonworks.com
 wrote:

  Hi.
 
  I think there are issues with the way Hive can currently do LIKE
  operator JDO pushdown, and that the code should be removed for partitions
  and tables.
  Are there objections to removing LIKE from Filter.g and related areas?
  If not, I will file a JIRA and do it.
 
  Details:
  There's code in metastore that is capable of pushing down LIKE
  expression into JDO for string partition keys, as well as tables.
  The code for tables doesn't appear used, and partition code definitely
  doesn't run in Hive proper because metastore client doesn't send LIKE
  expressions to server. It may be used in e.g. HCat and other places,
  but after asking some people here, I found out it probably isn't.
  I was trying to make it run and noticed some problems:
  1) For partitions, Hive sends SQL patterns in a filter for like, e.g.
  %foo%, whereas metastore passes them into matches() JDOQL method
  which expects Java regex.
  2) Converting the pattern to Java regex via UDFLike method, I found
  out that not all regexes appear to work in DN. .*foo seems to work
  but anything complex (such as escaping the pattern using
  Pattern.quote, which UDFLike does) breaks and no longer matches
  properly.
  3) I tried to implement common cases using JDO methods
  startsWith/endsWith/indexOf (I will file a JIRA), but when I run tests
  on Derby, they also appear to have problems with some strings. For
  example, a partition with a backslash in the name cannot be matched by
  LIKE %\% (single backslash in a string) after being converted to
  .indexOf(param) where param is \ (escaping the backslash once again
  doesn't work either, and anyway there's no documented reason why it
  shouldn't work properly), while other characters match correctly, even
  e.g. %.
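
  To illustrate point 1 above: a SQL LIKE pattern handed unchanged to a Java
  regex matcher does not behave like LIKE, so some translation step is needed.
  The sketch below is a standalone illustration, not UDFLike and not the
  metastore code. The class and method names are hypothetical, and treating
  backslash as the LIKE escape character is an assumption.

```java
import java.util.regex.Pattern;

// Illustrative only: translates a SQL LIKE pattern into a Java regex.
final class LikePatternSketch {

    // '%' becomes ".*", '_' becomes ".", everything else is quoted literally.
    // A backslash escapes the next character (assumed escape convention).
    static String likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (escaped) {
                regex.append(Pattern.quote(String.valueOf(c)));
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        if (escaped) {
            regex.append(Pattern.quote("\\")); // trailing lone backslash is literal
        }
        return regex.toString();
    }

    // Full-string match, as LIKE requires; DOTALL lets '%' span newlines.
    static boolean like(String value, String pattern) {
        return Pattern.compile(likeToRegex(pattern), Pattern.DOTALL)
                      .matcher(value).matches();
    }

    public static void main(String[] args) {
        // Raw LIKE pattern used as a regex: '%' is just a literal character.
        System.out.println("xfoox".matches("%foo%")); // false
        // Translated pattern: behaves like LIKE.
        System.out.println(like("xfoox", "%foo%"));   // true
    }
}
```

  Quoting each literal character separately keeps the generated regex simple,
  but as point 2 notes, whether a given backend accepts quoted regexes is a
  separate question that this sketch does not address.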
 
  For tables, there's no SQL-like, it expects Java regex, but I am not
  convinced all Java regexes are going to work.
 
  So, I think that for future correctness sake it's better to remove this
  code.
 




Re: Review Request 13862: [HIVE-5149] ReduceSinkDeDuplication can pick the wrong partitioning columns

2013-09-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13862/#review25818
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
https://reviews.apache.org/r/13862/#comment50365

else { throw new SemanticException("Not able to correctly identify " +
    "partitioning columns. Hint: Try hive.optimize.reducededuplication=false;"); }


Thanks for adding comments!

- Ashutosh Chauhan


On Aug. 30, 2013, 3:29 p.m., Yin Huai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/13862/
 ---
 
 (Updated Aug. 30, 2013, 3:29 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5149
 https://issues.apache.org/jira/browse/HIVE-5149
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://mail-archives.apache.org/mod_mbox/hive-user/201308.mbox/%3CCAG6Lhyex5XPwszpihKqkPRpzri2k=m4qgc+cpar5yvr8sjt...@mail.gmail.com%3E
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
  c380a2d 
   ql/src/test/results/clientpositive/groupby2_map_skew.q.out da7a128 
   ql/src/test/results/clientpositive/groupby_cube1.q.out a52f4eb 
   ql/src/test/results/clientpositive/groupby_rollup1.q.out f120471 
   ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 
 3297ebb 
 
 Diff: https://reviews.apache.org/r/13862/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Yin Huai
 




Re: Review Request 13862: [HIVE-5149] ReduceSinkDeDuplication can pick the wrong partitioning columns

2013-09-01 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13862/#review25819
---



ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
https://reviews.apache.org/r/13862/#comment50366

In here: if (result[0] = 0) throw new SemanticException("Sort columns and " +
    "order don't match. Try hive.optimize.reducededuplication=false;");


Another sanity check.

- Ashutosh Chauhan


On Aug. 30, 2013, 3:29 p.m., Yin Huai wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/13862/
 ---
 
 (Updated Aug. 30, 2013, 3:29 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5149
 https://issues.apache.org/jira/browse/HIVE-5149
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://mail-archives.apache.org/mod_mbox/hive-user/201308.mbox/%3CCAG6Lhyex5XPwszpihKqkPRpzri2k=m4qgc+cpar5yvr8sjt...@mail.gmail.com%3E
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
  c380a2d 
   ql/src/test/results/clientpositive/groupby2_map_skew.q.out da7a128 
   ql/src/test/results/clientpositive/groupby_cube1.q.out a52f4eb 
   ql/src/test/results/clientpositive/groupby_rollup1.q.out f120471 
   ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 
 3297ebb 
 
 Diff: https://reviews.apache.org/r/13862/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Yin Huai
 



