[jira] [Updated] (HIVE-4141) InspectorFactories contains static HashMaps which can cause infinite loop

2013-03-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4141:
---

Attachment: HIVE-4141-2.patch

Uploaded this to RB a few days back but forgot to upload here.

 InspectorFactories contains static HashMaps which can cause infinite loop
 -

 Key: HIVE-4141
 URL: https://issues.apache.org/jira/browse/HIVE-4141
 Project: Hive
  Issue Type: Sub-task
  Components: Server Infrastructure
Reporter: Brock Noland
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4141-1.patch, HIVE-4141-2.patch


 When many clients hit hs2, hs2 can get stuck in an infinite loop due to 
 concurrent modification of the static maps here:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java
 and in other ObjectFactories. 
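For illustration, here is a minimal, hypothetical sketch (not the actual LazyObjectInspectorFactory code; class and method names are made up) of why a static, unsynchronized HashMap cache can hang under concurrent puts, and the ConcurrentHashMap pattern that avoids it:

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical cache illustrating the pattern; it does not mirror the real Hive classes.
public final class CachedInspectorFactory {

  // A plain static HashMap here is unsafe: concurrent put() calls can corrupt
  // the bucket chains and leave a later get() spinning in an infinite loop.
  // ConcurrentHashMap makes the lookups and inserts thread-safe.
  private static final ConcurrentMap<String, Object> CACHE =
      new ConcurrentHashMap<String, Object>();

  private CachedInspectorFactory() {
  }

  public static Object getInspector(String key) {
    Object inspector = CACHE.get(key);
    if (inspector == null) {
      Object created = createInspector(key);
      // putIfAbsent keeps the first instance if two threads race to create one.
      Object existing = CACHE.putIfAbsent(key, created);
      inspector = (existing == null) ? created : existing;
    }
    return inspector;
  }

  private static Object createInspector(String key) {
    return new Object(); // stand-in for building a real ObjectInspector
  }
}
{code}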

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-4141) InspectorFactories contains static HashMaps which can cause infinite loop

2013-03-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland reassigned HIVE-4141:
--

Assignee: Brock Noland

 InspectorFactories contains static HashMaps which can cause infinite loop
 -

 Key: HIVE-4141
 URL: https://issues.apache.org/jira/browse/HIVE-4141
 Project: Hive
  Issue Type: Sub-task
  Components: Server Infrastructure
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4141-1.patch, HIVE-4141-2.patch


 When many clients hit hs2, hs2 can get stuck in an infinite loop due to 
 concurrent modification of the static maps here:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java
 and in other ObjectFactories. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4141) InspectorFactories contains static HashMaps which can cause infinite loop

2013-03-15 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1360#comment-1360
 ] 

Brock Noland commented on HIVE-4141:


Sorry, it looks like the first patch was committed. The differences were only style 
improvements over the original code, so I don't think this is an issue. I'll 
file a follow-up JIRA to implement the style improvements requested by Owen in 
the review.

 InspectorFactories contains static HashMaps which can cause infinite loop
 -

 Key: HIVE-4141
 URL: https://issues.apache.org/jira/browse/HIVE-4141
 Project: Hive
  Issue Type: Sub-task
  Components: Server Infrastructure
Reporter: Brock Noland
Priority: Blocker
 Fix For: 0.11.0

 Attachments: HIVE-4141-1.patch, HIVE-4141-2.patch


 When many clients hit hs2, hs2 can get stuck in an infinite loop due to 
 concurrent modification of the static maps here:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java
 and in other ObjectFactories. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4182) doAS does not work with HiveServer2 in non-kerberos mode with local job

2013-03-15 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-4182:
---

 Summary: doAS does not work with HiveServer2 in non-kerberos mode 
with local job
 Key: HIVE-4182
 URL: https://issues.apache.org/jira/browse/HIVE-4182
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair


When HiveServer2 is configured without Kerberos security enabled, and the query 
gets launched as a local map-reduce job, the job runs as the user the Hive server 
is running as, instead of the user who submitted the query.
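For context, services on Hadoop usually implement this kind of impersonation by running the work inside a proxy-user UserGroupInformation.doAs block. The sketch below shows that generic pattern only; it is not the HIVE-4182 patch, and queryUser/work are placeholder names:

{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

public class DoAsSketch {
  // Runs the given work as the connected end user instead of the service user.
  // queryUser is whatever user name the client session supplied.
  public static void runAs(String queryUser, final Runnable work) throws Exception {
    UserGroupInformation proxyUser = UserGroupInformation.createProxyUser(
        queryUser, UserGroupInformation.getLoginUser());
    proxyUser.doAs(new PrivilegedExceptionAction<Void>() {
      public Void run() {
        work.run(); // e.g. launch the local map-reduce job here
        return null;
      }
    });
  }
}
{code}

Proxy-user impersonation also depends on the usual hadoop.proxyuser.* host/group settings being granted to the server's user on the cluster side.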


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4182) doAS does not work with HiveServer2 in non-kerberos mode with local job

2013-03-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4182:


Attachment: HIVE-4168.1.patch

HIVE-4168.1.patch - initial patch

 doAS does not work with HiveServer2 in non-kerberos mode with local job
 ---

 Key: HIVE-4182
 URL: https://issues.apache.org/jira/browse/HIVE-4182
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
  Labels: HiveServer2
 Attachments: HIVE-4168.1.patch


 When HiveServer2 is configured without Kerberos security enabled, and the 
 query gets launched as a local map-reduce job, the job runs as the user the 
 Hive server is running as, instead of the user who submitted the query.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1662) Add file pruning into Hive.

2013-03-15 Thread Rajat Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603487#comment-13603487
 ] 

Rajat Jain commented on HIVE-1662:
--

We are following this patch because we want to implement something similar. 
Is it possible to do the pruning much earlier than this step, in 
GenMapRedUtils.setTaskPlan? That would take care of the reducer estimation as 
well. Are there any issues with this approach?

 Add file pruning into Hive.
 ---

 Key: HIVE-1662
 URL: https://issues.apache.org/jira/browse/HIVE-1662
 Project: Hive
  Issue Type: New Feature
Reporter: He Yongqiang
Assignee: Navis
 Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, 
 HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch


 Hive now supports a filename virtual column. 
 If a file name filter is present in a query, Hive should be able to add only 
 the files that pass the filter to the input paths.
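A rough sketch of what pruning at the input-path level could look like, assuming a filename predicate has already been extracted from the query; it uses generic Hadoop APIs and is not the HIVE-1662 patch itself:

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class FilePruningSketch {
  // Adds only the files whose names pass the filter to the job's input paths.
  public static void addMatchingFiles(JobConf job, Path dir, String nameSubstring)
      throws IOException {
    FileSystem fs = dir.getFileSystem(job);
    for (FileStatus status : fs.listStatus(dir)) {
      if (status.getPath().getName().contains(nameSubstring)) {
        FileInputFormat.addInputPath(job, status.getPath());
      }
    }
  }
}
{code}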

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4182) doAS does not work with HiveServer2 in non-kerberos mode with local job

2013-03-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4182:


Attachment: (was: HIVE-4168.1.patch)

 doAS does not work with HiveServer2 in non-kerberos mode with local job
 ---

 Key: HIVE-4182
 URL: https://issues.apache.org/jira/browse/HIVE-4182
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
  Labels: HiveServer2

 When HiveServer2 is configured without Kerberos security enabled, and the 
 query gets launched as a local map-reduce job, the job runs as the user the 
 Hive server is running as, instead of the user who submitted the query.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4182) doAS does not work with HiveServer2 in non-kerberos mode with local job

2013-03-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4182:


Attachment: HIVE-4182.1.patch

HIVE-4182.1.patch - the correct (initial) patch

 doAS does not work with HiveServer2 in non-kerberos mode with local job
 ---

 Key: HIVE-4182
 URL: https://issues.apache.org/jira/browse/HIVE-4182
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
  Labels: HiveServer2
 Attachments: HIVE-4182.1.patch


 When HiveServer2 is configured without Kerberos security enabled, and the 
 query gets launched as a local map-reduce job, the job runs as the user the 
 Hive server is running as, instead of the user who submitted the query.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3963) Allow Hive to connect to RDBMS

2013-03-15 Thread Maxime LANCIAUX (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603506#comment-13603506
 ] 

Maxime LANCIAUX commented on HIVE-3963:
---

I will work on improving the implementation soon, so I am looking for any advice 
(especially on how to remove the need for the DUAL table)! Thanks.

 Allow Hive to connect to RDBMS
 --

 Key: HIVE-3963
 URL: https://issues.apache.org/jira/browse/HIVE-3963
 Project: Hive
  Issue Type: New Feature
  Components: Import/Export, JDBC, SQL, StorageHandler
Affects Versions: 0.10.0, 0.9.1, 0.11.0
Reporter: Maxime LANCIAUX
 Fix For: 0.10.1

 Attachments: patchfile


 I am thinking about something like:
 SELECT jdbcload('driver','url','user','password','sql') FROM dual;
 There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
 JDBCStorageHandler

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4041) Support multiple partitionings in a single Query

2013-03-15 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603591#comment-13603591
 ] 

Phabricator commented on HIVE-4041:
---

ashutoshc has commented on the revision HIVE-4041 [jira] Support multiple 
partitionings in a single Query.

  Some more questions.

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:415
 I see. I thought that with the following query I could simulate the same 
problem even on trunk.
  select 1 from over10k group by 1;

  But this didn't result in an NPE and the query ran successfully. Is this query 
a good approximation for exercising this path? My motivation is to somehow 
simulate this code path without an over clause and thus expose the bug on trunk 
and fix it there, so we don't need to do this in the branch.
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java:212
 Hmm. I think we hold on to the schema for PTFOp way too early, in the semantic 
phase. Apart from the changes required here, this holding on to the schema does 
not play well with the other compile-time optimizations which Hive does after 
semantic analysis. Other operators don't do this. I think we need to spend a 
bit of time on this. Can you point me to where we hold on to the schema in 
SemanticAnalyzer and why it is necessary?
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java:482 I am fine 
with doing it in a follow-up, but if possible we should get rid of this. It 
probably results in a runtime perf impact, since I think it will force a Hadoop 
secondary sort so that the values for a given key come out sorted. Further, adding 
extra constraints will lessen the opportunity to do compile-time optimizations 
like filter pushdown (see my comments on HIVE-4180).
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingComponentizer.java:38 It 
would be good to define "group" more concretely. If I am getting this right, it 
is a group of over functions which have the same partitioning. Is that correct?
  So a group may have multiple functions associated with it (but all on the same 
partitioning). Does a group then map to one PTFOp on which multiple functions 
work? Or does a group imply multiple PTFOps chained in the same reducer, one 
after the other, each working on its own function?
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingComponentizer.java:85 
Which filter is this? Is it the having clause? I thought we already removed 
support for that; if not, I think we should. Or is it the regular where clause? 
If the latter, we should not consume other operators of the query in PTFOperator.
  ql/src/test/queries/clientpositive/windowing_multipartitioning.q:21 It would 
be good to add more tests from the Google document which I shared with you. It 
has multipartitioning tests towards the end.

REVISION DETAIL
  https://reviews.facebook.net/D9381

To: JIRA, ashutoshc, hbutani


 Support multiple partitionings in a single Query
 

 Key: HIVE-4041
 URL: https://issues.apache.org/jira/browse/HIVE-4041
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-4041.D9381.1.patch, WindowingComponentization.pdf


 Currently we disallow queries if the partition specifications of all Wdw fns 
 are not the same. We can relax this by generating multiple PTFOps based on 
 the unique partitionings in a Query. For partitionings that only differ in 
 sort, we can introduce a sort step in between PTFOps, which can happen in the 
 same Reduce task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-4147) Slow Hive JDBC in concurrency mode to create/drop table

2013-03-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland reassigned HIVE-4147:
--

Assignee: Brock Noland

 Slow Hive JDBC in concurrency mode to create/drop table
 ---

 Key: HIVE-4147
 URL: https://issues.apache.org/jira/browse/HIVE-4147
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Alexander Alten-Lorenz
Assignee: Brock Noland

 Using Hive JDBC in concurrency mode to create/drop tables is very slow, 
 about 20 times slower than using HiveMetaStoreClient.
 Test steps: 
 1. Create 100 different Hive tables one by one using Hive JDBC: create table ...
 2. Drop the tables one by one using Hive JDBC: drop table ... and time it.
 3. Create 100 different Hive tables one by one using Hive JDBC: create table ...
 4. Drop the tables one by one using new 
 HiveMetaStoreClient().dropTable(default, table_name) and time it.
 Results: 
 Step 2 is 20 times slower than step 4.
 Basically Hive JDBC is 20 times slower than HiveMetaStoreClient, not only for 
 create/drop table but also for other calls of the same kind.
 Dropping tables via this low-level API could cause issues if any clients are 
 concurrently querying the table. 
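The comparison described above, as a minimal sketch; the connection URL, table names, and the HiveMetaStoreClient constructor/close calls are assumptions based on 0.10-era APIs, so treat this as an outline of the measurement rather than a ready-made benchmark:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class DropTableTimingSketch {
  public static void main(String[] args) throws Exception {
    // Path 1: drop through the Hive JDBC driver (the slow path reported here).
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
    Statement stmt = conn.createStatement();
    long jdbcStart = System.currentTimeMillis();
    stmt.execute("DROP TABLE IF EXISTS tmp_timing_test");
    System.out.println("jdbc drop took " + (System.currentTimeMillis() - jdbcStart) + " ms");
    conn.close();

    // Path 2: drop directly through the metastore client (the fast path).
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    long msStart = System.currentTimeMillis();
    client.dropTable("default", "tmp_timing_test2");
    System.out.println("metastore drop took " + (System.currentTimeMillis() - msStart) + " ms");
    client.close();
  }
}
{code}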

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603604#comment-13603604
 ] 

Gang Tim Liu commented on HIVE-4145:


+1

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4183) Implement Style changes to InspectorFactories

2013-03-15 Thread Brock Noland (JIRA)
Brock Noland created HIVE-4183:
--

 Summary: Implement Style changes to InspectorFactories
 Key: HIVE-4183
 URL: https://issues.apache.org/jira/browse/HIVE-4183
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Trivial


In HIVE-4141 we updated the InspectorFactories to use concurrent data 
structures. In the review, Owen had requested some style updates: the 
original code typed the variables as HashMap and I just modified them to be 
ConcurrentHashMap. I updated the review item but forgot to attach the patch to 
the JIRA, so the patch which was committed did not have these style updates.
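The kind of declaration-level tweak being discussed, shown as a guess at the intent (the thread does not spell out Owen's exact request; the field name here is illustrative): type the field against the ConcurrentMap interface and keep the concrete class only at construction time.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class InspectorCacheStyleExample {
  // Before: the field named the concrete implementation type directly.
  // static final ConcurrentHashMap<String, Object> cachedInspectors =
  //     new ConcurrentHashMap<String, Object>();

  // After: declare against the interface; only the construction names the
  // concrete class, which keeps future implementation swaps to a single line.
  static final ConcurrentMap<String, Object> cachedInspectors =
      new ConcurrentHashMap<String, Object>();
}
{code}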

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4147) Slow Hive JDBC in concurrency mode to create/drop table

2013-03-15 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4147:
-

Component/s: HiveServer2

 Slow Hive JDBC in concurrency mode to create/drop table
 ---

 Key: HIVE-4147
 URL: https://issues.apache.org/jira/browse/HIVE-4147
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Alexander Alten-Lorenz
Assignee: Brock Noland

 Using Hive JDBC in concurrency mode to create/drop tables is very slow, 
 about 20 times slower than using HiveMetaStoreClient.
 Test steps: 
 1. Create 100 different Hive tables one by one using Hive JDBC: create table ...
 2. Drop the tables one by one using Hive JDBC: drop table ... and time it.
 3. Create 100 different Hive tables one by one using Hive JDBC: create table ...
 4. Drop the tables one by one using new 
 HiveMetaStoreClient().dropTable(default, table_name) and time it.
 Results: 
 Step 2 is 20 times slower than step 4.
 Basically Hive JDBC is 20 times slower than HiveMetaStoreClient, not only for 
 create/drop table but also for other calls of the same kind.
 Dropping tables via this low-level API could cause issues if any clients are 
 concurrently querying the table. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-4184:
-

 Summary: Document HiveServer2 setup under the admin documentation 
on hive wiki 
 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar


The setup/configuration instructions for HiveServer are available on  
https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4185) Document HiveServer2 JDBC and Beeline CLI in the user documentation

2013-03-15 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-4185:
-

 Summary: Document HiveServer2 JDBC and Beeline CLI in the user 
documentation 
 Key: HIVE-4185
 URL: https://issues.apache.org/jira/browse/HIVE-4185
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar


The user documentation 
https://cwiki.apache.org/confluence/display/Hive/HiveClient includes 
information about client connection to thrift server.
We need to include HiveServer2 client details (JDBC, Beeline)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4186) NPE in ReduceSinkDeDuplication

2013-03-15 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-4186:


Attachment: HIVE-4186.1.patch.txt

 NPE in ReduceSinkDeDuplication
 --

 Key: HIVE-4186
 URL: https://issues.apache.org/jira/browse/HIVE-4186
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
 Attachments: HIVE-4186.1.patch.txt


 When you have a sequence of ReduceSinks on constants you get this error:
 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.optimizer.ReduceSinkDeDuplication$ReduceSinkDeduplicateProcFactory$ReducerReducerProc.getPartitionAndKeyColumnMapping(ReduceSinkDeDuplication.java:416)
 {noformat}
 The example that generates this is:
 {noformat}
 select p_name from (select p_name from part distribute by 1 sort by 1) p 
 distribute by 1 sort by 1
 {noformat}
 Sorry for the contrived example, but this actually happens when we stack 
 windowing clauses (see the PTF-Windowing branch).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4186) NPE in ReduceSinkDeDuplication

2013-03-15 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603728#comment-13603728
 ] 

Harish Butani commented on HIVE-4186:
-

patch is attached.

 NPE in ReduceSinkDeDuplication
 --

 Key: HIVE-4186
 URL: https://issues.apache.org/jira/browse/HIVE-4186
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
 Attachments: HIVE-4186.1.patch.txt


 When you have a sequence of ReduceSinks on constants you get this error:
 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.optimizer.ReduceSinkDeDuplication$ReduceSinkDeduplicateProcFactory$ReducerReducerProc.getPartitionAndKeyColumnMapping(ReduceSinkDeDuplication.java:416)
 {noformat}
 The example that generates this is:
 {noformat}
 select p_name from (select p_name from part distribute by 1 sort by 1) p 
 distribute by 1 sort by 1
 {noformat}
 Sorry for the contrived example, but this actually happens when we stack 
 windowing clauses (see the PTF-Windowing branch).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Carl Steinbach (JIRA)
Carl Steinbach created HIVE-4187:


 Summary: QL build-grammar target fails after HIVE-4148
 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603732#comment-13603732
 ] 

Prasad Mujumdar commented on HIVE-4184:
---

  HiveServer2 (HS2) is a Thrift-based server that enables remote clients to 
execute queries against Hive and retrieve the results. It's an improved version 
of HiveServer that supports multi-client concurrency and authentication. It 
uses a new Thrift interface with concurrency support and is designed to 
provide better support for open-API clients like JDBC and ODBC. The Thrift IDL 
is available at 
https://github.com/apache/hive/blob/trunk/service/if/TCLIService.thrift

How to configure -
  Configuration properties in hive-site.xml
hive.server2.thrift.min.worker.threads - Minimum number of worker threads, 
default 5.
hive.server2.thrift.max.worker.threads - Maximum number of worker threads, 
default 100.
hive.server2.thrift.port - TCP port to listen on, default 10000.
hive.server2.thrift.bind.host - TCP interface to bind to.

  Env 
HIVE_SERVER2_THRIFT_BIND_HOST - optional TCP host interface to bind to. 
Overrides the config file setting.
HIVE_SERVER2_THRIFT_PORT - optional TCP port to listen on, default 10000. 
Overrides the config file setting.

How to start
  hiveserver2.sh
  OR
  hive --service hiveserver2

Authentication -
  HiveServer2 supports Anonymous (no auth), Kerberos, pass-through LDAP, and 
pluggable custom authentication.
  Configuration -
 hive.server2.authentication - Authentication mode, default NONE. Options 
are NONE, KERBEROS, LDAP and CUSTOM
 hive.server2.authentication.kerberos.principal - Kerberos principal for 
server
 hive.server2.authentication.kerberos.keytab - Keytab for server principal
 hive.server2.authentication.ldap.url - LDAP url
 hive.server2.authentication.ldap.baseDN - LDAP base DN
 hive.server2.custom.authentication.class - Custom authentication class 
that implements org.apache.hive.service.auth.PasswdAuthenticationProvider 
interface

Impersonation -
   By default HiveServer2 performs the query processing as the user that 
started the server process. It can be configured to impersonate the connected 
user instead. 
   Configuration -
  hive.server2.enable.impersonation - Impersonate the connected user, 
default false
   OR
  hive.server2.enable.doAs - Impersonate the connected user, default false
  fs.hdfs.impl.disable.cache - Disable filesystem cache, default false
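To make the CUSTOM mode concrete, here is a hedged sketch of a pluggable authenticator. It assumes the PasswdAuthenticationProvider interface named above exposes a single Authenticate(user, password) method, and the hard-coded credential check is obviously illustrative only:

{code}
import javax.security.sasl.AuthenticationException;

import org.apache.hive.service.auth.PasswdAuthenticationProvider;

// Wire it up with:
//   hive.server2.authentication = CUSTOM
//   hive.server2.custom.authentication.class = SampleAuthenticator
public class SampleAuthenticator implements PasswdAuthenticationProvider {

  @Override
  public void Authenticate(String user, String password) throws AuthenticationException {
    // Replace this check with a lookup against a real credential store.
    if (!"hive".equals(user) || !"secret".equals(password)) {
      throw new AuthenticationException("Invalid credentials for user " + user);
    }
  }
}
{code}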


 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603735#comment-13603735
 ] 

Carl Steinbach commented on HIVE-4187:
--

On my machine the build consistently fails during execution of the ql 
build-grammar target
with the following error:

{code}
build-grammar:
 [echo] Project: ql
 [echo] Building Grammar 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  
 [java] ANTLR Parser Generator  Version 3.0.1 (August 13, 2007)  1989-2007
 [java] 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
 [java] 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
 [java] error(100): 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:27:8:
 syntax error: antlr: 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:27:8:
 unexpected token: SelectClauseParser
 [java] error(100): 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:27:44:
 syntax error: antlr: 
/Users/carl/Work/repos/hive-test/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:27:44:
 unexpected token: ,
{code}

This error was introduced by HIVE-4148. It appears that the changes in
HIVE-4148 cause Ant to put antlr-3.0.1.jar on the classpath instead of
antlr-3.4.jar

I'm using Ant 1.8.1


 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4041) Support multiple partitionings in a single Query

2013-03-15 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603738#comment-13603738
 ] 

Phabricator commented on HIVE-4041:
---

hbutani has commented on the revision HIVE-4041 [jira] Support multiple 
partitionings in a single Query.

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java:212
 Yes, the translation code could do with your review.
  We were not paying much attention to optimization at the time we wrote it.
  So the TableFuncDef holds onto ShapeDetails (input, output, ...).
  The Shape class has SerDe props that we use to reconstruct the OIs at runtime.
  This happens in PTFTranslator; read the translate method for WindowingSpec 
(line 138) in PTFTranslator.
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java:415
 Just added a Jira 4186 for this.
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java:482 Yes, I 
agree. I just don't want to make these changes in this JIRA; I want to only add 
multi-partition support here.
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingComponentizer.java:38 I 
will add more comments. So there is:
  - 1 PTFOp
  - It can contain one or more PTF invocations.
  - When the PTF is WindowTableFunc, it can contain 1 or more UDAFs; they have 
the same partitioning.
  - During translation we create a WindowingSpec for each destination with 
Windowing(over clauses).
  - Here we then componentize the single WindowingSpec into multiple 
WindowingSpecs based on the partitioning.
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingComponentizer.java:85 
Yes the having is to be removed. Haven't gotten around to it. Again didn't want 
to make this change in this Jira. It is on my todo.
  ql/src/test/queries/clientpositive/windowing_multipartitioning.q:21 I added 
all the multipartition tests from the spreadsheet except the ones that have no 
order. Those I will add once we resolve how to handle no order.

REVISION DETAIL
  https://reviews.facebook.net/D9381

To: JIRA, ashutoshc, hbutani


 Support multiple partitionings in a single Query
 

 Key: HIVE-4041
 URL: https://issues.apache.org/jira/browse/HIVE-4041
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-4041.D9381.1.patch, WindowingComponentization.pdf


 Currently we disallow queries if the partition specifications of all Wdw fns 
 are not the same. We can relax this by generating multiple PTFOps based on 
 the unique partitionings in a Query. For partitionings that only differ in 
 sort, we can introduce a sort step in between PTFOps, which can happen in the 
 same Reduce task.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603739#comment-13603739
 ] 

Prasad Mujumdar commented on HIVE-4184:
---

JDBC -
   Hive includes a new JDBC client driver for HiveServer2. It supports both 
embedded and remote access to HiveServer2.
The JDBC connection URL has the prefix jdbc:hive2:// and the driver class 
is org.apache.hive.jdbc.HiveDriver. Note that this is different from the old 
HiveServer. For a remote server, the URL format is 
jdbc:hive2://host:port/db. For an embedded server, the URL format is 
jdbc:hive2:// (no host or port).
When connecting to HiveServer2 with Kerberos authentication, the URL format is 
jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2. 
The client needs to have a valid Kerberos ticket in the ticket cache before 
connecting. In the case of LDAP or custom pass-through authentication, the client 
needs to pass a valid user name and password to the JDBC connection API.

Beeline Shell -
  Hive includes a new command shell, Beeline, that works with HiveServer2. It's a 
JDBC client based on the SQLLine CLI (http://sqlline.sourceforge.net/). 
Detailed documentation of the SQLLine options is available at 
http://sqlline.sourceforge.net/#manual and is applicable to Beeline.
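A short Java example of the remote connection flow described above (host, port, database, and credentials are placeholders):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveServer2JdbcExample {
  public static void main(String[] args) throws Exception {
    // Register the HiveServer2 driver (needed on pre-JDBC4 setups).
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    // Remote URL form: jdbc:hive2://<host>:<port>/<db>
    Connection conn = DriverManager.getConnection(
        "jdbc:hive2://localhost:10000/default", "user", "password");
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("SHOW TABLES");
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    rs.close();
    stmt.close();
    conn.close();
  }
}
{code}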

 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4185) Document HiveServer2 JDBC and Beeline CLI in the user documentation

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603741#comment-13603741
 ] 

Prasad Mujumdar commented on HIVE-4185:
---

JDBC -
Hive includes a new JDBC client driver for HiveServer2. It supports both 
embedded and remote access to HiveServer2.
The JDBC connection URL has the prefix jdbc:hive2:// and the driver class 
is org.apache.hive.jdbc.HiveDriver. Note that this is different from the old 
HiveServer. For a remote server, the URL format is 
jdbc:hive2://host:port/db. For an embedded server, the URL format is 
jdbc:hive2:// (no host or port).
When connecting to HiveServer2 with Kerberos authentication, the URL format is 
jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2. 
The client needs to have a valid Kerberos ticket in the ticket cache before 
connecting. In the case of LDAP or custom pass-through authentication, the client 
needs to pass a valid user name and password to the JDBC connection API.
Beeline Shell -
Hive includes a new command shell, Beeline, that works with HiveServer2. It's a 
JDBC client based on the SQLLine CLI (http://sqlline.sourceforge.net/). 
Detailed documentation of the SQLLine options is available at 
http://sqlline.sourceforge.net/#manual and is applicable to Beeline.

 Document HiveServer2 JDBC and Beeline CLI in the user documentation 
 

 Key: HIVE-4185
 URL: https://issues.apache.org/jira/browse/HIVE-4185
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The user documentation 
 https://cwiki.apache.org/confluence/display/Hive/HiveClient includes 
 information about client connection to thrift server.
 We need to include HiveServer2 client details (JDBC, Beeline)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603754#comment-13603754
 ] 

Kevin Wilfong commented on HIVE-4145:
-

Could you add an entry to eclipse-templates/.classpath as well for hcatalog?

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4148) Cleanup aisle ivy

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603776#comment-13603776
 ] 

Carl Steinbach commented on HIVE-4148:
--

This patch is causing the build to fail on my machine.
I created HIVE-4187 to track this problem and assigned
it to Gunther.

I am able to fix the failure by adding antlr and antlr-runtime
back to ql/ivy.xml.

I think we should consider reverting this patch for the following
reasons:

* It makes maintenance harder since it converts explicit dependencies into 
transitive ones. For example, hive-exec has a direct compile-time dependency on 
the antlr parser generator, but it now relies on a transitive dependency via 
hive-metastore in order to satisfy this. This is also brittle since it means 
that hive-exec will break if the antlr dependency is removed from 
metastore/ivy.xml.
* I don't see any performance improvement with this change in place. I tried 
doing a fresh build several times with and without HIVE-4148, and the version 
without HIVE-4148 often finishes a couple seconds faster. This is pretty much 
what you would expect since Ivy should be using its local cache to resolve most 
of these dependencies.


 Cleanup aisle ivy
 ---

 Key: HIVE-4148
 URL: https://issues.apache.org/jira/browse/HIVE-4148
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.11.0

 Attachments: HIVE-4148.patch


 Lots of duplicated dependencies in the modules' ivy configs. Makes compiling 
 slow and maintenance hard. This patch cleans up these dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: subscribe to your lists

2013-03-15 Thread Alan Gates
Send email to user-subscr...@hive.apache.org and dev-subscr...@hive.apache.org.

Alan.

On Mar 7, 2013, at 7:17 AM, Lin Picouleau wrote:

 Hi,
 
 I would like to subscribe to your lists to get involved with Hive project.
 
 Thank you!
 
 Lin Picouleau



[jira] [Created] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-4188:
---

 Summary: TestJdbcDriver2.testDescribeTable failing consistently
 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong


Running in Linux on a clean checkout after running ant very-clean package, the 
test TestJdbcDriver2.testDescribeTable fails consistently with 

Column name 'under_col' not found expected:<under_col> but was:<# col_name>

junit.framework.ComparisonFailure: Column name 'under_col' not found 
expected:<under_col> but was:<# col_name>
at junit.framework.Assert.assertEquals(Assert.java:81)
at 
org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:154)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar reassigned HIVE-4188:
-

Assignee: Prasad Mujumdar

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603788#comment-13603788
 ] 

Prasad Mujumdar commented on HIVE-4188:
---

[~kevinwilfong] I will take a look.

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603797#comment-13603797
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Thanks Prasad

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-4189:
---

 Summary: ORC fails with String column that ends in lots of nulls
 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


When ORC attempts to write out a string column that ends in enough nulls to 
span an index stride, StringTreeWriter's writeStripe method will get an 
exception from TreeWriter's writeStripe method

Column has wrong number of index entries found: x expected: y

This is caused by rowIndexValueCount having multiple entries equal to the 
number of non-null rows in the column, combined with the fact that 
StringTreeWriter has special logic for constructing its index.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4186) NPE in ReduceSinkDeDuplication

2013-03-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603816#comment-13603816
 ] 

Ashutosh Chauhan commented on HIVE-4186:


[~rhbutani] Can you add a testcase in a .q file and include it in your patch?

 NPE in ReduceSinkDeDuplication
 --

 Key: HIVE-4186
 URL: https://issues.apache.org/jira/browse/HIVE-4186
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Harish Butani
 Attachments: HIVE-4186.1.patch.txt


 When you have a sequence of ReduceSinks on constants you get this error:
 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.optimizer.ReduceSinkDeDuplication$ReduceSinkDeduplicateProcFactory$ReducerReducerProc.getPartitionAndKeyColumnMapping(ReduceSinkDeDuplication.java:416)
 {noformat}
 The example that generates this is:
 {noformat}
 select p_name from (select p_name from part distribute by 1 sort by 1) p 
 distribute by 1 sort by 1
 {noformat}
 Sorry for the contrived example, but this actually happens when we stack 
 windowing clauses (see the PTF-Windowing branch).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603819#comment-13603819
 ] 

Kevin Wilfong commented on HIVE-4189:
-

https://reviews.facebook.net/D9465

 ORC fails with String column that ends in lots of nulls
 ---

 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4189.1.patch.txt


 When ORC attempts to write out a string column that ends in enough nulls to 
 span an index stride, StringTreeWriter's writeStripe method will get an 
 exception from TreeWriter's writeStripe method
 Column has wrong number of index entries found: x expected: y
 This is caused by rowIndexValueCount having multiple entries equal to the 
 number of non-null rows in the column, combined with the fact that 
 StringTreeWriter has special logic for constructing its index.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4189:


Attachment: HIVE-4189.1.patch.txt

 ORC fails with String column that ends in lots of nulls
 ---

 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-4189.1.patch.txt


 When ORC attempts to write out a string column that ends in enough nulls to 
 span an index stride, StringTreeWriter's writeStripe method will get an 
 exception from TreeWriter's writeStripe method
 Column has wrong number of index entries found: x expected: y
 This is caused by rowIndexValueCount having multiple entries equal to the 
 number of non-null rows in the column, combined with the fact that 
 StringTreeWriter has special logic for constructing its index.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603831#comment-13603831
 ] 

Prasad Mujumdar commented on HIVE-4184:
---

[~cwsteinbach] I can add these initial versions to the wiki and update them based 
on the feedback.

 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4070) Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive

2013-03-15 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603853#comment-13603853
 ] 

Gwen Shapira commented on HIVE-4070:


Oracle's LIKE (as well as any other char/varchar comparison) is 
case-sensitive.
No matter how HiveQL behaves, it can't be consistent with every SQL 
implementation out there. 

 Like operator in Hive is case sensitive while in MySQL (and most likely other 
 DBs) it's case insensitive
 

 Key: HIVE-4070
 URL: https://issues.apache.org/jira/browse/HIVE-4070
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Mark Grover
 Fix For: 0.11.0


 Hive's like operator seems to be case sensitive.
 See 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java#L164
 However, MySQL's like operator is case insensitive. I don't have other DB's 
 (like PostgreSQL) installed and handy but I am guessing their LIKE is case 
 insensitive as well.
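
For illustration only (a sketch assuming a hypothetical table t with a string column s; not taken from the JIRA), the difference and a common workaround look like this:

{code}
-- In Hive, LIKE is case sensitive: this matches only values starting with lower-case 'abc'
SELECT s FROM t WHERE s LIKE 'abc%';

-- Case-insensitive matching can be emulated by normalizing both sides
SELECT s FROM t WHERE lower(s) LIKE lower('ABC%');
{code}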

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4162) disable TestBeeLineDriver

2013-03-15 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4162:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Thejas.

 disable TestBeeLineDriver
 -

 Key: HIVE-4162
 URL: https://issues.apache.org/jira/browse/HIVE-4162
 Project: Hive
  Issue Type: Sub-task
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.11.0

 Attachments: HIVE-4162.1.patch


 See HIVE-4161. We should disable the TestBeeLineDriver test cases. In its 
 current state, it was not supposed to be enabled by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4145:
-

Attachment: HIVE-4145.2.patch.txt

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-4145. Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9848/
---

(Updated March 15, 2013, 9:34 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Updated the diff to fix the eclipse template file.


Description
---

This patch creates an hcatalog stub directory. Alan requested this. Once the 
patch is committed I will contact ASFINFRA and request that they grant karma on 
the directory to the hcatalog submodule committers.


This addresses bug HIVE-4145.
https://issues.apache.org/jira/browse/HIVE-4145


Diffs (updated)
-

  build-common.xml cab3942 
  build.properties b96fe60 
  build.xml 9e656d6 
  eclipse-templates/.classpath f5c580b 
  hcatalog/build.xml PRE-CREATION 
  hcatalog/ivy.xml PRE-CREATION 
  hcatalog/src/java/org/apache/hive/hcatalog/package-info.java PRE-CREATION 
  hcatalog/src/test/.gitignore PRE-CREATION 

Diff: https://reviews.apache.org/r/9848/diff/


Testing
---


Thanks,

Carl Steinbach



[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-15 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3958:
---

Attachment: HIVE-3958.patch.1

 support partial scan for analyze command - RCFile
 -

 Key: HIVE-3958
 URL: https://issues.apache.org/jira/browse/HIVE-3958
 Project: Hive
  Issue Type: Improvement
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
 Attachments: HIVE-3958.patch.1


 The analyze command allows us to collect statistics on existing 
 tables/partitions. It works great but might be slow since it scans all files.
 There are two ways to speed it up:
 1. Collect stats without a file scan. It may not collect all stats, but it is good 
 and fast enough for the use case. HIVE-3917 addresses this.
 2. Collect stats via a partial file scan. It doesn't scan the full content of files, 
 only part of it, to get file metadata. Some examples are 
 https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
 and HFile of HBase.
 This jira targets #2, more specifically the RCFile format.
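
For context, a minimal sketch of the existing analyze syntax (assuming a hypothetical partitioned table page_view; the partial-scan variant this jira proposes is not shown because its syntax is not specified here):

{code}
-- Full scan: reads all files in the partition to compute statistics
ANALYZE TABLE page_view PARTITION (ds='2013-03-15') COMPUTE STATISTICS;

-- No scan (HIVE-3917): collects only the stats available without reading file contents
ANALYZE TABLE page_view PARTITION (ds='2013-03-15') COMPUTE STATISTICS NOSCAN;
{code}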

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile

2013-03-15 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3958:
---

Status: Patch Available  (was: In Progress)

 support partial scan for analyze command - RCFile
 -

 Key: HIVE-3958
 URL: https://issues.apache.org/jira/browse/HIVE-3958
 Project: Hive
  Issue Type: Improvement
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
 Attachments: HIVE-3958.patch.1


 The analyze command allows us to collect statistics on existing 
 tables/partitions. It works great but might be slow since it scans all files.
 There are two ways to speed it up:
 1. Collect stats without a file scan. It may not collect all stats, but it is good 
 and fast enough for the use case. HIVE-3917 addresses this.
 2. Collect stats via a partial file scan. It doesn't scan the full content of files, 
 only part of it, to get file metadata. Some examples are 
 https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), 
 and HFile of HBase.
 This jira targets #2, more specifically the RCFile format.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-4145. Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9848/
---

(Updated March 15, 2013, 9:47 p.m.)


Review request for hive and Ashutosh Chauhan.


Description
---

This patch creates an hcatalog stub directory. Alan requested this. Once the 
patch is committed I will contact ASFINFRA and request that they grant karma on 
the directory to the hcatalog submodule committers.


This addresses bug HIVE-4145.
https://issues.apache.org/jira/browse/HIVE-4145


Diffs (updated)
-

  .gitignore e5383d4 
  build-common.xml cab3942 
  build.properties b96fe60 
  build.xml 9e656d6 
  common/src/gen/org/apache/hive/common/package-info.java dfba75d 
  eclipse-templates/.classpath f5c580b 
  hcatalog/build.xml PRE-CREATION 
  hcatalog/ivy.xml PRE-CREATION 
  hcatalog/src/java/org/apache/hive/hcatalog/package-info.java PRE-CREATION 
  hcatalog/src/test/.gitignore PRE-CREATION 

Diff: https://reviews.apache.org/r/9848/diff/


Testing
---


Thanks,

Carl Steinbach



[jira] [Updated] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4145:
-

Attachment: HIVE-4145.3.patch.txt

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603890#comment-13603890
 ] 

Carl Steinbach commented on HIVE-4145:
--

@kevinwilfong: I updated the diff. Please take a look. Thanks.

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4190) OVER clauses with ORDER BY not getting windowing set properly

2013-03-15 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4190:


 Summary: OVER clauses with ORDER BY not getting windowing set 
properly
 Key: HIVE-4190
 URL: https://issues.apache.org/jira/browse/HIVE-4190
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates


Given a query like:

select s, avg(f) over (partition by si order by d) from over100k;

Hive is not setting the window frame properly.  The order by creates an 
implicit window frame of 'unbounded preceding' but Hive is treating the above 
query as if it has no window.
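
For illustration (a sketch, not from the JIRA, glossing over the rows vs. range semantics of the default frame), the query above should behave like the explicit form:

{code}
select s, avg(f) over (partition by si order by d
                       rows between unbounded preceding and current row)
from over100k;
{code}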

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603905#comment-13603905
 ] 

Kevin Wilfong commented on HIVE-4145:
-

+1 thanks for fixing the other Eclipse issues

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4191) describe table output always prints as if formatted keyword is specified

2013-03-15 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-4191:
---

 Summary: describe table output always prints as if formatted 
keyword is specified
 Key: HIVE-4191
 URL: https://issues.apache.org/jira/browse/HIVE-4191
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2
Affects Versions: 0.10.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair


With the change in HIVE-3140, describe table output prints in the format 
expected from describe *formatted* table, i.e., the headers are included and 
the fields are padded with spaces. 
This is a non-backward-compatible change; we should discuss whether this change in 
the output formatting should remain. 
This has an impact on HiveServer2, which has been relying on the old format; with 
this change it prints additional headers and fields with space padding.
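
For reference, a minimal sketch of the two commands whose output formats are being conflated (assuming any table t):

{code}
-- Expected to print the plain, unpadded output that Hive 0.9 produced
DESCRIBE t;

-- Expected to print the header row and space-padded columns
DESCRIBE FORMATTED t;
{code}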


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4191) describe table output always prints as if formatted keyword is specified

2013-03-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4191:


Attachment: HIVE-4191.1.patch

HIVE-4191.1.patch - a patch that changes the format of the default describe table 
command back to the Hive 0.9 behavior.

 describe table output always prints as if formatted keyword is specified
 

 Key: HIVE-4191
 URL: https://issues.apache.org/jira/browse/HIVE-4191
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2
Affects Versions: 0.10.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4191.1.patch


 With the change in HIVE-3140, describe table output prints in the format 
 expected from describe *formatted* table, i.e., the headers are included and 
 the fields are padded with spaces. 
 This is a non-backward-compatible change; we should discuss whether this change in 
 the output formatting should remain. 
 This has an impact on HiveServer2, which has been relying on the old format; 
 with this change it prints additional headers and fields with space padding.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Travis Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603941#comment-13603941
 ] 

Travis Crawford commented on HIVE-4145:
---

Hey [~cwsteinbach] - this looks good to create the initial directory. There 
will likely be some integration-related build changes but this will let us get 
started. +1

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603970#comment-13603970
 ] 

Gunther Hagleitner commented on HIVE-4179:
--

Query:

{noformat}
insert overwrite table outputTbl1
SELECT a.key, concat(a.values, a.values), concat(a.values, a.values)
FROM (
  SELECT key, count(1) as values from inputTbl1 group by key
  UNION ALL
  SELECT key, count(1) as values from inputTbl1 group by key
) a;
{noformat}

Before:

{noformat}
  outputColumnNames: _col0, _col1
  Select Operator
expressions:
  expr: _col0
  type: string
  expr: UDFToLong(_col1)
  type: bigint
  expr: UDFToLong(_col2)
  type: bigint
outputColumnNames: _col0, _col1, _col2
{noformat}

After:

{noformat}
  outputColumnNames: _col0, _col1
  Select Operator
expressions:
  expr: _col0
  type: string
  expr: UDFToLong(concat(_col1, _col1))
  type: bigint
  expr: UDFToLong(concat(_col1, _col1))
  type: bigint
outputColumnNames: _col0, _col1, _col2
{noformat}

 NonBlockingOpDeDup does not merge SEL operators correctly
 -

 Key: HIVE-4179
 URL: https://issues.apache.org/jira/browse/HIVE-4179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch


 The input columns list for SEL operations isn't merged properly in the 
 optimization. The best way to see this is running union_remove_22.q with 
 -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
 column.
 Note: union_remove tests do not run on hadoop 1 or 0.20.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-15 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4179:
-

Status: Patch Available  (was: Open)

 NonBlockingOpDeDup does not merge SEL operators correctly
 -

 Key: HIVE-4179
 URL: https://issues.apache.org/jira/browse/HIVE-4179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch


 The input columns list for SEL operations isn't merged properly in the 
 optimization. The best way to see this is running union_remove_22.q with 
 -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
 column.
 Note: union_remove tests do not run on hadoop 1 or 0.20.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3861) Upgrade hbase dependency to 0.94.2

2013-03-15 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603993#comment-13603993
 ] 

Sushanth Sowmyan commented on HIVE-3861:


I think this patch needs to be updated; it's not applying cleanly for me on 
trunk currently.

That said, one more fix is required for this patch: it brings in yammer, which 
pulls in a second version of slf4j. That leaves two slf4j jars in the hive/lib 
directory, which in turn causes other commands, such as pig --useHCatalog, to 
fail as an integration point.

It needs the yammer dependency in hbase-handler/ivy.xml additionally patched as 
follows:

{noformat}
-    <dependency org="com.yammer.metrics" name="metrics-core"
-      rev="${metrics-core.version}"/>
+    <dependency org="com.yammer.metrics" name="metrics-core"
+      rev="${metrics-core.version}">
+      <exclude org="org.slf4j" module="slf4j-api"/><!--causes a dual slf4j presence otherwise-->
+    </dependency>
{noformat}


 Upgrade hbase dependency to 0.94.2
 --

 Key: HIVE-3861
 URL: https://issues.apache.org/jira/browse/HIVE-3861
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-3861.patch


 Hive tests fail to run against hbase v0.94.2. Proposing to upgrade the 
 dependency and change the test setup to properly work with the newer version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4188:
--

Component/s: HiveServer2

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604005#comment-13604005
 ] 

Prasad Mujumdar commented on HIVE-4188:
---

Review request at https://reviews.facebook.net/D9477

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604006#comment-13604006
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Could you attach the patch to the JIRA and mark it Patch Available if it's 
ready for review?

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar

 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604007#comment-13604007
 ] 

Gunther Hagleitner commented on HIVE-4179:
--

https://reviews.facebook.net/D9471

 NonBlockingOpDeDup does not merge SEL operators correctly
 -

 Key: HIVE-4179
 URL: https://issues.apache.org/jira/browse/HIVE-4179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch


 The input columns list for SEL operations isn't merged properly in the 
 optimization. The best way to see this is running union_remove_22.q with 
 -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one 
 column.
 Note: union_remove tests do not run on hadoop 1 or 0.20.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604008#comment-13604008
 ] 

Carl Steinbach commented on HIVE-4184:
--

@prasadm please go ahead and add this to the wiki. What you have looks good to 
me. One thing I'd like to add is that HiveServer2 was intentionally designed to 
be independent of the RPC layer. Currently we only support Thrift, but in the 
future we should also be able to support Avro or Protobufs without having to 
modify the upper layers of the system.

 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604008#comment-13604008
 ] 

Carl Steinbach edited comment on HIVE-4184 at 3/16/13 12:03 AM:


[~prasadm] please go ahead and add this to the wiki. What you have looks good 
to me. One thing I'd like to add is that HiveServer2 was intentionally designed 
to be independent of the RPC layer. Currently we only support Thrift, but in 
the future we should also be able to support Avro or Protobufs without having 
to modify the upper layers of the system.

  was (Author: cwsteinbach):
@prasadm please go ahead and add this to the wiki. What you have looks good 
to me. One thing I'd like to add is that HiveServer2 was intentionally designed 
to be independent of the RPC layer. Currently we only support Thrift, but in 
the future we should also be able to support Avro or Protobufs without having 
to modify the upper layers of the system.
  
 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4184) Document HiveServer2 setup under the admin documentation on hive wiki

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604020#comment-13604020
 ] 

Carl Steinbach commented on HIVE-4184:
--

One more thing: the hiveserver2 launch script doesn't have a .sh suffix.

 Document HiveServer2 setup under the admin documentation on hive wiki 
 --

 Key: HIVE-4184
 URL: https://issues.apache.org/jira/browse/HIVE-4184
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar

 The setup/configuration instructions for HiveServer are available on  
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+SettingUpHiveServer
 We should include similar details for HiveServer2 configuration

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4192) Use of LEAD in an OVER clauses causes the query to fail

2013-03-15 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4192:


 Summary: Use of LEAD in an OVER clauses causes the query to fail
 Key: HIVE-4192
 URL: https://issues.apache.org/jira/browse/HIVE-4192
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates


Running a query like:

{code}
select i, lead(s) over (partition by bin order by d desc 
rows between current row and 1 following) 
from over100k;
{code}

gives an error:

{code}
FAILED: SemanticException Function lead((TOK_TABLE_OR_COL s)) 
org.apache.hadoop.hive.ql.parse.WindowingSpec$WindowSpec@13e15f7 as _wcol0 
doesn't support windowing
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4145:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.11.0

 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-4145:
-

Component/s: HCatalog

 Create hcatalog stub directory and add it to the build
 --

 Key: HIVE-4145
 URL: https://issues.apache.org/jira/browse/HIVE-4145
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure, HCatalog
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.11.0

 Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
 HIVE-4145.3.patch.txt


 Alan has requested that we create a directory for hcatalog and give the 
 HCatalog submodule committers karma on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4193) OVER clauses with BETWEEN in the window definition produce wrong results

2013-03-15 Thread Alan Gates (JIRA)
Alan Gates created HIVE-4193:


 Summary: OVER clauses with BETWEEN in the window definition 
produce wrong results
 Key: HIVE-4193
 URL: https://issues.apache.org/jira/browse/HIVE-4193
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.11.0
Reporter: Alan Gates


Window queries that define a windowing clause with a termination row often 
(though not always) return incorrect results.  For example, from our test queries 
all of the following return incorrect results:

{code}
select s, sum(f) over (partition by t order by b 
   rows between current row and unbounded following) 
from over100k;

select s, avg(f) over (partition by b order by d 
   rows between 5 preceding and current row) 
from over100k;

select s, avg(f) over (partition by bin order by s 
   rows between current row and 5 following) 
from over100k;

select s, avg(d) over (partition by i order by f desc 
   rows between 5 preceding and 5 following) 
from over100k;
{code}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL

2013-03-15 Thread Richard Ding (JIRA)
Richard Ding created HIVE-4194:
--

 Summary: JDBC2: HiveDriver should not throw RuntimeException when 
passed an invalid URL
 Key: HIVE-4194
 URL: https://issues.apache.org/jira/browse/HIVE-4194
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.11.0
Reporter: Richard Ding
Assignee: Richard Ding


As per the JDBC 3.0 Spec (section 9.2):
"If the Driver implementation understands the URL, it will return a 
Connection object; otherwise it returns null."

Currently the HiveConnection constructor throws IllegalArgumentException if the url 
string doesn't start with "jdbc:hive2". This exception should be caught by 
HiveDriver.connect, which should then return null.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4195) Avro SerDe causes incorrect behavior in unrelated tables

2013-03-15 Thread Skye Wanderman-Milne (JIRA)
Skye Wanderman-Milne created HIVE-4195:
--

 Summary: Avro SerDe causes incorrect behavior in unrelated tables
 Key: HIVE-4195
 URL: https://issues.apache.org/jira/browse/HIVE-4195
 Project: Hive
  Issue Type: Bug
Reporter: Skye Wanderman-Milne


When I run a file that first creates an Avro table using the Avro SerDe, then 
immediately creates an LZO text table and inserts data into the LZO table, the 
resulting LZO table contains Avro data files. When I remove the Avro CREATE 
TABLE statement, the LZO table contains .lzo files as expected.

{noformat}
DROP TABLE IF EXISTS avro_table;
CREATE EXTERNAL TABLE avro_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal' = '{
"namespace": "testing.hive.avro.serde",
"name": "test_record",
"type": "record",
"fields": [
{"name": "int1", "type": "long"},
{"name": "string1", "type": "string"}
]
}');

DROP TABLE IF EXISTS lzo_table;
CREATE EXTERNAL TABLE lzo_table (
id int,
bool_col boolean,
tinyint_col tinyint,
smallint_col smallint,
int_col int,
bigint_col bigint,
float_col float,
double_col double,
date_string_col string,
string_col string,
timestamp_col timestamp)
STORED AS 
INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
;

SET hive.exec.compress.output=true;
SET mapred.output.compression.type=BLOCK;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.dynamic.partition=true;
SET mapred.max.split.size=25600;
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
insert overwrite table lzo_table SELECT id, bool_col, tinyint_col, 
smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, 
string_col, timestamp_col FROM src_table;
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4067) Followup to HIVE-701: reduce ambiguity in grammar

2013-03-15 Thread Samuel Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604068#comment-13604068
 ] 

Samuel Yuan commented on HIVE-4067:
---

The primary motivation was that it was tricky to add new keywords, which are 
almost always reserved words by default in Hive, because doing so could easily 
break existing queries. The changes for HIVE-701 make it easy to add 
non-reserved keywords in the future. HIVE-701 also removes the reserved status 
of most keywords, to prevent recently introduced keywords from breaking queries.

I can undo the changes for the Hive keywords which are reserved in SQL 2003, 
but would there be any reason to do so besides adhering to the standard, given 
that the grammar can support leaving them non-reserved?
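
As an illustrative sketch only (assuming a pre-existing table t with a column named preserve; not part of the JIRA), reserving a word breaks old queries that use it as an identifier unless the identifier is backtick-quoted:

{code}
-- Fails once PRESERVE is a reserved keyword
SELECT preserve FROM t;

-- Still parses, because backticks quote the identifier
SELECT `preserve` FROM t;
{code}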

 Followup to HIVE-701: reduce ambiguity in grammar
 -

 Key: HIVE-4067
 URL: https://issues.apache.org/jira/browse/HIVE-4067
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4067.D8883.1.patch


 After HIVE-701 the grammar has become much more ambiguous, and the 
 compilation generates a large number of warnings. Making FROM, DISTINCT, 
 PRESERVE, COLUMN, ALL, AND, OR, and NOT reserved keywords again reduces the 
 number of warnings to 134, up from the original 81 warnings but down from the 
 565 after HIVE-701. Most of the remaining ambiguity is trivial, an example 
 being KW_ELEM_TYPE | KW_KEY_TYPE | KW_VALUE_TYPE | identifier, and they are 
 all correctly handled by ANTLR.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604070#comment-13604070
 ] 

Gunther Hagleitner commented on HIVE-4187:
--

Looking at this right now. Not sure how to reproduce. Can you share how you 
build? Is it just ant clean package?

 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4188:
--

Attachment: HIVE-4188-1.patch

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar
 Attachments: HIVE-4188-1.patch


 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4188:
--

Status: Patch Available  (was: Open)

Patch attached

 TestJdbcDriver2.testDescribeTable failing consistently
 --

 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Prasad Mujumdar
 Attachments: HIVE-4188-1.patch


 Running in Linux on a clean checkout after running ant very-clean package, 
 the test TestJdbcDriver2.testDescribeTable fails consistently with 
 Column name 'under_col' not found expected:<under_col> but was:<# col_name>
 junit.framework.ComparisonFailure: Column name 'under_col' not found 
 expected:<under_col> but was:<# col_name>
 at junit.framework.Assert.assertEquals(Assert.java:81)
 at 
 org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:154)
 at junit.framework.TestCase.runBare(TestCase.java:127)
 at junit.framework.TestResult$1.protect(TestResult.java:106)
 at junit.framework.TestResult.runProtected(TestResult.java:124)
 at junit.framework.TestResult.run(TestResult.java:109)
 at junit.framework.TestCase.run(TestCase.java:118)
 at junit.framework.TestSuite.runTest(TestSuite.java:208)
 at junit.framework.TestSuite.run(TestSuite.java:203)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
 at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
 at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4194:
--

Component/s: HiveServer2

 JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
 --

 Key: HIVE-4194
 URL: https://issues.apache.org/jira/browse/HIVE-4194
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.11.0
Reporter: Richard Ding
Assignee: Richard Ding

 As per the JDBC 3.0 Spec (section 9.2):
 "If the Driver implementation understands the URL, it will return a 
 Connection object; otherwise it returns null."
 Currently the HiveConnection constructor throws IllegalArgumentException if 
 the url string doesn't start with "jdbc:hive2". This exception should be caught 
 by HiveDriver.connect, which should then return null.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604085#comment-13604085
 ] 

Gunther Hagleitner commented on HIVE-4187:
--

Downgraded my ant version to 1.8.1. Cleared all caches and built. Still don't 
see the problem.

 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4172) JDBC2 does not support VOID type

2013-03-15 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4172:
--

Component/s: HiveServer2

 JDBC2 does not support VOID type
 

 Key: HIVE-4172
 URL: https://issues.apache.org/jira/browse/HIVE-4172
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: HiveServer2

 In beeline, "select key, null from src" fails with an exception:
 {noformat}
 org.apache.hive.service.cli.HiveSQLException: Error running query: 
 java.lang.NullPointerException
   at 
 org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:112)
   at 
 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:166)
   at 
 org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
   at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:183)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   at 
 org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:39)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}
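
As a hedged aside (not part of the patch): until the VOID type is supported, a common workaround is to give the NULL literal a concrete type with a cast:

{code}
select key, cast(null as string) from src;
{code}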

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4139) MiniDFS shim does not work for hadoop 2

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604090#comment-13604090
 ] 

Gunther Hagleitner commented on HIVE-4139:
--

- Was there a reason to add this flag -XX:+CMSClassUnloadingEnabled

When running the unit tests against the hadoop 2 minidfs/minimr stuff, you run 
out of PermGen space. This setting (as well as the one that increases the PermGen 
space for junit) fixes the issue.

- Shim for DFS: There's a change in MiniDFS that changes the return type of one 
of the public methods (getFileSystem) to a subtype. API compatible, but not 
binary compatible. By creating the shim, each version will be compiled against 
the right version of hadoop.

- Shim for MiniMRCluster: There was some really ugly code in QTestUtil that 
used exceptions to distinguish between hadoop 1/2 and initialized the 
MiniMRCluster accordingly. By creating shims for these you can just init that 
stuff differently for each hadoop version (the apis are the same, but the 
behavior and configs have changed).

- Yes, these dependencies are needed. The build system won't pick them up 
otherwise.

 MiniDFS shim does not work for hadoop 2
 ---

 Key: HIVE-4139
 URL: https://issues.apache.org/jira/browse/HIVE-4139
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4139.1.patch, HIVE-4139.2.patch, HIVE-4139.3.patch


 There's an incompatibility between hadoop 1 & 2 wrt the MiniDfsCluster 
 class. That causes the hadoop 2 line Minimr tests to fail with a 
 MethodNotFound exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604096#comment-13604096
 ] 

Carl Steinbach commented on HIVE-4187:
--

Yes, I use 'ant clean package'. 

 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604098#comment-13604098
 ] 

Carl Steinbach commented on HIVE-4187:
--

As far as I can tell the changes that were made in HIVE-4148
don't make the build any faster. I also think that relying on
Ivy's transitive dependency resolution mechanism to resolve
direct dependencies is a really bad idea, and actually makes
the build harder to maintain. Please let me know if you have
evidence to the contrary. Otherwise I would appreciate it if
you would resolve this problem by reverting HIVE-4148.

 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4196) Support for Streaming Partitions in Hive

2013-03-15 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-4196:
-

 Summary: Support for Streaming Partitions in Hive
 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik


Motivation: Allow Hive users to immediately query data streaming in through 
clients such as Flume.


Currently Hive partitions must be created after all the data for the partition 
is available. Thereafter, data in the partitions is considered immutable. 

This proposal introduces the notion of a streaming partition into which new 
files can be committed periodically and made available for queries before the 
partition is closed and converted into a standard partition.

The admin enables a streaming partition on a table using DDL. He provides the 
following pieces of information (a hypothetical sketch of such a statement follows below):
- Name of the partition in the table on which streaming is enabled
- Frequency at which the streaming partition should be closed and converted 
into a standard partition.

Tables with a streaming partition enabled will be partitioned by one and only one 
column. It is assumed that this column will contain a timestamp.
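
For concreteness only, a hypothetical sketch of what issuing such a DDL 
statement could look like from a Java client. The syntax, table name and column 
name below are invented for illustration and are not part of this proposal or 
of Hive today; a HiveServer2 JDBC endpoint on localhost:10000 is assumed.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class EnableStreamingPartitionExample {
  public static void main(String[] args) throws SQLException {
    Connection conn =
        DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
    try {
      Statement stmt = conn.createStatement();
      // Invented syntax: stream on the timestamp column 'event_ts' and roll
      // (close and convert) the streaming partition every 15 minutes.
      stmt.execute("ALTER TABLE web_logs ENABLE STREAMING PARTITION (event_ts) "
          + "ROLL EVERY 15 MINUTES");
      stmt.close();
    } finally {
      conn.close();
    }
  }
}
{code}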

Closing the current streaming partition converts it into a standard partition. 
Based on the specified frequency, the current streaming partition is closed 
and a new one is created for future writes. This is referred to as 'rolling the 
partition'.


A streaming partition's life cycle is as follows:

 - A new streaming partition is instantiated for writes

 - Streaming clients request (via webhcat) an HDFS file name into which they 
can write a chunk of records for a specific table.

 - Streaming clients write a chunk (via webhdfs) to that file and commit it (via 
webhcat). Committing merely indicates that the chunk has been written 
completely and is ready for serving queries.

 - When the partition is rolled, all committed chunks are swept into a single 
directory and a standard partition pointing to that directory is created. The 
streaming partition is closed and a new streaming partition is created. Rolling 
the partition is atomic. Streaming clients are agnostic of partition rolling.

 - Hive queries will be able to query the partition that is currently open for 
streaming. Only committed chunks will be visible. Read consistency will be 
ensured so that repeated reads of the same partition are idempotent for the 
lifespan of the query.
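
A hypothetical sketch of the webhcat side of this flow from a streaming 
client's point of view is given below. The class name, endpoint paths 
(/streaming/chunk, /streaming/commit) and parameters are invented, since the 
proposal does not define an API yet, and the actual chunk write through 
webhdfs is elided.

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class StreamingChunkClient {
  // e.g. "http://webhcat-host:50111/templeton/v1" (assumed base URL)
  private final String webhcatBase;

  public StreamingChunkClient(String webhcatBase) {
    this.webhcatBase = webhcatBase;
  }

  /** Ask webhcat for an HDFS file name to write the next chunk of records into. */
  public String requestChunkPath(String table) throws IOException {
    return call("POST", "/streaming/chunk?table=" + table);
  }

  /** Mark the chunk as completely written so it becomes visible to queries. */
  public void commitChunk(String table, String chunkPath) throws IOException {
    call("POST", "/streaming/commit?table=" + table + "&chunk=" + chunkPath);
  }

  // Minimal HTTP helper; real code would handle errors, auth and encoding.
  private String call(String method, String pathAndQuery) throws IOException {
    HttpURLConnection conn =
        (HttpURLConnection) new URL(webhcatBase + pathAndQuery).openConnection();
    conn.setRequestMethod(method);
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
    try {
      StringBuilder body = new StringBuilder();
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line);
      }
      return body.toString();
    } finally {
      in.close();
      conn.disconnect();
    }
  }
}
{code}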



Partition rolling requires an active agent/thread running to check when it is 
time to roll and to trigger the roll. This could be achieved either by using an 
external agent such as Oozie (preferably) or by an internal agent.
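
A minimal sketch of the internal-agent option is shown below, assuming a 
hypothetical rollPartition(table) hook that performs the atomic close-and-create 
described above; an external agent such as Oozie would simply invoke the same 
hook on its own schedule.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PartitionRoller {

  /** Hypothetical hook: atomically close the current streaming partition and open a new one. */
  public interface RollAction {
    void rollPartition(String table);
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /** Trigger a roll of the given table's streaming partition at the configured frequency. */
  public void start(final String table, final RollAction action, long frequencyMinutes) {
    scheduler.scheduleAtFixedRate(new Runnable() {
      public void run() {
        action.rollPartition(table);
      }
    }, frequencyMinutes, frequencyMinutes, TimeUnit.MINUTES);
  }

  public void stop() {
    scheduler.shutdown();
  }
}
{code}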

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4187) QL build-grammar target fails after HIVE-4148

2013-03-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604118#comment-13604118
 ] 

Gunther Hagleitner commented on HIVE-4187:
--

I saw your comments on HIVE-4148 as well. Yes, I see your point, but I do think 
that having dependencies specified in fewer places leads to something that's 
easier to maintain.

My thinking was:

- I believe the modules aren't/can't be built in isolation anyway, so why 
duplicate all the deps?
- Nothing seems to enforce that each module has a complete set of deps, so it's 
best effort at best.
- I see duplicated hacks and broken pom file stuff in the ivy scripts that 
I'd rather have in one place. 

The build speed is secondary, but I also don't understand how your system is 
faster with more dependencies to resolve than with fewer.

I am not married to HIVE-4148 either, although if the decision is to enforce 
that each module specifies all its deps directly, I'd like to go over the patch 
again and see what can and can't be removed rather than just reverting.

Having said that, I was simply jumping on this one first, because a broken 
build seems more urgent and I want to fix that right away. However, in order to 
do that I would like to know how your build is broken. Ivy should pick up the 
right version, the build machine doesn't have the problem and I can't reproduce 
it. Do you have any pointers?

 QL build-grammar target fails after HIVE-4148
 -

 Key: HIVE-4187
 URL: https://issues.apache.org/jira/browse/HIVE-4187
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Gunther Hagleitner
Priority: Critical



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira