[jira] [Created] (HIVE-12221) Concurrency issue in HCatUtil.getHiveMetastoreClient()

2015-10-21 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-12221:
--

 Summary: Concurrency issue in HCatUtil.getHiveMetastoreClient() 
 Key: HIVE-12221
 URL: https://issues.apache.org/jira/browse/HIVE-12221
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik


HCatUtil.getHiveMetastoreClient() uses the double-checked locking pattern
to implement a singleton, which is a broken pattern.
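
For illustration only, here is a minimal sketch of two commonly used safe
alternatives to plain double-checked locking in Java: the
initialization-on-demand holder idiom, and double-checked locking with a
volatile field. The Client class and both accessor names are hypothetical
stand-ins, not the actual HCatUtil code or its eventual fix.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazySingletonSketch {
    /** Hypothetical stand-in for an expensive-to-create client object. */
    static final class Client {
        static final AtomicInteger CREATED = new AtomicInteger();
        Client() { CREATED.incrementAndGet(); }
    }

    // Option 1: initialization-on-demand holder. The JVM initializes Holder
    // exactly once, on first access, with the required synchronization.
    private static final class Holder {
        static final Client INSTANCE = new Client();
    }

    static Client viaHolder() {
        return Holder.INSTANCE;
    }

    // Option 2: double-checked locking made safe by declaring the field
    // volatile (Java 5+ memory model), so a reader never observes a
    // partially constructed object.
    private static volatile Client instance;

    static Client viaVolatileDcl() {
        Client local = instance;               // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingletonSketch.class) {
                local = instance;              // re-check while holding the lock
                if (local == null) {
                    local = new Client();
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> { viaHolder(); viaVolatileDcl(); });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("clients created: " + Client.CREATED.get());
    }
}
```

The holder idiom is usually the simpler choice when construction takes no
arguments; the volatile variant fits better when the instance may need to be
replaced or reset later.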



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12003) Hive Streaming API : Add check to ensure table is transactional

2015-09-30 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-12003:
--

 Summary: Hive Streaming API : Add check to ensure table is 
transactional
 Key: HIVE-12003
 URL: https://issues.apache.org/jira/browse/HIVE-12003
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: Roshan Naik
Assignee: Roshan Naik


Check that TBLPROPERTIES ('transactional'='true') is set when opening a connection.
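
A rough sketch of what such a check could look like, assuming the table
properties have already been fetched into a plain map. The class and method
names are hypothetical, and the case-insensitive comparison is an assumption,
not something stated in this issue.

```java
import java.util.HashMap;
import java.util.Map;

public class TransactionalCheckSketch {
    /** Returns true only if the 'transactional' table property is set to "true". */
    static boolean isTransactional(Map<String, String> tableParams) {
        String value = tableParams.get("transactional");
        // Assumption: accept "true" in any letter case; a missing key means not transactional.
        return value != null && value.equalsIgnoreCase("true");
    }

    /** Fails fast, as a connection-opening path might, when the table is not transactional. */
    static void ensureTransactional(String tableName, Map<String, String> tableParams) {
        if (!isTransactional(tableParams)) {
            throw new IllegalStateException(
                "Table " + tableName + " is not transactional; set TBLPROPERTIES ('transactional'='true')");
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("transactional", "true");
        ensureTransactional("default.alerts", params);   // passes silently
        System.out.println("transactional: " + isTransactional(params));
    }
}
```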





[jira] [Created] (HIVE-11983) Hive streaming API's uses incorrect logic to assign buckets to incoming records

2015-09-28 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-11983:
--

 Summary: Hive streaming API's uses incorrect logic to assign 
buckets to incoming records
 Key: HIVE-11983
 URL: https://issues.apache.org/jira/browse/HIVE-11983
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.1
Reporter: Roshan Naik
Assignee: Roshan Naik


The Streaming API tries to distribute records evenly into buckets. 
All records in every Transaction that is part of a TransactionBatch go to the 
same bucket, and a new bucket number is chosen for each TransactionBatch.

Fix: The API needs to hash each record to determine which bucket it belongs to. 
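
A minimal sketch of per-record bucket assignment, assuming the values of the
bucketing columns have already been extracted from the record. The class and
method names are hypothetical, and Arrays.hashCode is only a stand-in: a real
fix would presumably have to use the same hash function Hive applies to
bucketed tables, otherwise readers would look for rows in the wrong bucket.

```java
import java.util.Arrays;

public class BucketAssignmentSketch {
    /**
     * Maps one record to a bucket from the values of its bucketing columns.
     * The hash function here is a placeholder, not Hive's own.
     */
    static int bucketFor(Object[] bucketColumnValues, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        int hash = Arrays.hashCode(bucketColumnValues);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        String[] userIds = {"user-1", "user-2", "user-3", "user-1"};
        for (String id : userIds) {
            // The same key always lands in the same bucket, regardless of
            // which transaction or batch carries the record.
            System.out.println(id + " -> bucket " + bucketFor(new Object[] {id}, numBuckets));
        }
    }
}
```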





[jira] [Commented] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-30 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189697#comment-14189697
 ] 

Roshan Naik commented on HIVE-8629:
---

I added the doc to  
https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8629.patch, HIVE-8629.v2.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Commented] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-30 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189705#comment-14189705
 ] 

Roshan Naik commented on HIVE-8629:
---

before should not be there.. just deleted it. thanks for catching it

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.14.0

 Attachments: HIVE-8629.patch, HIVE-8629.v2.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Commented] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-28 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187329#comment-14187329
 ] 

Roshan Naik commented on HIVE-8629:
---

[~alangates] Without setugi, the directories created by the metastore 
during add partition etc. are created as the hive user instead of the client 
user of the metastore process, consequently leading to incorrect permissions 
and later failure to stream to those directories.

WRT Log.info(): Since this is done each time a new connection is created, 
which occurs multiple times over the duration of a long-running streaming 
process, I am wondering whether we should document this instead of calling 
log.info(), to reduce noise in the log output. Either way I am fine. Let me 
know what you think.

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Attachments: HIVE-8629.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Updated] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-28 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8629:
--
Attachment: HIVE-8629.v2.patch

Revised patch incorporating Alan's comments

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Attachments: HIVE-8629.patch, HIVE-8629.v2.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Updated] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8629:
--
Summary: Streaming / ACID : hive cli session creation takes too long and 
times out if execution engine is tez  (was: Streaming / ACID : hive cli session 
creation takes too long and times out of execution engine is tez)

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming

 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Created] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out of execution engine is tez

2014-10-27 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-8629:
-

 Summary: Streaming / ACID : hive cli session creation takes too 
long and times out of execution engine is tez
 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik


When creating a hive session to run basic alter table create partition  
queries, the session creation takes too long (more than 5 sec)  if the hive 
execution engine is set to tez.

Since the streaming clients don't care about Tez, they can explicitly override 
the setting to mr.
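
As a rough sketch of the proposed override, assuming the session configuration
can be modeled as a java.util.Properties object: the streaming client copies
the caller's configuration and forces the execution engine to mr for its own
session only. The class and method names are hypothetical;
hive.execution.engine is the Hive property that selects the engine.

```java
import java.util.Properties;

public class EngineOverrideSketch {
    static final String ENGINE_KEY = "hive.execution.engine";

    /**
     * Returns a copy of the given configuration with the execution engine
     * forced to "mr". The caller's configuration is left untouched, so other
     * sessions built from it can keep using tez.
     */
    static Properties forStreamingSession(Properties base) {
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty(ENGINE_KEY, "mr");
        return copy;
    }

    public static void main(String[] args) {
        Properties clusterConf = new Properties();
        clusterConf.setProperty(ENGINE_KEY, "tez");

        Properties streamingConf = forStreamingSession(clusterConf);
        System.out.println("cluster engine:   " + clusterConf.getProperty(ENGINE_KEY));
        System.out.println("streaming engine: " + streamingConf.getProperty(ENGINE_KEY));
    }
}
```

Copying instead of mutating keeps the override local to the streaming
client's short-lived DDL session.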





[jira] [Updated] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8629:
--
Attachment: HIVE-8629.patch

Uploading patch

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Attachments: HIVE-8629.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Updated] (HIVE-8629) Streaming / ACID : hive cli session creation takes too long and times out if execution engine is tez

2014-10-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8629:
--
Status: Patch Available  (was: Open)

 Streaming / ACID : hive cli session creation takes too long and times out if 
 execution engine is tez
 

 Key: HIVE-8629
 URL: https://issues.apache.org/jira/browse/HIVE-8629
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Attachments: HIVE-8629.patch


 When creating a hive session to run basic alter table create partition  
 queries, the session creation takes too long (more than 5 sec)  if the hive 
 execution engine is set to tez.
 Since the streaming clients don't care about Tez, they can explicitly override 
 the setting to mr.





[jira] [Created] (HIVE-8476) JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with Kerberos

2014-10-15 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-8476:
-

 Summary: JavaDoc updates to HiveEndPoint.newConnection() for 
secure streaming with Kerberos
 Key: HIVE-8476
 URL: https://issues.apache.org/jira/browse/HIVE-8476
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0


Add additional notes on using kerberos authenticated streaming connection in 
HiveEndPoint.newConnection() method





[jira] [Updated] (HIVE-8476) JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with Kerberos

2014-10-15 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8476:
--
Status: Patch Available  (was: Open)

 JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with 
 Kerberos
 --

 Key: HIVE-8476
 URL: https://issues.apache.org/jira/browse/HIVE-8476
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0

 Attachments: HIVE-8476.patch


 Add additional notes on using kerberos authenticated streaming connection in 
 HiveEndPoint.newConnection() method





[jira] [Updated] (HIVE-8476) JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with Kerberos

2014-10-15 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8476:
--
Attachment: HIVE-8476.patch

 JavaDoc updates to HiveEndPoint.newConnection() for secure streaming with 
 Kerberos
 --

 Key: HIVE-8476
 URL: https://issues.apache.org/jira/browse/HIVE-8476
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0

 Attachments: HIVE-8476.patch


 Add additional notes on using kerberos authenticated streaming connection in 
 HiveEndPoint.newConnection() method





[jira] [Created] (HIVE-8427) Hive Streaming : throws NPE seen when streaming to secure Hive

2014-10-10 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-8427:
-

 Summary: Hive Streaming :  throws NPE seen when streaming to 
secure Hive
 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0




{code}
2014-10-08 08:13:48,745 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException
    at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:375)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:265)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:238)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:175)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:152)
    at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:293)
    at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:290)
    at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:347)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
{code}





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
Summary: Hive Streaming :  secure streaming hangs.  (was: Hive Streaming :  
throws NPE seen when streaming to secure Hive)

 Hive Streaming :  secure streaming hangs.
 -

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0


 {code}
 2014-10-08 08:13:48,745 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
 org.apache.flume.EventDeliveryException: java.lang.NullPointerException
     at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:375)
     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
     at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.NullPointerException
     at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:265)
     at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:238)
     at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:175)
     at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:152)
     at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:293)
     at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:290)
     at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:347)
     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     ... 1 more
 {code}





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs leading to time outs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
Summary: Hive Streaming :  secure streaming hangs leading to time outs.  
(was: Hive Streaming :  secure streaming hangs.)

 Hive Streaming :  secure streaming hangs leading to time outs.
 --

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0


 The enableSasl setting 





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
Description: The enableSasl setting   (was: 

{code}
2014-10-08 08:13:48,745 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException
    at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:375)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:265)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.init(HiveEndPoint.java:238)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:175)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:152)
    at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:293)
    at org.apache.flume.sink.hive.HiveWriter$6.call(HiveWriter.java:290)
    at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:347)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    ... 1 more
{code})

 Hive Streaming :  secure streaming hangs.
 -

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0


 The enableSasl setting 





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs leading to time outs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
Attachment: HIVE-8427.patch

Patch sets METASTORE_USE_THRIFT_SASL for secure streaming

 Hive Streaming :  secure streaming hangs leading to time outs.
 --

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
 Fix For: 0.14.0

 Attachments: HIVE-8427.patch


 The enableSasl setting 





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs leading to time outs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
   Labels: ACID Streaming  (was: )
Affects Version/s: 0.14.0
   Status: Patch Available  (was: Open)

 Hive Streaming :  secure streaming hangs leading to time outs.
 --

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.14.0

 Attachments: HIVE-8427.patch


 The enableSasl setting 





[jira] [Updated] (HIVE-8427) Hive Streaming : secure streaming hangs leading to time outs.

2014-10-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-8427:
--
Description: Need to enable Thrift Sasl setting for secure mode 
communication  (was: The enableSasl setting )

 Hive Streaming :  secure streaming hangs leading to time outs.
 --

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.14.0

 Attachments: HIVE-8427.patch


 Need to enable Thrift Sasl setting for secure mode communication





[jira] [Commented] (HIVE-8427) Hive Streaming : secure streaming hangs leading to time outs.

2014-10-10 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167763#comment-14167763
 ] 

Roshan Naik commented on HIVE-8427:
---

Not a security expert, but here are my thoughts... 

The streaming client (like flume) does not run on the hadoop cluster. It may in 
fact be streaming to one or more clusters.

The reason for using ugi.hasKerberosCredentials() is not to detect if the 
hadoop cluster is security enabled, but to check if the streaming client has 
decided to use secure mode connection (by initializing kerberos on the ugi 
object). The client should be able to maintain  multiple connections .. one to 
a secure cluster, and another to a non-secure cluster.
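
A small sketch of that per-connection decision, assuming the result of
ugi.hasKerberosCredentials() is passed in as a boolean and the metastore
client configuration is modeled as a plain map. The class and method names
are hypothetical; hive.metastore.sasl.enabled is the property behind
METASTORE_USE_THRIFT_SASL. Leaving the setting untouched for non-secure
connections is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class SecureModeSketch {
    static final String SASL_KEY = "hive.metastore.sasl.enabled";

    /**
     * Builds the configuration for one connection. SASL is switched on only
     * when the client itself chose secure mode, not based on any global
     * cluster-wide security setting.
     */
    static Map<String, String> connectionConf(Map<String, String> base,
                                              boolean clientHasKerberosCredentials) {
        Map<String, String> conf = new HashMap<>(base);   // per-connection copy
        if (clientHasKerberosCredentials) {
            conf.put(SASL_KEY, "true");
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();

        // One client process, two connections: one secure, one not.
        Map<String, String> secure = connectionConf(base, true);
        Map<String, String> plain = connectionConf(base, false);

        System.out.println("secure connection SASL: " + secure.get(SASL_KEY));
        System.out.println("plain connection SASL:  " + plain.get(SASL_KEY));
    }
}
```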

 Hive Streaming :  secure streaming hangs leading to time outs.
 --

 Key: HIVE-8427
 URL: https://issues.apache.org/jira/browse/HIVE-8427
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.14.0

 Attachments: HIVE-8427.patch


 Need to enable Thrift Sasl setting for secure mode communication





[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-09-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14136516#comment-14136516
 ] 

Roshan Naik commented on HIVE-7508:
---

[~leftylev] FYI... I have updated the wiki 

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.





[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-09-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120448#comment-14120448
 ] 

Roshan Naik commented on HIVE-7508:
---

[~leftylev]. Yes Thanks for bringing it up. I will work with [~alangates] on 
updating that.

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.





[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-09-03 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14120854#comment-14120854
 ] 

Roshan Naik commented on HIVE-7508:
---

[~leftylev] or [~ashutoshc] can you grant me write permission on that wiki ?

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming, TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.





[jira] [Commented] (HIVE-7508) Kerberos support for streaming

2014-07-25 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075028#comment-14075028
 ] 

Roshan Naik commented on HIVE-7508:
---

Above errors are unrelated to patch.

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7508) Kerberos support for streaming

2014-07-24 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-7508:
-

 Summary: Kerberos support for streaming
 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik


Add kerberos support for streaming to secure Hive cluster.





[jira] [Updated] (HIVE-7508) Kerberos support for streaming

2014-07-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7508:
--

Attachment: HIVE-7508.patch

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.





[jira] [Updated] (HIVE-7508) Kerberos support for streaming

2014-07-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7508:
--

Status: Patch Available  (was: Open)

 Kerberos support for streaming
 --

 Key: HIVE-7508
 URL: https://issues.apache.org/jira/browse/HIVE-7508
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.1
Reporter: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7508.patch


 Add kerberos support for streaming to secure Hive cluster.





[jira] [Created] (HIVE-7192) Hive Streaming - Some required settings are not mentioned in the documentation

2014-06-06 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-7192:
-

 Summary: Hive Streaming - Some required settings are not mentioned 
in the documentation
 Key: HIVE-7192
 URL: https://issues.apache.org/jira/browse/HIVE-7192
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Roshan Naik
Assignee: Roshan Naik


Specifically:
 - hive.support.concurrency on metastore
 - hive.vectorized.execution.enabled for query client







[jira] [Updated] (HIVE-7192) Hive Streaming - Some required settings are not mentioned in the documentation

2014-06-06 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7192:
--

Attachment: HIVE-7192.patch

uploading patch

 Hive Streaming - Some required settings are not mentioned in the documentation
 --

 Key: HIVE-7192
 URL: https://issues.apache.org/jira/browse/HIVE-7192
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7192.patch


 Specifically:
  - hive.support.concurrency on metastore
  - hive.vectorized.execution.enabled for query client





[jira] [Updated] (HIVE-7192) Hive Streaming - Some required settings are not mentioned in the documentation

2014-06-06 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7192:
--

Status: Patch Available  (was: Open)

 Hive Streaming - Some required settings are not mentioned in the documentation
 --

 Key: HIVE-7192
 URL: https://issues.apache.org/jira/browse/HIVE-7192
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7192.patch


 Specifically:
  - hive.support.concurrency on metastore
  - hive.vectorized.execution.enabled for query client





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-06-06 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: (was: Hive Streaming Ingest API for v4 patch.pdf)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, HIVE-5687.v7.patch, Hive Streaming Ingest API for v3 
 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm
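The atomic-commit and visibility requirements above can be pictured with a small toy model. This is only a sketch under stated assumptions: the class `AtomicBatchSketch` and its methods are hypothetical names invented for illustration, not the Hive Streaming API, and everything lives in memory rather than in HDFS.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only (hypothetical names, not the Hive Streaming API).
// Records written inside a transaction stay in a private buffer and
// become visible to readers all at once when commit() is called.
class AtomicBatchSketch {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    // Buffer a record; it is not yet visible to queries.
    synchronized void write(String record) {
        pending.add(record);
    }

    // Publish every buffered record in one step.
    synchronized void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    // Discard every buffered record; readers never see them.
    synchronized void abort() {
        pending.clear();
    }

    // What a query would see: committed records only.
    synchronized List<String> visibleToQueries() {
        return Collections.unmodifiableList(new ArrayList<>(committed));
    }
}
```

A client would call write() for each incoming record and commit() periodically; between commits, visibleToQueries() keeps returning the previously committed records only.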





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-06-06 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: Hive Streaming Ingest API for v4 patch.pdf

updating 'Hive Streaming Ingest API for v4 patch.pdf'
  document with requirements

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, HIVE-5687.v7.patch, Hive Streaming Ingest API for v3 
 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Created] (HIVE-7153) HiveStreamin - Minor bug in TransactionBatch.toString() method

2014-05-30 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-7153:
-

 Summary: HiveStreamin - Minor bug in TransactionBatch.toString() 
method
 Key: HIVE-7153
 URL: https://issues.apache.org/jira/browse/HIVE-7153
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik


The toString() method currently returns:

{code}
return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
  + txnIds.get(txnIds.size()-1)
  + "] on endPoint= " + endPt;
{code}


The "src/gen/thrift" there is a typo and needs to be replaced with "...".
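For illustration, a minimal sketch of what the corrected method might look like. The class `TxnBatchToStringSketch` is a hypothetical stand-in that models only the two fields the method reads (txnIds and endPt, as named in the snippet above); it is not the actual patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the streaming transaction batch class; only
// the two fields read by toString() are modelled here.
class TxnBatchToStringSketch {
    private final List<Long> txnIds;
    private final String endPt;

    TxnBatchToStringSketch(List<Long> txnIds, String endPt) {
        this.txnIds = txnIds;
        this.endPt = endPt;
    }

    @Override
    public String toString() {
        // "..." separates the first and last transaction id of the batch
        return "TxnIds=[" + txnIds.get(0) + "..."
                + txnIds.get(txnIds.size() - 1)
                + "] on endPoint= " + endPt;
    }

    public static void main(String[] args) {
        // prints: TxnIds=[11...13] on endPoint= ep
        System.out.println(new TxnBatchToStringSketch(Arrays.asList(11L, 12L, 13L), "ep"));
    }
}
```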







[jira] [Updated] (HIVE-7153) HiveStreamin - Minor bug in TransactionBatch.toString() method

2014-05-30 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7153:
--

Description: 
The TransactionBatchImpl.toString() method currently returns:

{code}
return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
  + txnIds.get(txnIds.size()-1)
  + "] on endPoint= " + endPt;
{code}


The "src/gen/thrift" there is a typo and needs to be replaced with "...".



  was:
The toString() method current returns :

{code}
return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
  + txnIds.get(txnIds.size()-1)
  + "] on endPoint= " + endPt;
{code}


The src/gen/thrift there is a typo and needs to replaced with  ...




 HiveStreamin - Minor bug in TransactionBatch.toString() method
 --

 Key: HIVE-7153
 URL: https://issues.apache.org/jira/browse/HIVE-7153
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming

 The TransactionBatchImpl.toString() method currently returns:
 {code}
 return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
   + txnIds.get(txnIds.size()-1)
   + "] on endPoint= " + endPt;
 {code}
 The "src/gen/thrift" there is a typo and needs to be replaced with "...".





[jira] [Updated] (HIVE-7153) HiveStreamin - Minor bug in TransactionBatch.toString() method

2014-05-30 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7153:
--

Attachment: HIVE-7153.patch

uploading patch

 HiveStreamin - Minor bug in TransactionBatch.toString() method
 --

 Key: HIVE-7153
 URL: https://issues.apache.org/jira/browse/HIVE-7153
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7153.patch


 The TransactionBatchImpl.toString() method currently returns:
 {code}
 return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
   + txnIds.get(txnIds.size()-1)
   + "] on endPoint= " + endPt;
 {code}
 The "src/gen/thrift" there is a typo and needs to be replaced with "...".





[jira] [Updated] (HIVE-7153) HiveStreamin - Minor bug in TransactionBatch.toString() method

2014-05-30 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7153:
--

Affects Version/s: 0.13.0
   Status: Patch Available  (was: Open)

 HiveStreamin - Minor bug in TransactionBatch.toString() method
 --

 Key: HIVE-7153
 URL: https://issues.apache.org/jira/browse/HIVE-7153
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7153.patch


 The TransactionBatchImpl.toString() method currently returns:
 {code}
 return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
   + txnIds.get(txnIds.size()-1)
   + "] on endPoint= " + endPt;
 {code}
 The "src/gen/thrift" there is a typo and needs to be replaced with "...".





[jira] [Updated] (HIVE-7153) HiveStreaming - Bug in TransactionBatch.toString() method

2014-05-30 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-7153:
--

Summary: HiveStreaming - Bug in TransactionBatch.toString() method  (was: 
HiveStreamin - Minor bug in TransactionBatch.toString() method)

 HiveStreaming - Bug in TransactionBatch.toString() method
 -

 Key: HIVE-7153
 URL: https://issues.apache.org/jira/browse/HIVE-7153
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming
 Attachments: HIVE-7153.patch


 The TransactionBatchImpl.toString() method currently returns:
 {code}
 return "TxnIds=[" + txnIds.get(0) + "src/gen/thrift"
   + txnIds.get(txnIds.size()-1)
   + "] on endPoint= " + endPt;
 {code}
 The "src/gen/thrift" there is a typo and needs to be replaced with "...".





[jira] [Commented] (HIVE-6890) Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side

2014-04-11 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966966#comment-13966966
 ] 

Roshan Naik commented on HIVE-6890:
---

Test failure is unrelated to patch.

 Bug in HiveStreaming API causes problems if hive-site.xml is missing on 
 streaming client side
 -

 Key: HIVE-6890
 URL: https://issues.apache.org/jira/browse/HIVE-6890
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-6890.patch


 Incorrect conf object being passed to MetaStore client in 
 AbstractRecordWriter  is causing the issue.





[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-10 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965129#comment-13965129
 ] 

Roshan Naik commented on HIVE-5687:
---

[~leftylev] Yes, it looks like it went unnoticed because of the short time frame. For 
some reason I never got a notification of your review. We can get it in via 
another patch, but it appears to be too late to get it into this release.

[~orahive] You can query the data while it is being streamed into Hive. Queries 
will always see a consistent view of the data as this feature relies on the new 
ACID support in Hive. So queries will not see new data that was committed after 
they began executing. 

FLUME-1734 consumes this API to implement a Flume sink that streams data 
continuously into Hive.
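As a rough illustration of the consistent-view behaviour described above, here is a toy in-memory sketch. The class `SnapshotReadSketch` and the idea of a simple commit sequence number are assumptions made for this example only; they are not how Hive's ACID layer is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names). Each committed batch gets a
// sequence number; a query remembers the latest sequence number at the
// moment it starts and ignores anything committed afterwards.
class SnapshotReadSketch {
    private final List<String> records = new ArrayList<>();
    private final List<Long> commitSeqs = new ArrayList<>();
    private long lastCommitSeq = 0;

    // Commit a whole batch under one new sequence number.
    synchronized void commitBatch(List<String> batch) {
        lastCommitSeq++;
        for (String r : batch) {
            records.add(r);
            commitSeqs.add(lastCommitSeq);
        }
    }

    // A query takes its snapshot when it begins executing.
    synchronized long beginQuery() {
        return lastCommitSeq;
    }

    // Return only the records committed at or before the snapshot.
    synchronized List<String> read(long snapshot) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (commitSeqs.get(i) <= snapshot) {
                out.add(records.get(i));
            }
        }
        return out;
    }
}
```

A query that called beginQuery() before a later commitBatch() keeps seeing the older data for as long as it runs, while a query started afterwards sees the new batch as well.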

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, HIVE-5687.v7.patch, Hive Streaming Ingest API for v3 
 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-6890) Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side

2014-04-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-6890:
--

Attachment: HIVE-6890.patch

uploading patch

 Bug in HiveStreaming API causes problems if hive-site.xml is missing on 
 streaming client side
 -

 Key: HIVE-6890
 URL: https://issues.apache.org/jira/browse/HIVE-6890
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-6890.patch


 Incorrect conf object being passed to MetaStore client in 
 AbstractRecordWriter  is causing the issue.





[jira] [Created] (HIVE-6890) Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side

2014-04-10 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-6890:
-

 Summary: Bug in HiveStreaming API causes problems if hive-site.xml 
is missing on streaming client side
 Key: HIVE-6890
 URL: https://issues.apache.org/jira/browse/HIVE-6890
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-6890.patch

Incorrect conf object being passed to MetaStore client in AbstractRecordWriter  
is causing the issue.





[jira] [Updated] (HIVE-6890) Bug in HiveStreaming API causes problems if hive-site.xml is missing on streaming client side

2014-04-10 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-6890:
--

Status: Patch Available  (was: Open)

 Bug in HiveStreaming API causes problems if hive-site.xml is missing on 
 streaming client side
 -

 Key: HIVE-6890
 URL: https://issues.apache.org/jira/browse/HIVE-6890
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-6890.patch


 Incorrect conf object being passed to MetaStore client in 
 AbstractRecordWriter  is causing the issue.





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-09 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v7.patch

patch v7 using package.html from Owen and fixing a bug in packaging

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, HIVE-5687.v7.patch, Hive Streaming Ingest API for v3 
 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963560#comment-13963560
 ] 

Roshan Naik commented on HIVE-5687:
---

Owen: Thanks a lot for revising package.html

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Commented] (HIVE-5687) Streaming support in Hive

2014-04-08 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963820#comment-13963820
 ] 

Roshan Naik commented on HIVE-5687:
---

I had posted the revised patch on RB

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf, package.html


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-07 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v6.patch

Addressing review comments from Alan and Owen, and some from Lars.

Owen: DDL was used there mostly for convenience and correctness. The other 
places where the API is used cannot be accomplished via DDL.

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
 HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
 Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-04 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: (was: HIVE-5687.v5.patch)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, Hive Streaming Ingest API for v3 
 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-04 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v5.patch

refreshing patch v5 with minor fix to compile with hadoop1 profile

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ACID, Streaming
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, Hive Streaming 
 Ingest API for v3 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-03 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v5.patch

v5 patch addresses Owen's comments: fixes for unit test issues

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, Hive Streaming 
 Ingest API for v3 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-03 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

 Tags: Streaming ACID 
Fix Version/s: 0.13.0
   Labels: ACID Streaming  (was: )
 Release Note: New transactional APIs to support Streaming data directly 
into Hive.
   Status: Patch Available  (was: Open)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: Streaming, ACID
 Fix For: 0.13.0

 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
 HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
 HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, Hive Streaming 
 Ingest API for v3 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-01 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: Hive Streaming Ingest API for v4 patch.pdf
HIVE-5687.v4.patch

v4 patch: adds JSON writer support and tweaks to the Javadocs.
Updated the PDF document.

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch, HIVE-5687.v3.patch, HIVE-5687.v4.patch, Hive Streaming 
 Ingest API for v3 patch.pdf, Hive Streaming Ingest API for v4 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-03-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v3.patch

- Addressed review comments from Alan. 
- Tweaked the APIs
- Wrote Java docs
- Added heartbeat support
- Improved log messages
- More tests
- Fixes for multiple bugs found during unit testing and manual testing

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch, HIVE-5687.v3.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-03-27 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: Hive Streaming Ingest API for v3 patch.pdf

Adding design & spec documentation for v3 patch

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch, HIVE-5687.v3.patch, Hive Streaming Ingest API for v3 
 patch.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Assigned] (HIVE-2442) Metastore upgrade script and schema DDL for Hive 0.8.0

2014-03-19 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik reassigned HIVE-2442:
-

Assignee: Roshan Naik  (was: Carl Steinbach)

 Metastore upgrade script and schema DDL for Hive 0.8.0
 --

 Key: HIVE-2442
 URL: https://issues.apache.org/jira/browse/HIVE-2442
 Project: Hive
  Issue Type: Task
  Components: Metastore
Reporter: Carl Steinbach
Assignee: Roshan Naik
Priority: Blocker
 Fix For: 0.8.0

 Attachments: HIVE-2442-branch-08.1.patch.txt, 
 HIVE-2442-trunk.1.patch.txt








[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-03-04 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-5317

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: (was: HIVE-5687.v2.patch)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v2.patch

updating patch v2 with minor tweaks

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: 5687-api-spec4.docx
HIVE-5687.v2.patch

Revised the API in the patch and the spec to handle mapping of the incoming 
data format to the corresponding columns in the table (RecordWriter interface). 
Added out-of-the-box support for delimited text formats; more formats are pluggable.

Added support for auto creation of new partitions for streaming clients
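
As a rough illustration of what mapping a delimited record onto table 
columns involves (this is not the RecordWriter interface from the patch; the 
function name, its parameters, and the choice to leave unmatched columns as 
None are all assumptions made for this sketch):

```python
def map_delimited_record(line, field_names, table_columns, delimiter=","):
    """Reorder the fields of one delimited record to match the table's columns.

    field_names   -- names of the fields in the order they appear in the record
    table_columns -- column names in the order the table declares them
    """
    values = line.split(delimiter)
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} fields, got {len(values)}")
    by_name = dict(zip(field_names, values))
    # Assumption: table columns absent from the incoming record become None.
    return [by_name.get(column) for column in table_columns]


row = map_delimited_record(
    "1,alice,US",
    field_names=["id", "name", "country"],
    table_columns=["id", "country", "name", "ts"],
)
print(row)
```

A pluggable design would put other formats behind the same kind of 
function, one implementation per input format.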

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.docx, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: (was: 5687-api-spec4.docx)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 
 5687-draft-api-spec3.pdf, HIVE-5687.patch, HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-24 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: 5687-api-spec4.pdf

fixing typos in spec v4

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-11 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.patch

Initial patch (depends on HIVE-5843 and HIVE-6060)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 
 HIVE-5687.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-11 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: 5687-draft-api-spec3.pdf

spec updated to match first draft patch

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 
 5687-draft-api-spec3.pdf, HIVE-5687.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Commented] (HIVE-6315) MetaStoreDirectSql ctor should not throw

2014-01-27 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883136#comment-13883136
 ] 

Roshan Naik commented on HIVE-6315:
---

I see the following exception with this patch applied:

org.datanucleus.api.jdo.exceptions.ClassNotPersistenceCapableException: The class org.apache.hadoop.hive.metastore.model.MVersionTable is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:380)
    at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
    at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
    at org.apache.hadoop.hive.metastore.ObjectStore.setMetaStoreSchemaVersion(ObjectStore.java:6025)
    at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:5935)
    at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:5913)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122)
    at com.sun.proxy.$Proxy7.verifySchema(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:389)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:427)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:314)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:274)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.init(RetryingHMSHandler.java:54)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4175)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:115)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:98)
    at org.apache.hive.streaming.TestStreaming.setup(TestStreaming.java:45)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:27)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.junit.runners.Suite.runChild(Suite.java:128)
    at org.junit.runners.Suite.runChild(Suite.java:24)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:77)
at 

[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-01-23 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: 5687-draft-api-spec2.pdf

revising draft spec

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-01-20 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: 5687-draft-api-spec.pdf

Attaching draft api spec for comments

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-draft-api-spec.pdf


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Resolved] (HIVE-4196) Support for Streaming Partitions in Hive

2013-10-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik resolved HIVE-4196.
---

Resolution: Won't Fix

In view of HIVE-5317, which brings insert/update/delete support to Hive, 
introducing streaming partitions is no longer necessary. Streaming support 
can be provided with far less complexity by leveraging HIVE-5317.

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.pdf, HIVE-4196.v1.patch


 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files can be committed periodically and made available for queries before the
 partition is closed and converted into a standard partition.
 The admin enables streaming partition on a table using DDL. He provides the 
 following pieces of information:
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition  
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes
  - Streaming clients request (via webhcat) an HDFS file name into which
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via webhdfs) to that file and commit
 it (via webhcat). Committing merely indicates that the chunk has been written
 completely and is ready for serving queries.
  - When the partition is rolled, all committed chunks are swept into a single
 directory and a standard partition pointing to that directory is created. The
 streaming partition is closed and a new streaming partition is created. Rolling
 the partition is atomic. Streaming clients are agnostic of partition rolling.
  - Hive queries will be able to query the partition that is currently open
 for streaming. Only committed chunks will be visible. Read consistency will
 be ensured so that repeated reads of the same partition will be idempotent
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is
 time to roll and trigger the roll. This could be achieved either by using
 an external agent such as Oozie (preferably) or an internal agent.
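
 The life cycle above can be sketched as a small in-memory model. This is
 only an illustration, not code from the patch: the class and method names
 (StreamingTableModel, chunk_get, chunk_commit, roll) are hypothetical, the
 metastore and HDFS are replaced by plain Python objects, and letting a chunk
 that is still uncommitted at roll time be committed into the next streaming
 partition is an assumption the description does not spell out.

```python
class StreamingTableModel:
    """Toy model of a table with one open streaming partition (hypothetical names)."""

    def __init__(self, first_partition):
        self.standard_partitions = {}   # partition name -> committed chunk names
        self.current_name = first_partition
        self.current_committed = []     # chunks visible to queries
        self.pending = set()            # chunks handed out but not yet committed
        self._next_chunk = 0

    def chunk_get(self):
        """Hand out a fresh chunk name for a client to write into."""
        name = f"chunk-{self._next_chunk}"
        self._next_chunk += 1
        self.pending.add(name)
        return name

    def chunk_commit(self, chunk):
        """Mark a fully written chunk as ready for queries."""
        self.pending.remove(chunk)      # KeyError if unknown or already committed
        self.current_committed.append(chunk)

    def roll(self, new_partition):
        """Close the streaming partition as a standard one and open a new one."""
        self.standard_partitions[self.current_name] = self.current_committed
        self.current_name = new_partition
        self.current_committed = []
        # Assumption: pending chunks stay pending and land in the new partition.

    def visible_chunks(self):
        """Chunks a query may read: standard partitions plus committed chunks."""
        chunks = []
        for committed in self.standard_partitions.values():
            chunks.extend(committed)
        return chunks + self.current_committed


table = StreamingTableModel("2013-10-29-00")
first = table.chunk_get()
second = table.chunk_get()
table.chunk_commit(first)          # only the first chunk is visible now
table.roll("2013-10-29-01")        # first chunk moves to a standard partition
table.chunk_commit(second)         # lands in the new streaming partition
print(table.visible_chunks())
```

 In the model a client only ever calls chunk_get and chunk_commit, which is
 one way to read "streaming clients are agnostic of partition rolling".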



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HIVE-5687) Streaming support in Hive

2013-10-29 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5687:
-

 Summary: Streaming support in Hive
 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik


Implement support for Streaming data into HIVE.
- Provide a client streaming API 
- Transaction support: Clients should be able to periodically commit a batch of 
records atomically
- Immediate visibility: Records should be immediately visible to queries on 
commit
- Should not overload HDFS with too many small files

Use Cases:
 - Streaming logs into HIVE via Flume
 - Streaming results of computational from Storm
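
The two middle requirements (atomic batch commit and visibility on commit) 
can be pictured with a minimal single-threaded sketch. Nothing here is the 
proposed API: StreamingTable, Batch and their methods are hypothetical names, 
and an in-memory list stands in for the table.

```python
class StreamingTable:
    """In-memory stand-in for a table; queries see committed records only."""

    def __init__(self):
        self._committed = []

    def begin_batch(self):
        return Batch(self)

    def scan(self):
        # Return a copy so callers cannot observe later commits through it.
        return list(self._committed)


class Batch:
    """Buffers records until commit() publishes all of them in one step."""

    def __init__(self, table):
        self._table = table
        self._pending = []
        self._closed = False

    def write(self, record):
        if self._closed:
            raise RuntimeError("batch already committed or aborted")
        self._pending.append(record)

    def commit(self):
        if self._closed:
            raise RuntimeError("batch already committed or aborted")
        # In this single-threaded model the whole batch becomes visible at once.
        self._table._committed.extend(self._pending)
        self._pending = []
        self._closed = True

    def abort(self):
        # Discard buffered records; nothing was ever visible to queries.
        self._pending = []
        self._closed = True


table = StreamingTable()
batch = table.begin_batch()
batch.write("event-1")
batch.write("event-2")
print(table.scan())   # nothing visible before commit
batch.commit()
print(table.scan())   # both records visible after commit
```

The sketch ignores the small-files concern entirely; it only shows the 
commit boundary that queries would observe.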





[jira] [Updated] (HIVE-5687) Streaming support in Hive

2013-10-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Description: 
Implement support for Streaming data into HIVE.
- Provide a client streaming API 
- Transaction support: Clients should be able to periodically commit a batch of 
records atomically
- Immediate visibility: Records should be immediately visible to queries on 
commit
- Should not overload HDFS with too many small files

Use Cases:
 - Streaming logs into HIVE via Flume
 - Streaming results of computations from Storm

  was:
Implement support for Streaming data into HIVE.
- Provide a client streaming API 
- Transaction support: Clients should be able to periodically commit a batch of 
records atomically
- Immediate visibility: Records should be immediately visible to queries on 
commit
- Should not overload HDFS with too many small files

Use Cases:
 - Streaming logs into HIVE via Flume
 - Streaming results of computational from Storm


 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik

 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Assigned] (HIVE-5687) Streaming support in Hive

2013-10-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik reassigned HIVE-5687:
-

Assignee: Roshan Naik

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik

 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm





[jira] [Commented] (HIVE-4196) Support for Streaming Partitions in Hive

2013-10-29 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808548#comment-13808548
 ] 

Roshan Naik commented on HIVE-4196:
---

Moving the streaming work to a new jira HIVE-5687 since it will be based on a 
different design.

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.pdf, HIVE-4196.v1.patch


 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files can be committed periodically and made available for queries before the
 partition is closed and converted into a standard partition.
 The admin enables streaming partition on a table using DDL. He provides the 
 following pieces of information:
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition  
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes
  - Streaming clients request (via webhcat) an HDFS file name into which
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via webhdfs) to that file and commit
 it (via webhcat). Committing merely indicates that the chunk has been written
 completely and is ready for serving queries.
  - When the partition is rolled, all committed chunks are swept into a single
 directory and a standard partition pointing to that directory is created. The
 streaming partition is closed and a new streaming partition is created. Rolling
 the partition is atomic. Streaming clients are agnostic of partition rolling.
  - Hive queries will be able to query the partition that is currently open
 for streaming. Only committed chunks will be visible. Read consistency will
 be ensured so that repeated reads of the same partition will be idempotent
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is
 time to roll and trigger the roll. This could be achieved either by using
 an external agent such as Oozie (preferably) or an internal agent.





[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776946#comment-13776946
 ] 

Roshan Naik commented on HIVE-5138:
---

Capturing API-related comments from [~ashutoshc] noted 
[here|https://issues.apache.org/jira/browse/HIVE-4196?focusedCommentId=13770314&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13770314]
 in HIVE-4196

{quote}
We should try to eliminate the need of intermediate staging area while rolling 
on new partitions. Seems like there should not be any gotchas while moving data 
from streaming dir to partition dir directly.
We should make thrift apis in metastore forward compatible. One way to do that 
is to use struct (which contains all parameters) instead of passing in list of 
arguments.
We should try to leave TBLS table untouched in backend db. That will simplify 
upgrade story. One way to do that is to have all new columns in a new table and 
then add constraints for this new table.
{quote}
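
As a rough illustration of the struct suggestion (not taken from the patch 
or from the metastore Thrift definitions; PartitionRollRequest, its fields, 
and partition_roll are hypothetical names), a single request object lets a 
later version add an optional field without changing the call signature that 
existing callers use:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartitionRollRequest:
    """All parameters of the call travel in one struct-like object."""
    db_name: str
    table_name: str
    new_partition: str
    # Hypothetical field added in a later version; older callers omit it.
    requested_by: Optional[str] = None


def partition_roll(request):
    """The signature stays the same no matter how many fields the request gains."""
    who = request.requested_by or "unknown"
    return (f"roll {request.db_name}.{request.table_name} "
            f"-> {request.new_partition} (by {who})")


# A caller written before requested_by existed keeps working unchanged.
print(partition_roll(PartitionRollRequest("default", "logs", "p2")))
# A newer caller can fill in the extra field.
print(partition_roll(PartitionRollRequest("default", "logs", "p2",
                                          requested_by="flume")))
```

With a plain argument list, the same change would have meant a new method 
or a changed signature.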

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch, HIVE-5138.v1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4196) Support for Streaming Partitions in Hive

2013-09-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776952#comment-13776952
 ] 

Roshan Naik commented on HIVE-4196:
---

Thanks Ashutosh. Since your recommendations apply to subtask HIVE-5138, I have 
copied your comments over to it. I will address them there.

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.pdf, HIVE-4196.v1.patch


 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files can be committed periodically and made available for queries before the
 partition is closed and converted into a standard partition.
 The admin enables streaming partition on a table using DDL. He provides the 
 following pieces of information:
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition  
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes
  - Streaming clients request (via webhcat) an HDFS file name into which
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via webhdfs) to that file and commit
 it (via webhcat). Committing merely indicates that the chunk has been written
 completely and is ready for serving queries.
  - When the partition is rolled, all committed chunks are swept into a single
 directory and a standard partition pointing to that directory is created. The
 streaming partition is closed and a new streaming partition is created. Rolling
 the partition is atomic. Streaming clients are agnostic of partition rolling.
  - Hive queries will be able to query the partition that is currently open
 for streaming. Only committed chunks will be visible. Read consistency will
 be ensured so that repeated reads of the same partition will be idempotent
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is
 time to roll and trigger the roll. This could be achieved either by using
 an external agent such as Oozie (preferably) or an internal agent.



[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776954#comment-13776954
 ] 

Roshan Naik commented on HIVE-5138:
---

bq.  We should try to eliminate the need of intermediate staging area while 
rolling on new partitions. Seems like there should not be any gotchas while 
moving data from streaming dir to partition dir directly.

Thanks. That change is already part of the patch.

bq. We should make thrift apis in metastore forward compatible. One way to do 
that is to use struct (which contains all parameters) instead of passing in 
list of arguments.

Hate it .. but Ok. :-)


bq. We should try to leave TBLS table untouched in backend db. 
Sure. Will move them to a new table.



 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch, HIVE-5138.v1.patch






[jira] [Updated] (HIVE-5138) Streaming - Web HCat API

2013-09-17 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5138:
--

Attachment: HIVE-5138.v1.patch

Patch addresses comments from Eugene, adds a unit test, and adds some 
checks in partitionRoll for better error reporting.

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch, HIVE-5138.v1.patch






[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769209#comment-13769209
 ] 

Roshan Naik commented on HIVE-5138:
---

Thanks [~ekoifman] for the comments:

h5. On Pt 1.
 Thanks. I need to take a closer look at this.


h5. On Pt 2.
 I think you mean 'safe to invoke concurrently' rather than 'atomic', since 
intermediate states will be visible when an operation spans both the file 
system and the metastore. Here is a summary of why each operation is 
concurrency-safe:

 - *streamingStatus* : Read-only metastore operation.
 - *chunkGet* : An atomic metastore operation.
 - *chunkAbort* : Just deletes a file, so there are no concurrency issues.
 - *chunkCommit* : Just renames a file, so only one of several concurrent 
operations will succeed.
 - *disableStreaming* : An atomic metastore operation.
 - *enableStreaming* : Does a couple of mkdirs() calls (for setup) followed by 
an atomic metastore operation. mkdirs() is idempotent, so all concurrent calls 
succeed. All concurrent invocations enter a transaction to do the metastore 
update atomically; only one should update the metastore.

 - *partitionRoll* : Creates an empty dir for the new current partition, then 
atomically updates the metastore as follows:
   -# Make note of this new current partition dir.
   -# Do an addPartition() on the previous current partition.

- If concurrent partitionRoll() invocations use the same arguments, the 
addPartition() step will fail on all but one. If the arguments differ across 
concurrent invocations, they all succeed, and the updates made by the last 
invocation to exit the metastore transaction override the others.
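
As a rough illustration of two of the points above (idempotent directory setup, and a rename-based commit in which only one caller wins), here is a minimal Python sketch. It uses the local filesystem as a stand-in for HDFS, and the function names enable_dirs and commit_chunk are hypothetical, not part of the patch:

```python
import os
import tempfile


def enable_dirs(root):
    """Idempotent setup: repeated (or concurrent) calls all succeed."""
    path = os.path.join(root, "streaming", "tmp")
    os.makedirs(path, exist_ok=True)
    return path


def commit_chunk(tmp_path, committed_path):
    """Rename-based commit; returns True only for the caller that wins."""
    try:
        os.rename(tmp_path, committed_path)
        return True
    except FileNotFoundError:
        # Another caller already renamed (or aborted) this chunk.
        return False


root = tempfile.mkdtemp()
tmp_dir = enable_dirs(root)
assert enable_dirs(root) == tmp_dir  # the second call is harmless

chunk = os.path.join(tmp_dir, "2")
open(chunk, "w").close()

# Two commit attempts on the same chunk, simulated sequentially.
results = [commit_chunk(chunk, chunk + ".committed") for _ in range(2)]
print(results)  # [True, False]
```

The second attempt fails because the source file no longer exists, which is the property the chunkCommit argument relies on. Real HDFS rename semantics may report the failure differently.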




 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Updated] (HIVE-5138) Streaming - Web HCat API

2013-09-06 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5138:
--

Attachment: HIVE-4196.v2.patch

Patch v2 is based on git commit version 9e9e711

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-06 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760863#comment-13760863
 ] 

Roshan Naik commented on HIVE-5138:
---

Patch v2 addresses the review comments from 
https://issues.apache.org/jira/browse/HIVE-4196?focusedCommentId=13714235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13714235

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-4196) Support for Streaming Partitions in Hive

2013-08-28 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752946#comment-13752946
 ] 

Roshan Naik commented on HIVE-4196:
---


{quote}  According to the Hive coding conventions lines should be bounded at 
100 characters. Many lines in this patch exceed that. {quote}

Will fix the ones that are not in the Thrift-generated files.

{quote} I'm surprised to see that streamingStatus sets the chunk id for the 
table. {quote}

Seems like a bug. Will fix.

{quote}  The logic at the end of these functions doesn't look right. Take 
getNextChunkID for example. If commitTransaction fails (line 2132) rollback 
will be called but the next chunk id will still be returned. It seems you need 
a check on success after commit. I realize many of the calls in the class 
follow this, but it doesn't seem right. {quote}

Good catch. At the time I thought commitTxn() would only fail with an exception 
and never return false. But on closer inspection there is indeed a corner case 
(if rollBack was called) in which it returns false as well. It's a bizarre thing 
for a function to fail both with and without exceptions. But for now I will fix 
my code to live with it.

{quote} In HiveMetaStoreClient.java, is assert what you want? Are you ok with 
the validity of the arguments not being checked most of the time?{quote}

Not all checks are in place yet. Some checks will happen at lower layers, 
some at higher ones. I will be adding more checks.


{quote} I'm trying to figure out whether the chunk files are moved, deleted, or 
left alone during the partition rolling. {quote}

That would depend on whether the table is defined as an external or internal 
table. It is essentially an add_partition of the new partition. It calls 
HiveMetastore.add_partition_core_notxn() inside a transaction.
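
On the commitTransaction point above, the shape of the fix can be sketched roughly as follows. This is a Python illustration only: FakeStore and its methods are hypothetical stand-ins for the metastore's transaction calls, not the actual Hive code. The point is that the commit result is checked before the new chunk id is handed back:

```python
class FakeStore:
    """Hypothetical stand-in for a metastore object store."""

    def __init__(self, commit_succeeds=True):
        self.commit_succeeds = commit_succeeds
        self.chunk_id = 0      # last persisted chunk id
        self._pending = None   # value staged inside the open transaction

    def open_transaction(self):
        self._pending = self.chunk_id

    def set_chunk_id(self, value):
        self._pending = value

    def commit_transaction(self):
        # Mirrors the corner case: failure is reported by returning False,
        # not by raising an exception.
        if self.commit_succeeds:
            self.chunk_id = self._pending
        return self.commit_succeeds

    def rollback_transaction(self):
        self._pending = None


def get_next_chunk_id(store):
    committed = False
    store.open_transaction()
    try:
        next_id = store.chunk_id + 1
        store.set_chunk_id(next_id)
        committed = store.commit_transaction()
    finally:
        if not committed:
            store.rollback_transaction()
    if not committed:
        # Without this check the caller would receive an id that was
        # never persisted.
        raise RuntimeError("commit failed; chunk id was not persisted")
    return next_id


print(get_next_chunk_id(FakeStore()))  # 1
try:
    get_next_chunk_id(FakeStore(commit_succeeds=False))
except RuntimeError as err:
    print("rejected:", err)
```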



 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.pdf, HIVE-4196.v1.patch


 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files can be committed periodically and made available for queries before the 
 partition is closed and converted into a standard partition.
 The admin enables streaming partition on a table using DDL and provides the 
 following pieces of information:
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition 
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes.
  - Streaming clients request (via WebHCat) an HDFS file name into which 
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via WebHDFS) to that file and commit 
 it (via WebHCat). Committing merely indicates that the chunk has been written 
 completely and is ready for serving queries.
  - When the partition is rolled, all committed chunks are swept into a single 
 directory and a standard partition pointing to that directory is created. The 
 streaming partition is closed and a new streaming partition is created. Rolling 
 the partition is atomic. Streaming clients are agnostic of partition rolling.
  - Hive queries will be able to query the partition that is currently open 
 for streaming. Only committed chunks will be visible. Read consistency will 
 be ensured so that repeated reads of the same partition will be idempotent 
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is 
 time to roll and trigger the roll. This could be achieved either by using 
 an external agent such as Oozie (preferably) or an internal agent.
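
 The life cycle above can be modelled with a small in-memory sketch. This is a toy Python model under stated assumptions, not the proposed implementation: the class StreamingTable is hypothetical, its method names merely echo the operations discussed in this thread, and it assumes the value passed to partition_roll names the partition being closed.

```python
class StreamingTable:
    """Toy in-memory model of one table with a streaming partition."""

    def __init__(self):
        self.next_chunk = 0
        self.open_chunks = set()       # handed out, not yet committed
        self.committed = []            # visible to queries
        self.standard_partitions = {}  # partition value -> list of chunks

    def chunk_get(self):
        self.next_chunk += 1
        self.open_chunks.add(self.next_chunk)
        return self.next_chunk

    def chunk_commit(self, chunk):
        # Raises KeyError if the chunk is unknown or already settled.
        self.open_chunks.remove(chunk)
        self.committed.append(chunk)

    def chunk_abort(self, chunk):
        self.open_chunks.remove(chunk)

    def partition_roll(self, partition_value):
        # Sweep committed chunks into a standard partition; chunks that are
        # still open carry over to the new streaming partition.
        self.standard_partitions[partition_value] = self.committed
        self.committed = []

    def visible_chunks(self):
        # Queries see a snapshot of committed chunks only.
        return list(self.committed)


t = StreamingTable()
a, b, c = t.chunk_get(), t.chunk_get(), t.chunk_get()
t.chunk_commit(a)
t.chunk_abort(b)
print(t.visible_chunks())     # [1]
t.partition_roll("1000")
t.chunk_commit(c)
print(t.standard_partitions)  # {'1000': [1]}
print(t.visible_chunks())     # [3]
```

 Note that the client holding chunk 3 commits after the roll without knowing a roll happened, which is what 'agnostic of partition rolling' means here.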


[jira] [Commented] (HIVE-5107) Change hive's build to maven

2013-08-28 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753233#comment-13753233
 ] 

Roshan Naik commented on HIVE-5107:
---

Curious: is Ant's 'makepom' task (which converts an Ivy file into a POM file) a 
useful starting point for such an effort?

 Change hive's build to maven
 

 Key: HIVE-5107
 URL: https://issues.apache.org/jira/browse/HIVE-5107
 Project: Hive
  Issue Type: Task
Reporter: Edward Capriolo
Assignee: Edward Capriolo

 I cannot cope with Hive's build infrastructure any more. I have started 
 working on porting the project to Maven. When I have made some solid progress I 
 will put the entire thing on GitHub for review. Then we can talk about 
 switching the project somehow.



[jira] [Created] (HIVE-5138) Web HCat API for Streaming

2013-08-22 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5138:
-

 Summary: Web HCat  API  for Streaming
 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik






[jira] [Created] (HIVE-5139) Streaming - DDL support for enabling and disabling streaming

2013-08-22 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5139:
-

 Summary: Streaming - DDL support for enabling and disabling 
streaming
 Key: HIVE-5139
 URL: https://issues.apache.org/jira/browse/HIVE-5139
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik






[jira] [Updated] (HIVE-5139) Streaming - DDL support for enabling and disabling streaming

2013-08-22 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5139:
--

Assignee: Roshan Naik

 Streaming - DDL support for enabling and disabling streaming
 

 Key: HIVE-5139
 URL: https://issues.apache.org/jira/browse/HIVE-5139
 Project: Hive
  Issue Type: Sub-task
  Components: Database/Schema, HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ddl, streaming





[jira] [Updated] (HIVE-5139) Streaming - DDL support for enabling and disabling streaming

2013-08-22 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5139:
--

Labels: ddl streaming  (was: )

 Streaming - DDL support for enabling and disabling streaming
 

 Key: HIVE-5139
 URL: https://issues.apache.org/jira/browse/HIVE-5139
 Project: Hive
  Issue Type: Sub-task
  Components: Database/Schema, HCatalog
Reporter: Roshan Naik
  Labels: ddl, streaming





[jira] [Created] (HIVE-5140) Streaming - Active agent for rolling a streaming partition into a standard partition

2013-08-22 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5140:
-

 Summary: Streaming - Active agent for rolling a streaming 
partition into a standard partition
 Key: HIVE-5140
 URL: https://issues.apache.org/jira/browse/HIVE-5140
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik


The task is to implement an entity which rolls all the committed transactions 
from the streaming partition into a new standard partition atomically.




[jira] [Updated] (HIVE-5138) Streaming- Web HCat API

2013-08-22 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5138:
--

Summary: Streaming- Web HCat  API  (was: Web HCat  API  for Streaming)

 Streaming- Web HCat  API
 

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik





[jira] [Commented] (HIVE-5139) Streaming - DDL support for enabling and disabling streaming

2013-08-22 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748149#comment-13748149
 ] 

Roshan Naik commented on HIVE-5139:
---

Task is to implement support for enabling and disabling streaming functionality 
on a Hive table via DDL.

 Streaming - DDL support for enabling and disabling streaming
 

 Key: HIVE-5139
 URL: https://issues.apache.org/jira/browse/HIVE-5139
 Project: Hive
  Issue Type: Sub-task
  Components: Database/Schema, HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik
  Labels: ddl, streaming





[jira] [Commented] (HIVE-5138) Streaming- Web HCat API

2013-08-22 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748165#comment-13748165
 ] 

Roshan Naik commented on HIVE-5138:
---

Implement Webhcat API to: 


1) Enable and Disable streaming on a table

2) Check streaming status

3) Transaction Support:
 -  Get a Chunk File
 -  Commit a Chunk File
 -  Abort the chunk

4) Roll Partition: To roll the committed chunks from streaming partition to a 
new standard partition

 Streaming- Web HCat  API
 

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik





[jira] [Updated] (HIVE-5138) Streaming - Web HCat API

2013-08-22 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5138:
--

Summary: Streaming - Web HCat  API  (was: Streaming- Web HCat  API)

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik





[jira] [Updated] (HIVE-5142) Streaming - Query committed chunks

2013-08-22 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5142:
--

Assignee: Roshan Naik

 Streaming - Query committed chunks
 --

 Key: HIVE-5142
 URL: https://issues.apache.org/jira/browse/HIVE-5142
 Project: Hive
  Issue Type: Sub-task
  Components: Database/Schema, HCatalog
Reporter: Roshan Naik
Assignee: Roshan Naik

 Task is to enable queries to read through the chunks committed into the 
 streaming partition.



[jira] [Created] (HIVE-5142) Streaming - Query committed chunks

2013-08-22 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5142:
-

 Summary: Streaming - Query committed chunks
 Key: HIVE-5142
 URL: https://issues.apache.org/jira/browse/HIVE-5142
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik


Task is to enable queries to read through the chunks committed into the 
streaming partition.



[jira] [Created] (HIVE-5143) Streaming - Compaction of partitions

2013-08-22 Thread Roshan Naik (JIRA)
Roshan Naik created HIVE-5143:
-

 Summary: Streaming - Compaction of partitions
 Key: HIVE-5143
 URL: https://issues.apache.org/jira/browse/HIVE-5143
 Project: Hive
  Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik


Task is to support compaction of partitions.

Rationale: Streaming partitions are composed of a large number of small files 
(each commit is one file). Since compaction can be a potentially expensive 
operation (e.g. converting to a single ORC file), we do not compact the 
streaming partition at the time of rolling it into a standard partition. This 
allows rolling to be quick and atomic.

Compaction will be performed at a later time. The streaming partition is 
converted as is (typically with many small files) into a standard partition. 
This new standard partition will be queued up for compaction by a separate job.

This decouples the compaction feature from streaming support, and makes it more 
generally available for any partitions.
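
The decoupling described above can be sketched as follows. This is a minimal Python illustration under assumptions: the queue, the roll and compact_next functions, and the string concatenation standing in for writing one merged file are all hypothetical, not part of any patch.

```python
from collections import deque

compaction_queue = deque()


def roll(partitions, name, small_files):
    """Publish the partition as is and queue it for later compaction."""
    partitions[name] = list(small_files)
    compaction_queue.append(name)


def compact_next(partitions):
    """Run by a separate job: merge one queued partition's files into one."""
    if not compaction_queue:
        return None
    name = compaction_queue.popleft()
    # Stand-in for an expensive rewrite, e.g. producing a single ORC file.
    merged = "".join(partitions[name])
    partitions[name] = [merged]
    return name


partitions = {}
roll(partitions, "date=1000", ["a,us\n", "b,eu\n", "c,us\n"])
print(len(partitions["date=1000"]))  # 3
compact_next(partitions)
print(partitions["date=1000"])       # ['a,us\nb,eu\nc,us\n']
```

Because roll() only records the partition and enqueues its name, it stays cheap regardless of how many small files there are. The same queue could accept any partition, streaming or not.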



[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive

2013-05-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-4196:
--

Attachment: HCatalogStreamingIngestFunctionalSpecificationandDesign- apr 
29- patch1.pdf

PDF version of the design & spec doc

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.pdf, HIVE-4196.v1.patch




[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive

2013-04-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-4196:
--

Attachment: HIVE-4196.v1.patch

Draft patch for review, based on the phase mentioned in the design doc. Deviates 
slightly:
1) Adds a couple of (temporary) REST calls to enable/disable streaming on a 
table. Later these will be replaced with support in DDL. 

2) Also, all HTTP methods are GET for easy testing with a web browser.

3) Authentication is disabled on the new streaming HTTP methods.


Usage examples on a db named 'sdb' & table named 'log':

1) *Setup db & table with single partition column 'date':*
 hcat -e "create database sdb; use sdb; create table log(msg string, region 
string) partitioned by (date string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY 
',' LINES TERMINATED BY '\n' STORED AS TEXTFILE;"


2) *To check streaming status:*
 http://localhost:50111/templeton/v1/streaming/status?database=sdb&table=log

3) *Enable Streaming:*
 http://localhost:50111/templeton/v1/streaming/enable?database=sdb&table=log&col=date&value=1000

4) *Get Chunk File to write to:*
 http://localhost:50111/templeton/v1/streaming/chunkget?database=sdb&table=log&schema=blah&format=blah&record_separator=blah&field_separator=blah

5) *Commit Chunk File:*
 http://localhost:50111/templeton/v1/streaming/chunkcommit?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/2

6) *Abort Chunk File:*
 http://localhost:50111/templeton/v1/streaming/chunkabort?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/3


7) *Roll Partition:*
 http://localhost:50111/templeton/v1/streaming/partitionroll?database=sdb&table=log&partition_column=date&partition_value=3000
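
For scripted testing, the example requests above can be assembled programmatically. A minimal Python sketch, assuming the same local WebHCat address and the temporary endpoints of this draft patch; the helper name streaming_url is hypothetical:

```python
from urllib.parse import urlencode

# Base address as used in the usage examples; adjust for a real deployment.
BASE = "http://localhost:50111/templeton/v1/streaming"


def streaming_url(operation, **params):
    """Build a GET URL for one of the draft streaming endpoints."""
    # safe="/" keeps HDFS paths readable in the chunkfile parameter.
    return f"{BASE}/{operation}?{urlencode(params, safe='/')}"


print(streaming_url("status", database="sdb", table="log"))
print(streaming_url("enable", database="sdb", table="log",
                    col="date", value=1000))
print(streaming_url("chunkcommit", database="sdb", table="log",
                    chunkfile="/user/hive/streaming/tmp/sdb/log/2"))
```

Using urlencode keeps the parameters joined with '&' and escapes any values that would otherwise break the query string.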

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 
 HCatalogStreamingIngestFunctionalSpecificationandDesign.docx, 
 HIVE-4196.v1.patch




[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive

2013-04-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-4196:
--

Attachment: (was: 
HCatalogStreamingIngestFunctionalSpecificationandDesign.docx)

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HIVE-4196.v1.patch




[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive

2013-04-29 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-4196:
--

Attachment: HCatalogStreamingIngestFunctionalSpecificationandDesign- apr 
29- patch1.docx

 Support for Streaming Partitions in Hive
 

 Key: HIVE-4196
 URL: https://issues.apache.org/jira/browse/HIVE-4196
 Project: Hive
  Issue Type: New Feature
  Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- 
 apr 29- patch1.docx, HIVE-4196.v1.patch


 Motivation: Allow Hive users to immediately query data streaming in through 
 clients such as Flume.
 Currently Hive partitions must be created after all the data for the 
 partition is available. Thereafter, data in the partitions is considered 
 immutable. 
 This proposal introduces the notion of a streaming partition into which new 
 files an be committed periodically and made available for queries before the 
 partition is closed and converted into a standard partition.
 The admin enables streaming partition on a table using DDL. He provides the 
 following pieces of information:
 - Name of the partition in the table on which streaming is enabled
 - Frequency at which the streaming partition should be closed and converted 
 into a standard partition.
 Tables with streaming partition enabled will be partitioned by one and only 
 one column. It is assumed that this column will contain a timestamp.
 Closing the current streaming partition converts it into a standard 
 partition. Based on the specified frequency, the current streaming partition  
 is closed and a new one created for future writes. This is referred to as 
 'rolling the partition'.
 A streaming partition's life cycle is as follows:
  - A new streaming partition is instantiated for writes
  - Streaming clients request (via webhcat) an HDFS file name into which 
 they can write a chunk of records for a specific table.
  - Streaming clients write a chunk (via webhdfs) to that file and commit 
 it (via webhcat). Committing merely indicates that the chunk has been written 
 completely and is ready for serving queries.
  - When the partition is rolled, all committed chunks are swept into a single 
 directory and a standard partition pointing to that directory is created. The 
 streaming partition is closed and a new streaming partition is created. Rolling 
 the partition is atomic. Streaming clients are agnostic of partition rolling.
  - Hive queries will be able to query the partition that is currently open 
 for streaming. Only committed chunks will be visible. Read consistency will 
 be ensured so that repeated reads of the same partition return the same data 
 for the lifespan of the query.
 Partition rolling requires an active agent/thread running to check when it is 
 time to roll and to trigger the roll. This could be achieved either by using 
 an external agent such as Oozie (preferably) or by an internal agent.
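
 The life cycle described above can be sketched in a few lines of Python. This 
 is only an illustration and is not taken from the proposal or the attached 
 patch: the class StreamingTable, its method names, and the path layout are 
 hypothetical, and the webhcat/webhdfs calls are replaced by in-memory 
 bookkeeping. It also assumes that a chunk requested before a roll but 
 committed after it lands in the next streaming partition, which the 
 description does not specify.

```python
class StreamingTable:
    """Hypothetical in-memory model of a table with a streaming partition."""

    def __init__(self, name):
        self.name = name
        self.standard_partitions = {}  # partition path -> list of chunk files
        self._pending = set()          # file names handed out, not yet committed
        self._committed = []           # chunks visible to queries
        self._chunk_seq = 0
        self._partition_seq = 0

    def request_chunk_file(self):
        """A client asks for a file name to write one chunk into."""
        self._chunk_seq += 1
        path = f"{self.name}/_streaming/chunk-{self._chunk_seq:05d}"
        self._pending.add(path)
        return path

    def commit(self, path):
        """The client signals that the chunk was written completely."""
        if path not in self._pending:
            raise ValueError(f"unknown or already committed chunk: {path}")
        self._pending.remove(path)
        self._committed.append(path)

    def snapshot(self):
        """Committed chunks as seen by a query; the tuple never changes afterwards."""
        return tuple(self._committed)

    def roll(self):
        """Sweep committed chunks into a standard partition and start afresh.

        Assumption: pending (uncommitted) chunks are left alone and will
        belong to the next streaming partition once committed.
        """
        self._partition_seq += 1
        partition = f"{self.name}/partition-{self._partition_seq}"
        self.standard_partitions[partition] = list(self._committed)
        self._committed = []
        return partition


table = StreamingTable("events")
first = table.request_chunk_file()
second = table.request_chunk_file()
table.commit(first)
query_view = table.snapshot()   # a query that starts now sees only `first`
table.commit(second)            # does not affect the running query's view
rolled = table.roll()           # both committed chunks move to a standard partition
```

 Clients only ever call request_chunk_file() and commit(), so in this sketch 
 they stay unaware of rolling, as the description requires. A real 
 implementation would additionally have to make roll() atomic with respect to 
 concurrent commits and queries.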

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4300) ant thriftif generated code that is checkedin is not up-to-date

2013-04-22 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638318#comment-13638318
 ] 

Roshan Naik commented on HIVE-4300:
---

Namit,
 Those files are no longer part of patch v2.
 The files you pointed out were updated by HIVE-4322, which was committed 
first. 


 ant thriftif  generated code that is checkedin is not up-to-date
 

 Key: HIVE-4300
 URL: https://issues.apache.org/jira/browse/HIVE-4300
 Project: Hive
  Issue Type: Bug
  Components: Thrift API
Affects Versions: 0.10.0
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4300.2.patch, HIVE-4300.patch


 Running 'ant thriftif -Dthrift.home=/usr/local' on a freshly checked-out 
 trunk should be a no-op as per the 
 [instructions|https://cwiki.apache.org/Hive/howtocontribute.html#HowToContribute-GeneratingThriftCode].
 However, this is not the case. Some of the files seem to have been relocated, or 
 the classes in them are now in a different file.
 Below is the git status showing the state after the command is run:
 # On branch trunk
 # Changes not staged for commit:
 #   (use "git add/rm <file>..." to update what will be committed)
 #   (use "git checkout -- <file>..." to discard changes in working directory)
 #
 #   modified:   build.properties
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/EnvironmentContext.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Index.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Partition.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/PrincipalPrivilegeSet.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Schema.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SerDeInfo.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedInfo.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 #   modified:   metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 #   deleted:    metastore/src/gen/thrift/gen-php/ThriftHiveMetastore.php
 #   deleted:    metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 #   deleted:    metastore/src/gen/thrift/gen-php/hive_metastore/hive_metastore_constants.php
 #   deleted:    metastore/src/gen/thrift/gen-php/hive_metastore/hive_metastore_types.php
 #   deleted:    metastore/src/gen/thrift/gen-php/hive_metastore_constants.php
 #   deleted:    metastore/src/gen/thrift/gen-php/hive_metastore_types.php
 #   modified:   metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 #   deleted:    ql/src/gen/thrift/gen-php/queryplan/queryplan_types.php
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/test/InnerStruct.java
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/test/ThriftTestObj.java
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/Complex.java
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/IntString.java
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MegaStruct.java
 #   modified:   serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MiniStruct.java
 #   deleted:    serde/src/gen/thrift/gen-php/serde/serde_constants.php
 #   deleted:    serde/src/gen/thrift/gen-php/serde/serde_types.php
 #   deleted:    service/src/gen/thrift/gen-php/hive_service/ThriftHive.php
 #   deleted:    service/src/gen/thrift/gen-php/hive_service/hive_service_types.php
 #   modified:   service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote
 #   modified:   service/src/gen/thrift/gen-py/hive_service/ThriftHive-remote
 #
 # Untracked files:
 #   (use "git add <file>..." to include in what will be committed)
 #
 #   serde/src/gen/thrift/gen-cpp/complex_constants.cpp
 #   serde/src/gen/thrift/gen-cpp/complex_constants.h
 #   serde/src/gen/thrift/gen-cpp/complex_types.cpp
 #   serde/src/gen/thrift/gen-cpp/complex_types.h
 #   serde/src/gen/thrift/gen-cpp/megastruct_constants.cpp
 # 
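
 One way to check the no-op expectation mechanically is to inspect the output 
 of 'git status --porcelain' after running the generator. The sketch below is 
 in Python; the helper names parse_porcelain and working_tree_changes are 
 hypothetical and are not part of the Hive build. It assumes git is on the PATH 
 and that the generator has already been run in the checkout.

```python
import subprocess


def parse_porcelain(output):
    """Group paths from `git status --porcelain` output by two-character status code.

    Each line has the form "XY path", e.g. " M build.properties" for an
    unstaged modification or "?? some/file" for an untracked file.
    """
    changes = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        status, path = line[:2], line[3:]
        changes.setdefault(status, []).append(path)
    return changes


def working_tree_changes(repo_dir):
    """Return the grouped changes for the checkout at repo_dir (empty dict if clean)."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_porcelain(result.stdout)


# Example with a hand-written excerpt resembling the status above.
sample = (
    " M build.properties\n"
    " D metastore/src/gen/thrift/gen-php/ThriftHiveMetastore.php\n"
    "?? serde/src/gen/thrift/gen-cpp/complex_types.h\n"
)
grouped = parse_porcelain(sample)
```

 If the generated code in the repository were up to date, 
 working_tree_changes() would return an empty dict after the generator run; any 
 " M", " D" or "??" entries point at files that differ from what is checked in.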

[jira] [Commented] (HIVE-4300) ant thriftif generated code that is checkedin is not up-to-date

2013-04-22 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638750#comment-13638750
 ] 

Roshan Naik commented on HIVE-4300:
---

just that much. 


[jira] [Commented] (HIVE-4300) ant thriftif generated code that is checkedin is not up-to-date

2013-04-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633119#comment-13633119
 ] 

Roshan Naik commented on HIVE-4300:
---

FYI: HIVE-4322 makes manual changes to auto-generated code. This will be a 
maintenance headache. I have incorporated those changes into this patch as part 
of the rebase in patch v2.


