[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776946#comment-13776946
 ] 

Roshan Naik commented on HIVE-5138:
---

Capturing the API-related comments from [~ashutoshc], noted 
[here|https://issues.apache.org/jira/browse/HIVE-4196?focusedCommentId=13770314&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13770314]
 in HIVE-4196:

{quote}
We should try to eliminate the need for an intermediate staging area while 
rolling on new partitions. Seems like there should not be any gotchas while 
moving data from the streaming dir to the partition dir directly.

We should make the Thrift APIs in the metastore forward compatible. One way to 
do that is to use a struct (which contains all parameters) instead of passing 
in a list of arguments.

We should try to leave the TBLS table untouched in the backend db. That will 
simplify the upgrade story. One way to do that is to have all new columns in a 
new table and then add constraints for this new table.
{quote}
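
To illustrate the struct suggestion quoted above, here is a minimal Java sketch of the request-object pattern. It is not taken from the patch: PartitionRollRequest, MetastoreApiSketch, and the comment field are hypothetical names used only for illustration. The point is that the method signature stays fixed while the request type gains optional fields over time.

```java
import java.util.Optional;

/** Hypothetical request object bundling every parameter of one metastore call. */
final class PartitionRollRequest {
    private final String database;
    private final String table;
    private final String partitionSpec;
    // Hypothetical field added in a "later release"; older callers leave it unset (null).
    private final String comment;

    private PartitionRollRequest(Builder b) {
        this.database = b.database;
        this.table = b.table;
        this.partitionSpec = b.partitionSpec;
        this.comment = b.comment;
    }

    static Builder builder(String database, String table, String partitionSpec) {
        return new Builder(database, table, partitionSpec);
    }

    String database() { return database; }
    String table() { return table; }
    String partitionSpec() { return partitionSpec; }
    Optional<String> comment() { return Optional.ofNullable(comment); }

    static final class Builder {
        private final String database;
        private final String table;
        private final String partitionSpec;
        private String comment;

        private Builder(String database, String table, String partitionSpec) {
            this.database = database;
            this.table = table;
            this.partitionSpec = partitionSpec;
        }

        Builder comment(String comment) { this.comment = comment; return this; }
        PartitionRollRequest build() { return new PartitionRollRequest(this); }
    }
}

final class MetastoreApiSketch {
    /** The signature does not change when PartitionRollRequest gains new optional fields. */
    static String partitionRoll(PartitionRollRequest req) {
        String target = req.database() + "." + req.table() + "/" + req.partitionSpec();
        return req.comment().map(c -> target + " (" + c + ")").orElse(target);
    }

    public static void main(String[] args) {
        // An "old" caller that knows nothing about the comment field.
        System.out.println(partitionRoll(
                PartitionRollRequest.builder("default", "events", "dt=2013-09-24").build()));
        // A "new" caller that sets it; the API entry point is unchanged.
        System.out.println(partitionRoll(
                PartitionRollRequest.builder("default", "events", "dt=2013-09-24")
                        .comment("hourly roll").build()));
    }
}
```

With a plain argument list, adding the extra parameter would have required a new method or a changed signature, which is what breaks older clients.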

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch, HIVE-5138.v1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-24 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776954#comment-13776954
 ] 

Roshan Naik commented on HIVE-5138:
---

bq.  We should try to eliminate the need of intermediate staging area while 
rolling on new partitions. Seems like there should not be any gotchas while 
moving data from streaming dir to partition dir directly.

Thanks. That change is already part of the patch.

bq. We should make thrift apis in metastore forward compatible. One way to do 
that is to use struct (which contains all parameters) instead of passing in 
list of arguments.

Hate it .. but Ok. :-)


bq. We should try to leave TBLS table untouched in backend db. 

Sure. Will move the new columns to a new table.





[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-17 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769784#comment-13769784
 ] 

Eugene Koifman commented on HIVE-5138:
--

OK, makes sense.  It would be useful to add some javadoc about concurrency (or 
rather, why it's not an issue).

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-16 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769209#comment-13769209
 ] 

Roshan Naik commented on HIVE-5138:
---

Thanks [~ekoifman] for the comments:

h5. On Pt 1.
 Thanks. I need to take a closer look at this.


h5. On Pt 2.
 I think you mean 'safe to invoke concurrently' instead of 'atomic', since the 
intermediate states are going to be visible when an operation spans both the 
file system and the metastore. Here is a summary of the reasons why each 
operation is concurrency safe:

 - *streamingStatus* : Read-only metastore operation.
 - *chunkGet* : This is an atomic metastore operation.
 - *chunkAbort* : Just deletes a file, so there are no concurrency issues here.
 - *chunkCommit* : Just renames a file, so only one of several concurrent 
operations will succeed.
 - *disableStreaming* : This is an atomic metastore operation.
 - *enableStreaming* : Does a couple of mkdirs (for setup) followed by an 
atomic metastore operation. mkdirs() is idempotent, so all concurrent calls 
succeed. All concurrent invocations enter a transaction to do the metastore 
update atomically, and only one should update the metastore.

 - *partitionRoll* : Creates an empty dir for the new current partition, then 
atomically updates the metastore as follows:
   -# Make note of this new current partition dir.
   -# Do an addPartition() on the previous current partition.

- If concurrent partitionRoll() invocations use the same arguments, the 
addPartition() step will fail on all but one. If the arguments differ between 
concurrent invocations, they all succeed, and the updates made by the last 
invocation to exit the metastore transaction override the others.
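
The three mechanisms relied on above (idempotent mkdirs, commit by rename, and a uniqueness check in addPartition) can be sketched in Java as follows. This is only an illustration under stated assumptions, not code from the patch: it uses the local file system through java.nio.file as a stand-in for HDFS, a concurrent set as a stand-in for the metastore, and ConcurrencySketch and all of its method names are hypothetical. The calls are made sequentially to keep the example deterministic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ConcurrencySketch {
    // Stand-in for the metastore's partition list (assumption: one entry per partition spec).
    private final Set<String> partitions = ConcurrentHashMap.newKeySet();

    static Path newTempDir() {
        try {
            return Files.createTempDirectory("streaming-sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Idempotent setup: calling this again on an existing directory is not an error. */
    static void ensureDirs(Path dir) {
        try {
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Path createOpenChunk(Path dir, String name) {
        try {
            return Files.createFile(dir.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Commit is a rename. Returns true only for the caller whose rename went through. */
    static boolean commitByRename(Path openChunk, Path committedChunk) {
        try {
            Files.move(openChunk, committedChunk); // no REPLACE_EXISTING option
            return true;
        } catch (IOException e) {
            // The source is already gone or the target already exists: another caller won.
            return false;
        }
    }

    /** Returns false if the partition was already registered by an earlier caller. */
    boolean addPartition(String partitionSpec) {
        return partitions.add(partitionSpec);
    }

    public static void main(String[] args) {
        Path dir = newTempDir().resolve("streaming");
        ensureDirs(dir);
        ensureDirs(dir); // second call succeeds as well

        Path open = createOpenChunk(dir, "chunk-1.open");
        Path committed = dir.resolve("chunk-1");
        System.out.println("first commit:  " + commitByRename(open, committed));
        System.out.println("second commit: " + commitByRename(open, committed));

        ConcurrencySketch metastore = new ConcurrencySketch();
        System.out.println("first addPartition:  " + metastore.addPartition("dt=2013-09-16"));
        System.out.println("second addPartition: " + metastore.addPartition("dt=2013-09-16"));
    }
}
```

HDFS rename and the real metastore transaction have their own semantics; the sketch only shows why a second identical call loses rather than corrupting state.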




 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, WebHCat
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762146#comment-13762146
 ] 

Eugene Koifman commented on HIVE-5138:
--

[~roshan_naik] A couple of comments on this patch:
1. All delegators in WebHCat take the 'user' as determined by Server.java and 
use that to make secure calls to JobTracker, HDFS, etc.  HCatStreamingDelegator 
ignores it.  Why is that?
2. Most operations in HCatStreamingDelegator do multiple things (like modify 
metadata, create some HDFS file, etc.).  It sounds like every one of these 
operations should be atomic.  For example, say for some reason 2 identical 
calls to partitionRoll() happen at the same time.  How is this atomicity 
achieved?

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-5138) Streaming - Web HCat API

2013-09-06 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760863#comment-13760863
 ] 

Roshan Naik commented on HIVE-5138:
---

Patch v2 addresses the review comments from 
https://issues.apache.org/jira/browse/HIVE-4196?focusedCommentId=13714235&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13714235

 Streaming - Web HCat  API
 -

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: HIVE-4196.v2.patch






[jira] [Commented] (HIVE-5138) Streaming- Web HCat API

2013-08-22 Thread Roshan Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748165#comment-13748165
 ] 

Roshan Naik commented on HIVE-5138:
---

Implement a WebHCat API to: 


1) Enable and Disable streaming on a table

2) Check streaming status

3) Transaction Support:
 -  Get a Chunk File
 -  Commit a Chunk File
 -  Abort the chunk

4) Roll Partition: To roll the committed chunks from streaming partition to a 
new standard partition
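
As a rough picture of how these four groups of operations fit together, here is a small in-memory Java model. It is a sketch under assumptions, not the WebHCat implementation: the operation names follow those used elsewhere in this thread (enableStreaming, chunkGet, chunkCommit, chunkAbort, partitionRoll), while StreamingTableSketch, the chunk naming scheme, and the exact return values are hypothetical. The model is single-threaded and touches neither HDFS nor the metastore.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of one streaming-enabled table. */
final class StreamingTableSketch {
    private boolean enabled;
    private int nextChunkId = 1;
    private final Set<String> openChunks = new HashSet<>();
    private final List<String> committedChunks = new ArrayList<>();
    private final Map<String, List<String>> partitions = new HashMap<>();

    // 1) Enable and disable streaming on the table.
    void enableStreaming() { enabled = true; }
    void disableStreaming() { enabled = false; }

    // 2) Check streaming status.
    boolean streamingStatus() { return enabled; }

    // 3) Transaction support: get, commit, or abort a chunk.
    String chunkGet() {
        if (!enabled) {
            throw new IllegalStateException("streaming is not enabled");
        }
        String chunk = "chunk-" + nextChunkId++;
        openChunks.add(chunk);
        return chunk;
    }

    /** Returns false if the chunk is unknown, already committed, or aborted. */
    boolean chunkCommit(String chunk) {
        if (!openChunks.remove(chunk)) {
            return false;
        }
        committedChunks.add(chunk);
        return true;
    }

    /** Returns false if the chunk was not open. */
    boolean chunkAbort(String chunk) {
        return openChunks.remove(chunk);
    }

    // 4) Roll the committed chunks into a new standard partition.
    List<String> partitionRoll(String newPartition) {
        if (partitions.containsKey(newPartition)) {
            throw new IllegalArgumentException("partition already exists: " + newPartition);
        }
        List<String> rolled = new ArrayList<>(committedChunks);
        partitions.put(newPartition, rolled);
        committedChunks.clear();
        return rolled;
    }

    public static void main(String[] args) {
        StreamingTableSketch table = new StreamingTableSketch();
        table.enableStreaming();
        String first = table.chunkGet();
        String second = table.chunkGet();
        table.chunkCommit(first);  // becomes part of the next roll
        table.chunkAbort(second);  // discarded, never rolled
        System.out.println(table.partitionRoll("dt=2013-08-22"));
    }
}
```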

 Streaming- Web HCat  API
 

 Key: HIVE-5138
 URL: https://issues.apache.org/jira/browse/HIVE-5138
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore
Reporter: Roshan Naik
Assignee: Roshan Naik


