[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-16 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741980#comment-13741980
 ] 

Jaideep Dhok commented on HIVE-4569:


I have put up a patch with the work done so far. In this patch, ExecuteStatement 
and ExecuteStatementAsync are two separate calls.

The patch also includes GetQueryPlan.

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch, 
 HIVE-4569.D12333.1.patch


 It would be nice to have GetQueryPlan as a Thrift API. I do not see a 
 GetQueryPlan API in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 lists it; I am not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-16 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742335#comment-13742335
 ] 

Henry Robinson commented on HIVE-4569:
--

As an alternative suggestion, what about considering a 
{{WaitUntilComplete(TOperationStatus)}} call? The benefit is that there would 
immediately be a way to block on the result of every operation (rather than 
adding {{*Async}} APIs to the interface and doubling its size). Then 
{{executeStatement}} doesn't need to change its documented semantics, and Hive 
can be compatible right away by making {{WaitUntilComplete}} a no-op until 
asynchronous support is completely ready.
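
The idea of one blocking call that works for every operation can be sketched in Java as below. This is only an illustration under stated assumptions: OperationSketch and its methods are hypothetical names, not part of the HS2 Thrift API, and the wait is modeled as a method on the handle rather than as a call taking a TOperationStatus. When the server is still synchronous the wait returns at once, which is the "no-op" case described above.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for an HS2 operation handle; not the real Thrift type.
final class OperationSketch {
    private final CompletableFuture<String> result;

    private OperationSketch(CompletableFuture<String> result) {
        this.result = result;
    }

    // Synchronous server: the work is already done when the handle is returned.
    static OperationSketch completed(String value) {
        return new OperationSketch(CompletableFuture.completedFuture(value));
    }

    // Asynchronous server: the work continues on another thread.
    static OperationSketch running(CompletableFuture<String> pending) {
        return new OperationSketch(pending);
    }

    // Blocks until the operation finishes; returns at once if it already has.
    String waitUntilComplete() {
        return result.join();
    }

    boolean isDone() {
        return result.isDone();
    }
}
```

With this shape a client can always call waitUntilComplete() after an execute call, regardless of whether the server ran the statement synchronously or in the background.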

I also agree that it might be worth splitting this discussion into a separate 
JIRA.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-16 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742735#comment-13742735
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~ashutoshc][~henryr][~jaid...@research.iiit.ac.in] [~thejas] I definitely 
think execute async is quite ready, and it would be a good idea to get that in 
while we discuss the concerns about GetQueryPlan/TaskStatus. Without splitting, 
it might be hard to focus on each. While reviewing this patch, I was 
trying to group the changes into two sets. I have a document that 
summarizes the changes in each group (1. ExecuteAsync 2. GetQueryPlan + 
TaskStatus). I can upload it if you find it useful (if we decide to 
split, we can use it to see what we want in each JIRA).



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740760#comment-13740760
 ] 

Thejas M Nair commented on HIVE-4569:
-

bq. I will put up a new request, and keep updating it if there are further 
comments? 
Sounds good. Looking forward to it. And thanks for working on this!

Regarding [~vaibhavgumashta]'s comment about GetQueryPlan backward 
compatibility: we need to examine what guarantees can be given about the 
backward compatibility of the JSON string query plan. Is the Thrift JSON 
structure stable when consumed by generic JSON parsers? I think we should at 
least state that the operator types and stage types can change across versions.




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740787#comment-13740787
 ] 

Thejas M Nair commented on HIVE-4569:
-

Here are some comments about the async execution api changes from a jdbc/odbc 
driver implementation perspective -

h3. jdbc/odbc requirements:

I think the asynchronous execution API is going to be very useful for JDBC/ODBC 
as well. A long-running query has a higher chance of seeing an interruption in 
the network connection to HS2. This is especially true for HS2 over HTTP 
(HIVE-4763), where the connection might pass through HTTP proxy servers.

The downside of the async call is that the JDBC/ODBC client moves to a pull 
model instead of what was effectively a push model. It will have to poll, 
sleeping between poll requests to avoid putting too much load on the server. 
But this sleep can delay the notification that execution has finished. 
So it will be useful to support long polling in this case to simulate a 
push (http://en.wikipedia.org/wiki/Push_technology#Long_polling).

For a client to tell the server that it wants to do a long poll, the HS2 API 
needs to support it.

Another difference between the JDBC/ODBC requirement and the GetOperationStatus 
API is that JDBC/ODBC won't make use of the status of each task. Only the 
completion of the query execution matters from the JDBC/ODBC perspective.
So for ODBC/JDBC the long poll should return before a 'long poll timeout' only 
if the query has completed.
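
The difference between a plain status check and a long poll can be sketched in Java as below. This is a minimal illustration, not HS2 code: LongPollSketch, markFinished, isFinished and awaitFinished are hypothetical names, and a CountDownLatch stands in for whatever completion signal the server would really use.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical server-side view of one running query; names are illustrative only.
final class LongPollSketch {
    private final CountDownLatch finished = new CountDownLatch(1);

    // Called by the execution thread when the query completes.
    void markFinished() {
        finished.countDown();
    }

    // Plain status check: returns immediately, so the client must sleep between calls.
    boolean isFinished() {
        return finished.getCount() == 0;
    }

    // Long poll: holds the request open until the query finishes or the timeout
    // elapses, whichever comes first. Returns true only if the query finished.
    boolean awaitFinished(long timeoutMillis) {
        try {
            return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the current state.
            Thread.currentThread().interrupt();
            return isFinished();
        }
    }
}
```

A client looping on awaitFinished learns about completion as soon as it happens, while a client looping on isFinished with a sleep learns about it only after the current sleep ends.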

h3. Question about api:
While the actual implementation of long poll can go in a separate follow-up 
JIRA, I thought it would be useful to discuss whether it should affect 
the async API changes.
How should we meet this ODBC/JDBC need? If we follow the pattern we 
used for async execute, this would result in a new 
GetOperationStatusLongPoll call.

It doesn't look like this requirement will affect the changes planned in 
this JIRA, but I wanted to put my thoughts out in case there are other 
opinions.





[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740809#comment-13740809
 ] 

Carl Steinbach commented on HIVE-4569:
--

[~jaideepdhok] If I call GetQueryPlan for a statement x, and then subsequently 
call ExecuteStatement on the same statement, is it guaranteed that 
ExecuteStatement will always use the same plan that was returned earlier by 
GetQueryPlan? The names of the functions seem to imply this, but the comments 
in TCLIService.thrift don't stipulate that ExecuteStatement will use the plan 
generated by the previous GetQueryPlan call instead of recompiling the 
statement and possibly creating a different plan. Adding a PrepareStatement 
call (e.g. PrepareStatement[, GetQueryPlan], ExecuteStatement) is one way of 
resolving this ambiguity, and at the same time it will help to maintain the 
close alignment between the HS2 API and ODBC/JDBC.
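
The PrepareStatement[, GetQueryPlan], ExecuteStatement sequence can be sketched in Java as below. Everything here is hypothetical (the class, the integer handle, the "PLAN(...)" string); it only shows how a prepare step removes the ambiguity by compiling once and having both later calls refer to the stored plan.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical session that compiles once and reuses the plan; not the HS2 API.
final class PreparedSessionSketch {
    private final Map<Integer, String> plans = new HashMap<>();
    private int nextHandle = 0;
    private int compileCount = 0;

    // Compiles the statement once and returns a handle to the stored plan.
    int prepareStatement(String sql) {
        compileCount++;
        int handle = nextHandle++;
        plans.put(handle, "PLAN(" + sql.trim() + ")");
        return handle;
    }

    // Returns the stored plan without recompiling.
    String getQueryPlan(int handle) {
        return lookup(handle);
    }

    // Executes the stored plan, so it is the same plan the caller inspected.
    String executeStatement(int handle) {
        return "executed " + lookup(handle);
    }

    int compileCount() {
        return compileCount;
    }

    private String lookup(int handle) {
        String plan = plans.get(handle);
        if (plan == null) {
            throw new IllegalArgumentException("unknown handle: " + handle);
        }
        return plan;
    }
}
```

Because both calls take the handle rather than the statement text, there is no point at which the statement could be recompiled into a different plan.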



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740808#comment-13740808
 ] 

Carl Steinbach commented on HIVE-4569:
--

bq. I think we should at least state that the operator types and stage types 
can change across versions.

Good luck with that. As soon as you have a couple third-party applications that 
depend on this serialization format you will be locked in regardless of how 
many warnings you place in the code.




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740811#comment-13740811
 ] 

Carl Steinbach commented on HIVE-4569:
--

bq. Thejas M Nair I think making executeStatement async by default may break 
users' expectations since it's a blocking call. Carl Steinbach Had suggested 
earlier to create two separate calls executeStatement and executeStatementAsync 
so that the API is easier to understand. I agree with that approach. If we have 
two different calls, then users can pick one based on their need.

It's possible to overload ExecuteStatement to support both synchronous and 
asynchronous modes without breaking backward compatibility by adding an 
optional boolean isAsync flag to the request message and setting the default 
value to false. Whether or not this makes more sense than the current approach 
hinges largely on how many more optional variables we expect to add to the 
ExecuteStatement[Async] request messages in the future. If we have two 
functions then we'll need to make the same changes in two different places.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741259#comment-13741259
 ] 

Thejas M Nair commented on HIVE-4569:
-

bq. Good luck with that. As soon as you have a couple third-party applications 
that depend on this serialization format you will be locked in regardless of 
how many warnings you place in the code.
Yes, I agree that risk is very real. Do we want to put these commitments on the 
still-young Hive? Trying to keep this API backward compatible can be a big 
burden for Hive. Should we go for something more minimal instead, such as just 
a compile() function instead of getQueryPlan(), like what was put forward in 
HIVE-4321?

bq. It's possible to overload ExecuteStatement to support both synchronous and 
asynchronous modes without breaking backward compatibility by adding an 
optional boolean isAsync flag to the request message and setting the default 
value to false.
I am OK with having different functions for this. But I think function 
overloading is a more natural way of doing this. Deciding whether the call is 
async based on a parameter seems a more natural way of programming 
than using different functions. We can either have one function 
with a default value or have two with the same name, i.e., instead of 
ExecuteStatementAsync, I think an ExecuteStatement with an additional 
isAsync parameter is cleaner.




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741298#comment-13741298
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~cwsteinbach] [~thejas] With respect to overloading ExecuteStatement, I think 
the previous patch by [~jaid...@research.iiit.ac.in] was probably doing that. 
But there was a suggestion in the rb that overloading ExecuteStatement in the 
thrift API may not correspond to overloading in CLIService/ICLIService. Are you 
suggesting that the thrift api has just ExecuteStatement, and based on whether 
the async flag is set to true/false in the corresponding TExecuteStatementReq, 
we branch off to using ICLIService#executeStatementAsync or 
ICLIService#executeStatement?



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741324#comment-13741324
 ] 

Thejas M Nair commented on HIVE-4569:
-

bq. Whether or not this makes more sense than the current approach hinges 
largely on how many more optional variables we expect to add to the 
ExecuteStatement[Async] request messages in the future.
I think it is reasonable to expect the unexpected, i.e., to expect more 
optional parameters to come up in the future.

This is what I am thinking. [~cwsteinbach] Please let me know if you think this 
is reasonable. 
1. In TCLIService.thrift, as in the original patch, add an optional bool 
runAsync to TExecuteStatementReq.
2. In ICLIService (and its implementation CLIService), introduce an executeAsync 
function that gets called if runAsync==true.
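
The two steps above can be sketched in Java as follows. This is only a sketch under assumptions: ExecuteRequestSketch and CliServiceSketch are hypothetical stand-ins for TExecuteStatementReq and ICLIService/CLIService, and the flag defaults to false so that callers unaware of it keep the blocking behavior.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical request mirroring an optional flag that defaults to false.
final class ExecuteRequestSketch {
    final String statement;
    final boolean runAsync;

    // Older callers that do not know about the flag get the blocking behavior.
    ExecuteRequestSketch(String statement) {
        this(statement, false);
    }

    ExecuteRequestSketch(String statement, boolean runAsync) {
        this.statement = statement;
        this.runAsync = runAsync;
    }
}

// Hypothetical service with separate sync and async paths behind one entry point.
final class CliServiceSketch {
    // Blocking path: returns only after the statement has run.
    String executeStatement(String statement) {
        return "done: " + statement;
    }

    // Async path: returns immediately; the work runs on a background thread.
    CompletableFuture<String> executeAsync(String statement) {
        return CompletableFuture.supplyAsync(() -> executeStatement(statement));
    }

    // Single entry point, as a Thrift handler might be: branch on the flag.
    CompletableFuture<String> handle(ExecuteRequestSketch req) {
        if (req.runAsync) {
            return executeAsync(req.statement);
        }
        return CompletableFuture.completedFuture(executeStatement(req.statement));
    }
}
```

The wire-level API keeps a single call, while the service layer still has two clearly named methods, which is the split described in steps 1 and 2.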




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741335#comment-13741335
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~thejas] The only addition I would make is to set runAsync to false by default 
in TExecuteStatementReq.





[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741823#comment-13741823
 ] 

Jaideep Dhok commented on HIVE-4569:


bq. [~vgumashta] What could be the use case for returning the query plan? And 
how will it be consumed by the client? Making it public means that any change 
to the query plan in future will break the consumer code.
It was outlined in the HS2 spec, but not implemented. Having a query plan is 
useful for tracking query progress. We have another use case where we want to 
access the query plan through code, but currently there's no way to do that.

If you want to guard against changes to the query plan code, then the plan 
object needs to be declared at the Thrift layer, and the implementation has to 
convert between the internal query plan (ql layer) and the Thrift query plan, 
as is done for data types and operation states.
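
A minimal Java sketch of that conversion layer is shown below. Both classes are hypothetical: InternalPlanSketch stands in for the ql-layer plan and WirePlanSketch for a plan type that would be declared at the Thrift layer. Only the internal-to-wire direction is shown; the point is that clients depend on the wire type alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical internal (ql-layer) plan; free to change between releases.
final class InternalPlanSketch {
    final List<String> stageNames;
    final long estimatedRows; // internal detail that is not exposed to clients

    InternalPlanSketch(List<String> stageNames, long estimatedRows) {
        this.stageNames = new ArrayList<>(stageNames);
        this.estimatedRows = estimatedRows;
    }
}

// Hypothetical wire-level plan: the only shape that clients would see.
final class WirePlanSketch {
    final List<String> stages;

    private WirePlanSketch(List<String> stages) {
        this.stages = Collections.unmodifiableList(new ArrayList<>(stages));
    }

    // Conversion point: internal changes are absorbed here, not by clients.
    static WirePlanSketch fromInternal(InternalPlanSketch plan) {
        return new WirePlanSketch(plan.stageNames);
    }
}
```

If the internal plan later gains or renames fields, only fromInternal has to change, as long as the wire type itself stays the same.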

bq. [~thejas] Is the thrift json structure stable if used by generic json 
parsers ?  I think we should at least state that the operator types and stage 
types can change across versions.

You need the Thrift JSON parsers to encode/decode the JSON query plan into the 
corresponding Java object.

bq. [~cwsteinbach] If I call GetQueryPlan for a statement x, and then 
subsequently call ExecuteStatement on the same statement, is it guaranteed that 
ExecuteStatement will always use the same plan that was returned earlier by 
GetQueryPlan? 
Yes, unless configuration was altered between the two calls through SET 
operations, or the conf overlay is different.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741828#comment-13741828
 ] 

Jaideep Dhok commented on HIVE-4569:


[~thejas] I think having two calls with different names, ExecuteStatement and 
ExecuteStatementAsync, will be less confusing for the user.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741952#comment-13741952
 ] 

Ashutosh Chauhan commented on HIVE-4569:


There seems to be a general consensus that async execute statement is a good 
idea. So let's unblock it and get that part of the patch in. In the meantime, 
we can continue to discuss how to add getQueryPlan. [~jaideepdhok] I 
understand we have gone back and forth on doing these two issues in one patch 
vs. multiple, but splitting looks like a good way to make progress. If you 
agree, can you put up a patch containing the async execute statement part?



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739276#comment-13739276
 ] 

Jaideep Dhok commented on HIVE-4569:


[~vgumashta] Initially it was split into three JIRAs, but other people 
suggested that it would be easier to track progress in a single JIRA.

I've completed most of the changes and have updated the patch based on the last 
review by [~cwsteinbach].



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739279#comment-13739279
 ] 

Jaideep Dhok commented on HIVE-4569:


Sorry for the duplicate review request. Please refer to the last one.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739334#comment-13739334
 ] 

Thejas M Nair commented on HIVE-4569:
-

[~jaideepdhok] The patch at the Phabricator link looks incomplete; for example, 
it is missing service/if/TCLIService.thrift. Can you update the patch in the 
Phabricator link that has the original review comments 
(https://reviews.facebook.net/D11469)? That way it is easier to track changes 
across patches.
Having a new Phabricator link for each patch iteration makes it difficult to 
follow the changes between patches.





[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739363#comment-13739363
 ] 

Thejas M Nair commented on HIVE-4569:
-

[~jaideepdhok] [~cwsteinbach] Should we keep the api simple (small) by just 
making the current execute function asynchronous instead of adding an 
additional execute function in the api ? I think [~henryr] has a good point 
that it was always documented to be asynchronous (it just happened that the 
call always returned so late that the operation had already finished :) ).

Also, I think it makes sense to make the GetResultSetMetadata and FetchResults 
APIs block until the operation finishes, instead of throwing an error if the status 
is not FINISHED. This will also help prevent breakage of any user code that 
was written with the assumption that execute is blocking.
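
A minimal sketch of the blocking behaviour suggested above, written in Python rather than the Java of the actual patch. The Operation class and its method names are hypothetical and are not HiveServer2 code; the sketch only shows that a fetch call can wait for the FINISHED state instead of raising an error.

```python
import threading


class Operation:
    """Hypothetical stand-in for a server-side operation (not HS2 code)."""

    def __init__(self):
        self._finished = threading.Event()
        self._rows = []

    def run(self, rows):
        # Simulates the query executing in the background.
        self._rows = list(rows)
        self._finished.set()  # the state is now FINISHED

    def fetch_results(self, timeout=None):
        # Block until the operation finishes instead of throwing an
        # error when the state is not yet FINISHED.
        if not self._finished.wait(timeout):
            raise TimeoutError("operation did not finish in time")
        return self._rows


op = Operation()
worker = threading.Thread(target=op.run, args=([("a", 1), ("b", 2)],))
worker.start()
rows = op.fetch_results(timeout=5)  # blocks until run() has completed
worker.join()
```

Client code that assumed a blocking execute would keep working under this scheme, because the wait simply moves from the execute call to the fetch call.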




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739385#comment-13739385
 ] 

Jaideep Dhok commented on HIVE-4569:


bq. Having a new phabricator link for each patch iteration makes it difficult 
to follow the changes between patches.
[~thejas]
It looks like the changes got split into two requests.
Unfortunately I am unable to update the previous revision, as I lost the 
previous arc commit. I will put up a new request and keep updating it if 
there are further comments.

bq.  Should we keep the api simple (small) by just making the current execute 
function asynchronous instead of adding an additional execute function in the 
api ?

[~thejas] I think making executeStatement async by default may break users' 
expectations, since it is currently a blocking call. [~cwsteinbach] had suggested 
earlier creating two separate calls, executeStatement and executeStatementAsync, 
so that the API is easier to understand. I agree with that approach. If we have two 
different calls, then users can pick one based on their need.

To get the result set in the async case, the flow would be: 
ExecuteStatementAsync, GetOperationStatus (until the query completes), then fetch 
the result set. 
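
The async flow described above could look roughly like this from the client side. This is a Python sketch under stated assumptions: FakeClient is a hypothetical in-memory stand-in, not the real HS2 Thrift client, and its method names only echo the RPC names discussed in this thread.

```python
import time


class FakeClient:
    """Hypothetical in-memory client; not the real HS2 Thrift client."""

    def __init__(self, polls_until_done=3):
        self._remaining = polls_until_done

    def execute_statement_async(self, statement):
        return "op-1"  # operation handle, returned immediately

    def get_operation_status(self, handle):
        self._remaining -= 1
        return "FINISHED" if self._remaining <= 0 else "RUNNING"

    def fetch_results(self, handle):
        return [("row", 1)]


def run_async(client, statement, poll_interval=0.01):
    # 1. Submit the statement without waiting for it to complete.
    handle = client.execute_statement_async(statement)
    # 2. Poll the operation status until the query completes.
    polls = 0
    while client.get_operation_status(handle) != "FINISHED":
        polls += 1
        time.sleep(poll_interval)
    # 3. Fetch the result set.
    return client.fetch_results(handle), polls


results, polls = run_async(FakeClient(), "SELECT 1")
```

A real client would also need to stop polling on error or cancelled states; the sketch leaves that out to keep the three steps visible.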



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739420#comment-13739420
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~thejas] I think you mean that by making the GetResultSetMetadata and FetchResults 
APIs blocking, we can change executeStatement to async by default without 
breaking any user code? 



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739435#comment-13739435
 ] 

Amareshwari Sriramadasu commented on HIVE-4569:
---

I think it makes sense to have two APIs: JDBC drivers can call the synchronous 
one, and other users interested in async execution can call the async API. The 
documentation of execute() will then have to be changed to say that it executes 
synchronously.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-14 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13740420#comment-13740420
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~jaid...@research.iiit.ac.in] [~amareshwari] I have a backward-compatibility 
concern about the GetQueryPlan API that we are exposing. What 
is the use case for returning the query plan, and how will it be consumed 
by the client? Making it public means that any future change to the query plan 
will break consumer code. 



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-13 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738575#comment-13738575
 ] 

Vaibhav Gumashta commented on HIVE-4569:


It seems that this JIRA is handling two different use cases: 

1. Implement ExecuteStatement asynchronously (and the related 
GetOperationStatus api)
2. Implement GetQueryPlan api.

These are fairly independent features. How about we split this into two 
JIRAs so that each gets an independent, focussed discussion?

Also, I can volunteer to continue the work.
Thanks. 
  

 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
 HIVE-4569.D11469.1.patch


 It would be nice to have GetQueryPlan as a Thrift API. I do not see GetQueryPlan 
 api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 contains, not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-12 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736885#comment-13736885
 ] 

Henry Robinson commented on HIVE-4569:
--

Although {{executeStatement}} is implemented synchronously in Hive, was it 
meant to be synchronous from the outset? The comment in the Thrift definition 
suggests otherwise:

{code}
// ExecuteStatement()
//
// Execute a statement.
// The returned OperationHandle can be used to check on the
// status of the statement, and to fetch results once the
// statement has finished executing.
{code}





[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-07-03 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699773#comment-13699773
 ] 

Jaideep Dhok commented on HIVE-4569:


{quote}
Thrift makes it easy to add additional optional parameters without breaking 
backward compatibility, but not Java. I'd recommend creating a new 
executeStatementAsync call to ICLIService (and here) instead of modifying the 
method signature. Also, that probably indicates that we should add a new 
complementary RPC to the HS2 Thrift IDL instead of adding an optional 
parameter to ExecuteStatement just to keep these things in sync.
{quote}

Do we need explicit new request and response objects for both executeStatement 
and executeStatementAsync calls? I think the same request call should do?

Also, I found in the code that conf overlay is not actually being applied 
before executing an operation. I suppose there should be another JIRA for that.





[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-07-03 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699775#comment-13699775
 ] 

Jaideep Dhok commented on HIVE-4569:


bq. I think the same request call should do?
Sorry, I meant the same request and response objects used in ExecuteStatement 
at the moment.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-28 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695920#comment-13695920
 ] 

Phabricator commented on HIVE-4569:
---

cwsteinbach has commented on the revision HIVE-4569 [jira] GetQueryPlan api in 
Hive Server2.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java:1 Missing ASF license 
header.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java:1019 This looks 
like a debug statement. Should it be removed?
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java:95 Can you add some 
comments here explaining what each one of these states actually means? Also, do 
we need an UNKNOWN state? I included one in the Thrift IDL OperationState, but 
in retrospect that was probably a mistake.
  service/if/TCLIService.thrift:34 As discussed earlier we shouldn't add this 
dependency to the HS2 API. Please remove it and return the Task information in 
JSON or XML.
  service/if/TCLIService.thrift:41 We need to bump the version number since 
this patch extends the HS2 API with new functionality. Can you also please add 
a comment here briefly summarizing what was added in the new version?
  service/if/TCLIService.thrift:594 Thrift allows you to specify default values 
for optional fields. I think we should set this value to 'false' by default.
  service/if/TCLIService.thrift:866 Just want to double-check that TTaskState 
and TTaskStatus will be removed since the plan state will be serialized as JSON 
or XML, right?
  service/if/TCLIService.thrift:1003 Where is TGetQueryPlanReq? The comments at 
the top stipulate that every RPC has its own req/resp message pair.
  service/if/TCLIService.thrift:1006 Just double-checking that this will be 
changed to a string.
  service/if/TCLIService.thrift:1043 Please don't overload TExecuteStatementReq.
  service/src/java/org/apache/hive/service/cli/CLIService.java:149 Thrift makes 
it easy to add additional optional parameters without breaking backward 
compatibility, but not Java. I'd recommend creating a new executeStatementAsync 
call to ICLIService (and here) instead of modifying the method signature. Also, 
that probably indicates that we should add a new complementary RPC to the HS2 
Thrift IDL instead of adding an optional parameter to ExecuteStatement 
just to keep these things in sync.
  service/src/java/org/apache/hive/service/cli/CLIService.java:318 
s/:getQueryPlan/: getQueryPlan/
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java:367 I don't think this 
method is thread-safe. I recommend replacing the four boolean state variables 
(started, initialized, isdone, queued, wth??) with the single TaskState enum 
you added and make sure that all access to this state variable is synchronized.
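
The last recommendation above (one state enum guarded by synchronization, instead of several independent booleans) can be sketched as follows. This is a Python illustration, not the Java code under review: the Task class here is hypothetical, the state names are borrowed from the TTaskState enum proposed in this issue, and in Java the lock would typically be expressed with synchronized methods.

```python
import enum
import threading


class TaskState(enum.Enum):
    INITIALIZED = 1
    QUEUED = 2
    RUNNING = 3
    FINISHED = 4


class Task:
    """Hypothetical task holding one state variable behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = TaskState.INITIALIZED

    def set_state(self, new_state):
        # Every write goes through the same lock.
        with self._lock:
            self._state = new_state

    def get_state(self):
        # Every read goes through the same lock as well.
        with self._lock:
            return self._state


task = Task()
threads = [
    threading.Thread(target=task.set_state, args=(TaskState.RUNNING,))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single variable, a reader can never observe a contradictory combination such as "queued" and "done" both being true, which separate boolean flags allow.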

REVISION DETAIL
  https://reviews.facebook.net/D11469

To: JIRA, jaideepdhok
Cc: cwsteinbach




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-28 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695923#comment-13695923
 ] 

Carl Steinbach commented on HIVE-4569:
--

[~jaideepdhok] I made it through half the patch and left comments on 
phabricator. I'll aim to get through the rest sometime this weekend. Sorry for 
the delay. Also, I just wanted to say thanks for tackling this problem. Support 
for async execution was a big hole in the API and I'm excited that it's going 
to be fixed soon.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-27 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694605#comment-13694605
 ] 

Jaideep Dhok commented on HIVE-4569:


[~cwsteinbach] Do you have any comments on the rest of the patch?



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691935#comment-13691935
 ] 

Carl Steinbach commented on HIVE-4569:
--

@Jaideep: Thanks for posting an updated patch. I plan to spend some more time 
tonight looking this over closely, but in the meantime I wanted to raise one 
high-level concern. I think the HS2 Thrift API should be as self-contained as 
possible. In particular I don't think it's a good idea to inherit functionality 
from any of the quasi-public Thrift APIs that already exist (e.g. 
queryplan.thrift) for the following reasons:

* We version the HS2 Thrift API in order to maintain backward compatibility 
with older clients, and I'm worried that people will forget to bump the version 
number in TCLIService.thrift when they make a change in queryplan.thrift.
* One of the original design goals of HS2 was to decouple the network 
serialization layer from the service layer in the interest of eventually being 
able to easily support multiple different serialization formats (e.g. 
Protobufs, Avro, Thrift, etc). I think depending on queryplan.thrift will make 
it harder to do this.
* At the moment TCLIService.thrift doesn't expose anything that ties it 
directly to Hive, and I'd like to keep it that way. For example, there's no 
reason why we couldn't also embed the Pig language runtime in HS2 and expose it 
through the HS2 API (see the [AccessServer 
proposal|https://cwiki.apache.org/confluence/display/Hive/AccessServer+Design+Proposal]
 for more details). Tying the new QueryPlan RPC to queryplan.thrift will make 
this harder to do.

Instead of depending on queryplan.thrift I'd like to propose that 
TGetQueryPlanResp return a JSON or XML encoded version of the queryplan.
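
As a rough illustration of that proposal, the plan could be flattened to a JSON string before it is placed in the response. This is a Python sketch only; the field names (stages, stageId, tasks, taskId) and the sample values are hypothetical and do not reflect the actual query plan layout.

```python
import json


def encode_query_plan(stages):
    # Serialize to a plain string so the response needs only a string
    # field and no types imported from queryplan.thrift.
    return json.dumps({"stages": stages}, sort_keys=True)


# Hypothetical plan contents, for illustration only.
plan = [
    {"stageId": "Stage-1", "tasks": [{"taskId": "Stage-1_MAPRED"}]},
    {"stageId": "Stage-0", "tasks": [{"taskId": "Stage-0_FETCH"}]},
]
encoded = encode_query_plan(plan)
decoded = json.loads(encoded)  # what a client would do on receipt
```

Because the response carries an opaque string, the plan encoding can change without touching the Thrift IDL, although clients that parse the string still depend on its layout.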




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-24 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692731#comment-13692731
 ] 

Jaideep Dhok commented on HIVE-4569:


[~cwsteinbach] Thanks for the reply. I was not aware of AccessServer. If we 
don't need a dependency on queryplan.thrift, then I guess it would make sense 
to use XML encoding, since there is already code to serialize/deserialize query 
plan to/from XML.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692748#comment-13692748
 ] 

Carl Steinbach commented on HIVE-4569:
--

[~jaideepdhok] Thrift also supports two different types of JSON serialization: 
TJSONProtocol and TSimpleJSONProtocol. I have no preference either way, but I've 
noticed that JSON seems to be more popular than XML these days.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13681858#comment-13681858
 ] 

Thejas M Nair commented on HIVE-4569:
-

bq. Right now I have done this by passing a boolean flag while calling 
executeStatement
[~jaideepdhok] I assume this is going to be an optional field, and that adding 
optional fields to a Thrift argument keeps the API backward compatible with 
old clients that don't set this field. Can you please confirm?
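
The compatibility argument can be illustrated with a small Python sketch. It does not use Thrift itself: the request is modelled as a plain dictionary, and the field name runAsync is an assumption made for the example, not the name used in the patch.

```python
def parse_execute_request(fields):
    """Hypothetical server-side decoding of an ExecuteStatement request."""
    return {
        "statement": fields["statement"],
        # Old clients never send this field, so fall back to blocking.
        "run_async": fields.get("runAsync", False),
    }


old_client = parse_execute_request({"statement": "SELECT 1"})
new_client = parse_execute_request({"statement": "SELECT 1", "runAsync": True})
```

An old client that omits the field gets the existing blocking behaviour, which is what makes an optional field with a false default backward compatible.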


 GetQueryPlan api in Hive Server2
 

 Key: HIVE-4569
 URL: https://issues.apache.org/jira/browse/HIVE-4569
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok
 Attachments: git-4569.patch, HIVE-4569.D10887.1.patch


 It would be nice to have GetQueryPlan as a Thrift API. I do not see GetQueryPlan 
 api available in HiveServer2, though the wiki 
 https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
 contains, not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-12 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13681905#comment-13681905
 ] 

Jaideep Dhok commented on HIVE-4569:


[~thejas] It's an optional field in the Thrift request object for execute 
statement.



[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-06-05 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675698#comment-13675698
 ] 

Jaideep Dhok commented on HIVE-4569:


Update on the work done so far -

# h5. Added getQueryPlan API with Thrift
# h5. Added support for non-blocking queries.
## Right now I have done this by passing a boolean flag while calling 
executeStatement
## If the flag is set to true, query runs in non-blocking mode. The flag 
defaults to false.
## I've implemented this by adding a fixed size thread pool in the 
OperationManager, for running non-blocking operations. A reference to the 
future is kept in the operation, so that it can be cancelled.
## Once the query is running in the background, users can poll status using 
GetOperationStatus.
## Users can cancel the query by calling CancelOperation
# h5. Additions in GetOperationStatus
## OperationManager calls operation.getTaskStatuses(). Each operation can 
override this method to customize reporting.
## SQLOperation returns the task statuses by calling getTaskStatuses() on the 
current driver.
## Driver reports task statuses by iterating through all tasks in the plan
## Changes in HS2 thrift API -
{code}
// GetOperationStatus()
//
// Get the status of an operation running on the server.
struct TGetOperationStatusReq {
  // Session to run this request against
  1: required TOperationHandle operationHandle
}

// State of a sub task in an operation
enum TTaskState {
  // The task has been initialized
  INITIALIZED_STATE,

  // Driver is currently running the task
  RUNNING_STATE,

  // Task is completed
  FINISHED_STATE,

  // Task is queued in the driver
  QUEUED_STATE,
  
  // State is unknown
  UNKOWN_STATE
}

// Status of a sub task in an operation
struct TTaskStatus {
 // Task ID
 1: required string taskId
 // External ID for this task. For example, MapRedTask can return the job ID
 // of the Hadoop job
 2: optional string externalHandle
 // Current state of the task as seen by driver
 3: required TTaskState state
}

struct TGetOperationStatusResp {
  1: required TStatus status
  // State of the whole operation
  2: optional TOperationState operationState
  // List of statuses of sub tasks
  3: optional list<TTaskStatus> taskStatuses
}
{code}
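
The thread pool design described above (a fixed-size pool, a future kept per operation, polling and cancellation) can be sketched as follows. This is a Python illustration under stated assumptions: the class and method names are hypothetical and do not mirror the patch, which is written in Java.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class OperationManager:
    """Hypothetical sketch; names do not mirror the actual patch."""

    def __init__(self, pool_size=2):
        # Fixed-size pool for running non-blocking operations.
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self._futures = {}

    def submit(self, handle, fn):
        # Keep a reference to the future so it can be polled or cancelled.
        self._futures[handle] = self._pool.submit(fn)

    def status(self, handle):
        future = self._futures[handle]
        if future.cancelled():
            return "CANCELED"
        return "FINISHED" if future.done() else "RUNNING"

    def cancel(self, handle):
        return self._futures[handle].cancel()

    def shutdown(self):
        self._pool.shutdown(wait=True)


manager = OperationManager(pool_size=1)
release = threading.Event()
manager.submit("op-1", lambda: release.wait(5))  # occupies the only worker
manager.submit("op-2", lambda: None)             # queued behind op-1
cancelled = manager.cancel("op-2")               # still pending, so this succeeds
before = manager.status("op-1")
release.set()
manager.shutdown()
after = manager.status("op-1")
```

One caveat of this sketch: Python's Future.cancel() only succeeds for work that has not started, so cancelling a query that is already running would need cooperation from the operation itself.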


h5. Things pending as of now
# If the Task runs in a sub-process, then external handle (job ID) is returned 
as null.






[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-05-27 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13668072#comment-13668072
 ] 

Jaideep Dhok commented on HIVE-4569:


Work from HIVE-4570 and HIVE-4617 has been moved to this issue.

To restate the scope of the issue, here are the proposed changes:
# Add a GetQueryPlan Thrift API. It will return a plan object containing Stage 
and Task information for the query. This call will not run the query.
# Add a way to run a query asynchronously, so that its progress can be 
monitored without waiting for it to complete.
# Extend the OperationState struct returned by GetOperationState to include 
more information, such as the job IDs launched for sub-tasks and a query 
progress indicator.
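
As a rough illustration of the second item, a client could start the query asynchronously and then poll its state until it finishes. The sketch below is Python and deliberately avoids any real HiveServer2 client: `wait_for_operation` and `get_state` are hypothetical names, and the terminal state strings are placeholders, since this thread does not list the values of the operation state enum.

```python
import time
from typing import Callable, List

# Assumed terminal state names; placeholders, not taken from the Thrift IDL.
TERMINAL_STATES = {"FINISHED", "ERROR", "CANCELED"}


def wait_for_operation(
    get_state: Callable[[], str],
    poll_interval_s: float = 1.0,
    max_polls: int = 100,
    on_progress: Callable[[str], None] = lambda state: None,
) -> str:
    """Poll get_state() until it reports a terminal state or polls run out."""
    for _ in range(max_polls):
        state = get_state()
        on_progress(state)  # let the caller display or log progress
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_interval_s)
    raise TimeoutError("operation did not reach a terminal state")


if __name__ == "__main__":
    # Scripted stand-in for a server: returns a fixed sequence of states.
    scripted = iter(["RUNNING", "RUNNING", "FINISHED"])
    seen: List[str] = []
    final = wait_for_operation(
        lambda: next(scripted), poll_interval_s=0.0, on_progress=seen.append
    )
    print(final, seen)
```

In a real client, `get_state` would wrap the status call for the operation handle returned by the asynchronous execute call.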




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-05-22 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664636#comment-13664636
 ] 

Carl Steinbach commented on HIVE-4569:
--

bq. I do not see GetQueryPlan api available in HiveServer2, though the wiki 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
contains, not sure why it was not added.

It was not added because it became clear during implementation of HiveServer2 
that it was a bad idea to extend (i.e. depend on) any of the existing legacy 
Hive Thrift APIs. We were also narrowly focused on supporting JDBC/ODBC, and 
neither of these APIs provides explicit support for retrieving the execution 
plan.

@Jaideep: I think it would be a good idea to post some notes about how you plan 
to modify the HS2 Thrift API and get feedback before spending time doing the 
implementation work.




[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-05-22 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664836#comment-13664836
 ] 

Jaideep Dhok commented on HIVE-4569:


@Carl: This change will not affect JDBC/ODBC clients. Currently, clients using 
Thrift have no way to get the query plan, which is why we wanted to add this.

Here are the changes proposed:


# Add GetQueryPlan with the same arguments as ExecuteStatement -
   {code}TGetQueryPlanResp GetQueryPlan(1:TExecuteStatementReq req);{code}
# Run a SQLOperation for the request, calling Driver.compile with the statement 
and returning the plan object. Throw HiveSQLException with the return code of 
compile if it fails.
# New response type for the above call -
{code}
struct TGetQueryPlanResp {
  1: required TStatus status
  // Query plan
  2: required queryplan.Query plan
}
{code}


We'll have to include queryplan.thrift in TCLIService.thrift for the return type.
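
The compile-only flow in the second step can be sketched as follows. This is a Python illustration, not Hive code: `HiveSQLException` here is a local stand-in for the Java exception, and `compile_fn` is a hypothetical callable standing in for Driver.compile that is assumed to return a `(return_code, plan)` pair with 0 meaning success.

```python
from typing import Any, Callable, Tuple


class HiveSQLException(Exception):
    """Local stand-in for Hive's exception; carries the compile return code."""

    def __init__(self, message: str, return_code: int) -> None:
        super().__init__(message)
        self.return_code = return_code


def get_query_plan(
    compile_fn: Callable[[str], Tuple[int, Any]], statement: str
) -> Any:
    """Compile the statement without running it and return the plan."""
    return_code, plan = compile_fn(statement)
    if return_code != 0:
        # Surface the compile return code to the caller, as proposed.
        raise HiveSQLException(f"compile failed for: {statement}", return_code)
    return plan


if __name__ == "__main__":
    # Fake compilers used only to exercise both paths.
    print(get_query_plan(lambda s: (0, {"stages": 1}), "SELECT 1"))
    try:
        get_query_plan(lambda s: (10, None), "SELEC 1")
    except HiveSQLException as exc:
        print("failed with return code", exc.return_code)
```

The point of the sketch is only that the call stops after compilation: nothing is executed, and a failed compile is reported through the exception rather than through an empty plan.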



