Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-28 Thread Prasad Mujumdar


 On Feb. 18, 2014, 6:41 p.m., Prasad Mujumdar wrote:
  @Vaibhav, thanks for the patch and following up on the overall async 
  execution line items!
  
  The patch itself looks fine on the first pass. I do have a high level 
  question on the approach. The patch enables async execution by defaults on 
  client side and adds synchronous wait top of that. This IHMO defeats the 
  purpose of the aysnc execution. From the application point of view, there's 
  no difference in the behavior with or without this patch. The execution 
  will block till the query execution is complete.
  The rationale of async execution is to return control back to client 
  immediately so that the client has an option to perform alternate 
  foreground work while the query is being processed. Have you considered  
  blocking in fetch for queries with resultset ? That in would give more time 
  for server to process the query in parallel while client examines the 
  resultset description and possible interact with end user etc.
 
 Vaibhav Gumashta wrote:
 @Prasad: Thanks a lot for taking a look! I was basically going for a use 
 case when an error can be detected sooner. For example consider the scenario 
 when a stmt.cancel is called from a separate thread:
 Blocking calls:
 When stmt.execute is called, internally HiveStatement sets a variable 
 stmtHandle, to refer to the Operation object that it created for the query 
 execution. However, for a blocking call, the stmtHandle is set only when the 
 call to Operation.run returns. If we call a stmt.cancel (in a different 
 thread) during this, the internal code will check for the value of stmtHandle 
 and return immediately if it is null, essentially doing a no-op. 
 Async calls:
 Since an async call immediately returns from Operation.run, the 
 stmtHandle is set in HiveStatement while the query is still executing. This 
 means, that a stmt.cancel can actually cancel the underlying operation and 
 the original stmt.execute will throw an error when we poll for the operation 
 status (since the underlying operation handle is gone). This can return a 
 useful response to the client sooner.
 
 However, I agree blocking on fetch would be more efficient. Do you think 
 I can take that up in a follow up jira since I may not have a lot of time to 
 work on it for Hive 13 release?
 
 Thanks again!
 
 Thejas Nair wrote:
 Yes, I think it is ok to address that in a follow up jira. This is an 
 optimization, and the changes are useful even without that. What do you think 
 Prasad ?


Sounds good. 
@Viabhav, are you planning to log a follow up ticket or I can do that. This 
will be especially useful for beeline and normal single thread client. Thanks!


- Prasad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
---


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-28 Thread Vaibhav Gumashta


 On Feb. 23, 2014, 12:36 a.m., Thejas Nair wrote:
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java, line 271
  https://reviews.apache.org/r/14950/diff/3/?file=498672#file498672line271
 
  I think we should retry on exceptions that indicate an network 
  connection error. TProtocolException seems to be the exception that is 
  thrown in such cases.
 
 Thejas Nair wrote:
 I think we can address this in a follow up patch. Connection error in 
 case of the current synchronous execution also would result in failure.


I'll create a new jira for that. Thanks for taking a look!


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review35234
---


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-28 Thread Vaibhav Gumashta


 On Feb. 18, 2014, 6:41 p.m., Prasad Mujumdar wrote:
  @Vaibhav, thanks for the patch and following up on the overall async 
  execution line items!
  
  The patch itself looks fine on the first pass. I do have a high level 
  question on the approach. The patch enables async execution by defaults on 
  client side and adds synchronous wait top of that. This IHMO defeats the 
  purpose of the aysnc execution. From the application point of view, there's 
  no difference in the behavior with or without this patch. The execution 
  will block till the query execution is complete.
  The rationale of async execution is to return control back to client 
  immediately so that the client has an option to perform alternate 
  foreground work while the query is being processed. Have you considered  
  blocking in fetch for queries with resultset ? That in would give more time 
  for server to process the query in parallel while client examines the 
  resultset description and possible interact with end user etc.
 
 Vaibhav Gumashta wrote:
 @Prasad: Thanks a lot for taking a look! I was basically going for a use 
 case when an error can be detected sooner. For example consider the scenario 
 when a stmt.cancel is called from a separate thread:
 Blocking calls:
 When stmt.execute is called, internally HiveStatement sets a variable 
 stmtHandle, to refer to the Operation object that it created for the query 
 execution. However, for a blocking call, the stmtHandle is set only when the 
 call to Operation.run returns. If we call a stmt.cancel (in a different 
 thread) during this, the internal code will check for the value of stmtHandle 
 and return immediately if it is null, essentially doing a no-op. 
 Async calls:
 Since an async call immediately returns from Operation.run, the 
 stmtHandle is set in HiveStatement while the query is still executing. This 
 means, that a stmt.cancel can actually cancel the underlying operation and 
 the original stmt.execute will throw an error when we poll for the operation 
 status (since the underlying operation handle is gone). This can return a 
 useful response to the client sooner.
 
 However, I agree blocking on fetch would be more efficient. Do you think 
 I can take that up in a follow up jira since I may not have a lot of time to 
 work on it for Hive 13 release?
 
 Thanks again!
 
 Thejas Nair wrote:
 Yes, I think it is ok to address that in a follow up jira. This is an 
 optimization, and the changes are useful even without that. What do you think 
 Prasad ?

 
 Prasad Mujumdar wrote:
 Sounds good. 
 @Viabhav, are you planning to log a follow up ticket or I can do that. 
 This will be especially useful for beeline and normal single thread client. 
 Thanks!

@Prasad: I can do that, but feel free to create one if you're planning to work 
sooner. Thanks a lot for reviewing the patch. 


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
---


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-28 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/
---

(Updated March 1, 2014, 12:43 a.m.)


Review request for hive, Prasad Mujumdar and Thejas Nair.


Bugs: HIVE-5232
https://issues.apache.org/jira/browse/HIVE-5232


Repository: hive-git


Description
---

Should be applied on top of:
HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
HIVE-5229 [Better thread management for HiveServer2 async threads]
HIVE-5230 [Better error reporting by async threads in HiveServer2]
HIVE-5441 [Async query execution doesn't return resultset status] 


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 

Diff: https://reviews.apache.org/r/14950/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta



Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-27 Thread Thejas Nair


 On Feb. 18, 2014, 6:41 p.m., Prasad Mujumdar wrote:
  @Vaibhav, thanks for the patch and following up on the overall async 
  execution line items!
  
  The patch itself looks fine on the first pass. I do have a high level 
  question on the approach. The patch enables async execution by defaults on 
  client side and adds synchronous wait top of that. This IHMO defeats the 
  purpose of the aysnc execution. From the application point of view, there's 
  no difference in the behavior with or without this patch. The execution 
  will block till the query execution is complete.
  The rationale of async execution is to return control back to client 
  immediately so that the client has an option to perform alternate 
  foreground work while the query is being processed. Have you considered  
  blocking in fetch for queries with resultset ? That in would give more time 
  for server to process the query in parallel while client examines the 
  resultset description and possible interact with end user etc.
 
 Vaibhav Gumashta wrote:
 @Prasad: Thanks a lot for taking a look! I was basically going for a use 
 case when an error can be detected sooner. For example consider the scenario 
 when a stmt.cancel is called from a separate thread:
 Blocking calls:
 When stmt.execute is called, internally HiveStatement sets a variable 
 stmtHandle, to refer to the Operation object that it created for the query 
 execution. However, for a blocking call, the stmtHandle is set only when the 
 call to Operation.run returns. If we call a stmt.cancel (in a different 
 thread) during this, the internal code will check for the value of stmtHandle 
 and return immediately if it is null, essentially doing a no-op. 
 Async calls:
 Since an async call immediately returns from Operation.run, the 
 stmtHandle is set in HiveStatement while the query is still executing. This 
 means, that a stmt.cancel can actually cancel the underlying operation and 
 the original stmt.execute will throw an error when we poll for the operation 
 status (since the underlying operation handle is gone). This can return a 
 useful response to the client sooner.
 
 However, I agree blocking on fetch would be more efficient. Do you think 
 I can take that up in a follow up jira since I may not have a lot of time to 
 work on it for Hive 13 release?
 
 Thanks again!

Yes, I think it is ok to address that in a follow up jira. This is an 
optimization, and the changes are useful even without that. What do you think 
Prasad ?


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
---


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-27 Thread Thejas Nair


 On Feb. 23, 2014, 12:36 a.m., Thejas Nair wrote:
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java, line 271
  https://reviews.apache.org/r/14950/diff/3/?file=498672#file498672line271
 
  I think we should retry on exceptions that indicate an network 
  connection error. TProtocolException seems to be the exception that is 
  thrown in such cases.

I think we can address this in a follow up patch. Connection error in case of 
the current synchronous execution also would result in failure.


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review35234
---


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-22 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review35234
---



jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
https://reviews.apache.org/r/14950/#comment65691

I think we should retry on exceptions that indicate an network connection 
error. TProtocolException seems to be the exception that is thrown in such 
cases. 


- Thejas Nair


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 20, 2014, 9:18 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-20 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/
---

(Updated Feb. 20, 2014, 9:18 a.m.)


Review request for hive and Thejas Nair.


Repository: hive-git


Description
---

Should be applied on top of:
HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
HIVE-5229 [Better thread management for HiveServer2 async threads]
HIVE-5230 [Better error reporting by async threads in HiveServer2]
HIVE-5441 [Async query execution doesn't return resultset status] 


Diffs (updated)
-

  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 

Diff: https://reviews.apache.org/r/14950/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta



Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-18 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/
---

(Updated Feb. 18, 2014, 2:22 p.m.)


Review request for hive and Thejas Nair.


Changes
---

Rebased on trunk.


Repository: hive-git


Description
---

Should be applied on top of:
HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
HIVE-5229 [Better thread management for HiveServer2 async threads]
HIVE-5230 [Better error reporting by async threads in HiveServer2]
HIVE-5441 [Async query execution doesn't return resultset status] 


Diffs (updated)
-

  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 

Diff: https://reviews.apache.org/r/14950/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta



Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-18 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
---


@Vaibhav, thanks for the patch and following up on the overall async execution 
line items!

The patch itself looks fine on the first pass. I do have a high level question 
on the approach. The patch enables async execution by defaults on client side 
and adds synchronous wait top of that. This IHMO defeats the purpose of the 
aysnc execution. From the application point of view, there's no difference in 
the behavior with or without this patch. The execution will block till the 
query execution is complete.
The rationale of async execution is to return control back to client 
immediately so that the client has an option to perform alternate foreground 
work while the query is being processed. Have you considered  blocking in fetch 
for queries with resultset ? That in would give more time for server to process 
the query in parallel while client examines the resultset description and 
possible interact with end user etc.

- Prasad Mujumdar


On Feb. 18, 2014, 2:22 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 18, 2014, 2:22 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2014-02-18 Thread Vaibhav Gumashta


 On Feb. 18, 2014, 6:41 p.m., Prasad Mujumdar wrote:
  @Vaibhav, thanks for the patch and following up on the overall async 
  execution line items!
  
  The patch itself looks fine on the first pass. I do have a high level 
  question on the approach. The patch enables async execution by defaults on 
  client side and adds synchronous wait top of that. This IHMO defeats the 
  purpose of the aysnc execution. From the application point of view, there's 
  no difference in the behavior with or without this patch. The execution 
  will block till the query execution is complete.
  The rationale of async execution is to return control back to client 
  immediately so that the client has an option to perform alternate 
  foreground work while the query is being processed. Have you considered  
  blocking in fetch for queries with resultset ? That in would give more time 
  for server to process the query in parallel while client examines the 
  resultset description and possible interact with end user etc.

@Prasad: Thanks a lot for taking a look! I was basically going for a use case 
when an error can be detected sooner. For example consider the scenario when a 
stmt.cancel is called from a separate thread:
Blocking calls:
When stmt.execute is called, internally HiveStatement sets a variable 
stmtHandle, to refer to the Operation object that it created for the query 
execution. However, for a blocking call, the stmtHandle is set only when the 
call to Operation.run returns. If we call a stmt.cancel (in a different thread) 
during this, the internal code will check for the value of stmtHandle and 
return immediately if it is null, essentially doing a no-op. 
Async calls:
Since an async call immediately returns from Operation.run, the stmtHandle is 
set in HiveStatement while the query is still executing. This means, that a 
stmt.cancel can actually cancel the underlying operation and the original 
stmt.execute will throw an error when we poll for the operation status (since 
the underlying operation handle is gone). This can return a useful response to 
the client sooner.

However, I agree blocking on fetch would be more efficient. Do you think I can 
take that up in a follow up jira since I may not have a lot of time to work on 
it for Hive 13 release?

Thanks again!


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
---


On Feb. 18, 2014, 2:22 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/14950/
 ---
 
 (Updated Feb. 18, 2014, 2:22 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Should be applied on top of:
 HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
 HIVE-5229 [Better thread management for HiveServer2 async threads]
 HIVE-5230 [Better error reporting by async threads in HiveServer2]
 HIVE-5441 [Async query execution doesn't return resultset status] 
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
 
 Diff: https://reviews.apache.org/r/14950/diff/
 
 
 Testing
 ---
 
 TestJdbcDriver2
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default

2013-10-25 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/
---

Review request for hive and Thejas Nair.


Repository: hive-git


Description
---

Should be applied on top of:
HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
HIVE-5229 [Better thread management for HiveServer2 async threads]
HIVE-5230 [Better error reporting by async threads in HiveServer2]
HIVE-5441 [Async query execution doesn't return resultset status] 


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 2912ece 

Diff: https://reviews.apache.org/r/14950/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta