[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-02-08 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859085#comment-15859085
 ] 

Prasanth Jayachandran edited comment on HIVE-15473 at 2/9/17 6:17 AM:
--

Another observation: the CLI used to print the application ID and query ID in 
the console, like below:
{code}
Query ID = pjayachandran_20170208172839_9fa7b1c5-5d96-4cc2-9fa2-2f3eda1987e0
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1486523081849_0010)
{code}

This is really useful for debugging. It would be good if we could get this 
information via an API. There is also the session ID, which can be useful. 


was (Author: prasanth_j):
Another observation is that cli used to print application and query id in the 
console like below
{code}
Query ID = pjayachandran_20170208172839_9fa7b1c5-5d96-4cc2-9fa2-2f3eda1987e0
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id 
application_1486523081849_0010)
{code}

This will be really useful for debugging. It will be good if we can get these 
information via an API. 

> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
> Issue Type: Improvement
> Components: Beeline, HiveServer2
> Affects Versions: 2.1.1
> Reporter: anishek
> Assignee: anishek
> Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, 
> HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, 
> HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, 
> HIVE-15473.8.patch, HIVE-15473.9.patch, io_summary_after_patch.png, 
> io_summary_before_patch.png, screen_shot_beeline.jpg, status_after_patch.png, 
> status_before_patch.png, summary_after_patch.png, summary_before_patch.png
>
>
> Hive Cli allows showing progress bar for tez execution engine as shown in 
> https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif
> it would be great to have similar progress bar displayed when user is 
> connecting via beeline command line client as well. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-02-07 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857413#comment-15857413
 ] 

anishek edited comment on HIVE-15473 at 2/8/17 5:39 AM:


[~prasanth_j] the summary sections are no longer printed via the jline renderer 
(which was used to retain the color scheme). The reason is that the report goes 
to the log file for beeline, while for hive cli it is shown on stdout, so I had 
to remove the color scheme to produce the same report in both. I hope that is 
OK?

I have created HIVE-15847 for the slow refresh rates on hive cli and will look 
into it. No inherent change was made to the way the progress bar is printed for 
hive-cli. Thanks for your inputs!


was (Author: anishek):
[~prasanth_j] the summary sections are no longer printed via the jline rendered 
to retain the color scheme, the reason being the report goes to log file for 
beeline and for hive cli its shown on the stdout, hence had to remove the color 
scheme for same  report . I hope that should be ok ? 

I have created HIVE-15847 for the slow refresh rates on hive cli, will look 
into it. Thanks for your inputs!



[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-02-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856879#comment-15856879
 ] 

Prasanth Jayachandran edited comment on HIVE-15473 at 2/7/17 10:10 PM:
---

Why was the refresh interval for the CLI changed (it now refreshes every 3 
seconds?)? Can the change be made only for beeline? Nit: one of the lower 
separators also seems to be missing. 


was (Author: prasanth_j):
Why is the refresh interval (refreshes every 3 seconds?) for CLI changed? Can 
it be done only for beeline?

> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
> Issue Type: Improvement
> Components: Beeline, HiveServer2
> Affects Versions: 2.1.1
> Reporter: anishek
> Assignee: anishek
> Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15473.10.patch, HIVE-15473.11.patch, 
> HIVE-15473.2.patch, HIVE-15473.3.patch, HIVE-15473.4.patch, 
> HIVE-15473.5.patch, HIVE-15473.6.patch, HIVE-15473.7.patch, 
> HIVE-15473.8.patch, HIVE-15473.9.patch, screen_shot_beeline.jpg
>
>
> Hive Cli allows showing progress bar for tez execution engine as shown in 
> https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif
> it would be great to have similar progress bar displayed when user is 
> connecting via beeline command line client as well. 





[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843689#comment-15843689
 ] 

Thejas M Nair edited comment on HIVE-15473 at 1/28/17 12:35 AM:


This sounds good. Some minor comments - 

bq. not sure why the step function is used – to prevent server from wasting CPU 
resources on non-critical operations ?
Yes, too many RPC calls would mean wasted CPU resources (and also a bit of 
network bandwidth). For queries that complete in a few seconds, it makes sense 
to update progress more frequently. But if it is a very large query taking 
minutes, then updating the log/progress every 5 seconds is reasonable. 
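
To make the trade-off concrete, here is a minimal sketch of what such a step 
function could look like. It is an assumption for illustration only: the class 
name StatusPollBackoff, the method waitMillis, the 10 and 60 second thresholds 
and the 500 ms / 2 second steps are all made up; only the 5 second step for 
long-running queries comes from this discussion.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical step function: the longer a query has run, the longer a status call may wait. */
final class StatusPollBackoff {
    private StatusPollBackoff() {}

    /**
     * Returns how long (in milliseconds) a status call may block, given how
     * long the query has been running. The thresholds are illustrative and
     * are not HiveServer2's actual values.
     */
    static long waitMillis(long elapsedMillis) {
        if (elapsedMillis < TimeUnit.SECONDS.toMillis(10)) {
            return 500;    // short queries: refresh progress often
        }
        if (elapsedMillis < TimeUnit.SECONDS.toMillis(60)) {
            return 2_000;  // medium queries: back off a little
        }
        return 5_000;      // long-running queries: one update every 5 seconds
    }

    public static void main(String[] args) {
        for (long elapsed : new long[] {1_000, 30_000, 300_000}) {
            System.out.println(elapsed + " ms elapsed -> wait " + waitMillis(elapsed) + " ms");
        }
    }
}
```

With steps like these, a query that finishes in a few seconds is polled often, 
while a query that runs for minutes costs the server at most one status RPC 
every 5 seconds per client.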

bq. Merge QueryLog and ProgressBarLog request / response as part of 
GetOperationStatus.
I think this makes sense; however, I think we can move the QueryLog 
functionality to a follow-up task in another jira, to keep this jira smaller 
and easier to review.
In the first patch we could add the ProgressBarLog functionality as part of 
GetOperationStatus, instead of introducing a new function.

bq. There will be additional function signature for GetOperationStatus that we 
might need to create to allow for backward compatibility reasons.
If we add the new request and response objects as optional in the thrift api, 
we can keep it backward compatible without adding a new function.




was (Author: thejas):
bq. not sure why the step function is used – to prevent server from wasting CPU 
resources on non-critical operations ?
Yes, too many RPC calls would mean wasted CPU resources (and also bit of 
network bandwitdh). For queries that complete in a few seconds, it makes sense 
to update progress more frequently. But if it is a very large query taking 
minutes, then update of log/progress every 5 seconds is reasonable. 

bq. Merge QueryLog and ProgressBarLog request / response as part of 
GetOperationStatus.
I think this makes sense, however I think we can move the QueryLog 
functionality as a follow up task in another jira to keep the jira smaller and 
easier to review.
In first patch we could add the ProgressBarLog functionality as part of 
GeOperationStatus, instead of introducing a new function.

bq. There will be additional function signature for GetOperationStatus that we 
might need to create to allow for backward compatibility reasons.
If we add the new request and response objects as optional in the thrift api, 
we can keep it backward compatible without adding a new function.



> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
> Issue Type: Improvement
> Components: Beeline, HiveServer2
> Affects Versions: 2.1.1
> Reporter: anishek
> Assignee: anishek
> Priority: Minor
> Attachments: HIVE-15473.2.patch, HIVE-15473.3.patch, 
> HIVE-15473.4.patch, HIVE-15473.5.patch, screen_shot_beeline.jpg
>
>
> Hive Cli allows showing progress bar for tez execution engine as shown in 
> https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif
> it would be great to have similar progress bar displayed when user is 
> connecting via beeline command line client as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-26 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842353#comment-15842353
 ] 

anishek edited comment on HIVE-15473 at 1/27/17 7:16 AM:
-

There are a few observations / limitations that [~thejas] cited while 
reviewing this. Writing down the reasoning here, and the steps for how we can 
move forward.

Given that we use SynchronizedHandler for the client on the beeline side, only 
one operation / api call at a time can be in execution from a single beeline 
session to hiveserver2. The current flow of how the progress bar is updated on 
the client side is:

Thread 1 -- does statement execution: This is achieved by calling 
GetOperationStatus for the operation from beeline until the execution of the 
operation is complete. The server-side implementation of GetOperationStatus 
uses a timeout mechanism (which waits for the query execution to finish) 
before it sends the status to the client. The timeout value is decided by a 
step function; for long-running queries this can lead to a wait time of 
approximately 5 seconds per call to GetOperationStatus.
Thread 2 -- prints query logs and progress logs.

*Problem Space:*
# Since the client synchronizes the various api calls, only one api call from 
either Thread 1 or Thread 2 is executed at a time. The notion of projecting 
concurrent execution capability in the beeline code therefore seems 
misleading, and with the current patch the progress bar / query log updates 
can be delayed by 5 seconds or more ( _I don't think we can avoid this anyway, 
as I will discuss later_ ). 
# Additionally, since no *order* is maintained among threads requesting 
synchronization on an object, there is a possibility that Thread 1 can get the 
next lock on the object without Thread 2 getting a chance to obtain the lock, 
thus leading to long delays in updating the query log or progress log ( _I am 
not sure how this would happen for the use case of long-running queries, since 
while Thread 1 is executing, Thread 2 would already have blocked on the 
synchronization of the object. Once Thread 1 completes, and before it comes 
around the while loop in_   
{code}
HiveStatement.waitForOperationToComplete()
{code}
_Thread 2 should start executing. It seems highly improbable that Thread 1 
completes, executes additional statements, and gets the lock again before 
Thread 2 gets a chance to acquire the lock_ )

So in summary:
* Avoid multi-threaded code in beeline for interactions with hiveserver2, as 
no concurrency is supported by the Thrift protocol, unless we move to 
ThriftHttpCliService using an HTTP-based connection, or use a NonBlockingThrift 
server for the binary protocol on the server side.
* Address the issue of responsiveness if we can.

*Solution Space:*
Since concurrent execution is not supported, programming anything to that 
effect should be avoided in the beeline client. Hence, we strive to remove the 
multi-threaded code from the beeline side, in effect merging the query log and 
progress bar log into the GetOperationStatus api. This would still not address 
the issue of responsiveness indicated in 1. above, as GetOperationStatus will 
use the wait time before responding to calls from the beeline side, unless we 
decide to remove the wait or reduce it to a default value of, say, 500 
milliseconds. I am not sure why the step function is used -- _to prevent 
server from wasting CPU resources on non-critical operations?_ This will 
address 2. above, though, since we are going to get all the information in a 
single call. 
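
A rough sketch of what the single-threaded flow could look like once the logs 
and progress are merged into the status call. StatusResponse, StatusClient and 
SingleThreadedPoller are hypothetical names used only for illustration; they 
are not the actual Thrift or HiveStatement types, and the canned strings in 
main are invented sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical combined response: state plus the log lines and progress text since the last call. */
final class StatusResponse {
    final boolean finished;
    final List<String> newLogLines;
    final String progressText;

    StatusResponse(boolean finished, List<String> newLogLines, String progressText) {
        this.finished = finished;
        this.newLogLines = newLogLines;
        this.progressText = progressText;
    }
}

/** Hypothetical stand-in for the client stub; not the real Thrift interface. */
interface StatusClient {
    StatusResponse getOperationStatus();
}

final class SingleThreadedPoller {
    private SingleThreadedPoller() {}

    /** Polls until the operation finishes and returns everything to print, in order. */
    static List<String> run(StatusClient client) {
        List<String> printed = new ArrayList<>();
        StatusResponse response;
        do {
            response = client.getOperationStatus(); // the only RPC in the loop
            printed.addAll(response.newLogLines);   // query log lines first
            printed.add(response.progressText);     // then the latest progress
        } while (!response.finished);
        return printed;
    }

    public static void main(String[] args) {
        // Canned responses standing in for a server.
        Iterator<StatusResponse> canned = Arrays.asList(
            new StatusResponse(false, Arrays.asList("Compiling query"), "Map 1: 0/4"),
            new StatusResponse(true, Arrays.asList("Query finished"), "Map 1: 4/4")
        ).iterator();
        run(canned::next).forEach(System.out::println);
    }
}
```

Because there is a single caller and a single call per iteration, there is no 
lock for a second thread to wait on, which is the point made for 2. above.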

*Implementation Considerations:*
# Merge QueryLog and ProgressBarLog request / response as part of 
GetOperationStatus.
# To get this working we have to extend HiveStatement to include a few 
non-JDBC-compliant setters (one interface for displaying the progress bar, the 
other for displaying query logs) -- default implementations for these will be 
_do nothing_ implementations.
# Have setters on hive statement for both the interfaces, used by beeline to 
provide the required implementations.
# As part of the hive statement execute(*) call, we create the appropriate 
request if custom implementations of the interfaces above are provided. 
# There will be additional function signature for GetOperationStatus that we 
might need to create to allow for backward compatibility reasons.
# _Not related to above_ : make sure we pass the vertex progress as a string 
(for progress bar display) and the query progress as a custom enum for decision 
making (with implementations on the server side to map from the 
execution-engine-based state to our generic enum state).
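
A minimal sketch of items 2 to 4 above, assuming two callback interfaces with 
_do nothing_ defaults. All names here (ProgressBarRenderer, QueryLogPrinter, 
StatementSketch and its methods) are hypothetical and are not the actual 
HiveStatement API:

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical callback for rendering the progress bar. */
interface ProgressBarRenderer {
    ProgressBarRenderer NO_OP = progress -> { }; // default: do nothing
    void render(String progress);
}

/** Hypothetical callback for printing query log lines. */
interface QueryLogPrinter {
    QueryLogPrinter NO_OP = line -> { }; // default: do nothing
    void print(String line);
}

/** Hypothetical stand-in for the non-JDBC extensions on the statement. */
final class StatementSketch {
    private ProgressBarRenderer progressBar = ProgressBarRenderer.NO_OP;
    private QueryLogPrinter queryLog = QueryLogPrinter.NO_OP;

    void setProgressBarRenderer(ProgressBarRenderer renderer) {
        this.progressBar = Objects.requireNonNull(renderer);
    }

    void setQueryLogPrinter(QueryLogPrinter printer) {
        this.queryLog = Objects.requireNonNull(printer);
    }

    /** The request asks for progress only if a custom renderer was supplied. */
    boolean wantsProgress() {
        return progressBar != ProgressBarRenderer.NO_OP;
    }

    /** The request asks for query logs only if a custom printer was supplied. */
    boolean wantsQueryLog() {
        return queryLog != QueryLogPrinter.NO_OP;
    }

    /** Called with the extra payload of each status response. */
    void onStatus(String progress, List<String> logLines) {
        logLines.forEach(queryLog::print);
        progressBar.render(progress);
    }
}
```

A plain JDBC caller never touches the setters and so pays nothing extra, while 
beeline would install its own implementations before calling execute.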
 
If we are too worried about the responsiveness of the progress bar, or about 
*2. in Problem Space* being a major impediment for hive usage, we should go 
with the new implementation proposal; otherwise we just additionally implement 
*6. in Implementation Considerations*.




was (Author: anishek):
There are few 


[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-12 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820437#comment-15820437
 ] 

anishek edited comment on HIVE-15473 at 1/12/17 8:29 AM:
-

We can't use TOperationState, because the state names are used to display the 
states in the progress bar. We need to allow states as 'java.lang.String', as 
this will allow the progress bar to be displayed for any other execution 
engine. The rendering of the progress bar does not care about the HiveServer 
state representations, but rather about the execution engine state 
representations. 

Since we also need to know on the client side when to stop querying for the 
progress bar, the internal execution engine states have to be mapped to 
JobExecutionStatus. For now, since the progress bar is only for tez, the 
matching happens via the 'fromString' method in JobExecutionStatus. Ideally 
this class should do the relevant mapping. An idea of how this could be 
achieved:

{code}

public enum JobExecutionStatus {
  SUBMITTED((short) 0),
  INITING((short) 1),
  RUNNING((short) 2),
  SUCCEEDED((short) 3),
  KILLED((short) 4),
  FAILED((short) 5),
  ERROR((short) 6),
  NOT_AVAILABLE((short) 7);

  private final short executionStatusOrdinal;

  JobExecutionStatus(short executionStatusOrdinal) {
    this.executionStatusOrdinal = executionStatusOrdinal;
  }

  public short toExecutionStatus() {
    return executionStatusOrdinal;
  }

  public static JobExecutionStatus fromString(String input, StatusFinder finder) {
    return finder.from(input);
  }

  interface StatusFinder {
    JobExecutionStatus from(String inputStatus);
  }

  static class TezStatusFinder implements StatusFinder {
    @Override
    public JobExecutionStatus from(String inputStatus) {
      for (JobExecutionStatus status : values()) {
        if (status.name().equals(inputStatus)) {
          return status;
        }
      }
      return NOT_AVAILABLE;
    }
  }
}
{code}

OR

maybe have two state variables in the response for GetProgressUpdate: one a 
String used for display, the other an OperationState object that allows us to 
write control flow statements on the caller side.
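
A small sketch of this second alternative, assuming a response that carries 
both representations. ControlState and ProgressUpdate are hypothetical names 
for illustration, not existing Hive classes:

```java
/** Hypothetical engine-independent state, used only for control flow on the caller side. */
enum ControlState {
    RUNNING, FINISHED, FAILED;

    boolean isTerminal() {
        return this != RUNNING;
    }
}

/** Hypothetical response carrying both representations of the state. */
final class ProgressUpdate {
    final String displayState;       // engine-specific text, shown as-is in the progress bar
    final ControlState controlState; // engine-independent, decides when to stop polling

    ProgressUpdate(String displayState, ControlState controlState) {
        this.displayState = displayState;
        this.controlState = controlState;
    }

    /** The caller keeps polling only while the generic state is non-terminal. */
    boolean shouldKeepPolling() {
        return !controlState.isTerminal();
    }
}
```

With this shape the client never has to parse the display string, so a new 
execution engine only needs a server-side mapping to the generic state.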


was (Author: anishek):
We cant use TOpeartionState as those state names are used to display the states 
in progress bar, we need to allow states as  'java.lang.String' as this will 
allow progress bar for any other execution engine to be displayed, The 
rendering of the progress bar does not care about the HiveServer State 
representations but rather for, the execution engine state representations. 

Since we also need to know on the client side when to stop querying for the 
progressBar the internal execution engine states have to be mapped to 
JobExecutionStatus. For now since progress bar is only for tez the matching 
happens via the 'fromString' method in JobExecutionStatus. Ideally this class 
should do the relevant mapping. Idea about how this could be achieved is here :

{code}

public enum JobExecutionStatus {
  SUBMITTED((short) 0),
  INITING((short) 1),
  RUNNING((short) 2),
  SUCCEEDED((short) 3),
  KILLED((short) 4),
  FAILED((short) 5),
  ERROR((short) 6),
  NOT_AVAILABLE((short) 7);

  private final short executionStatusOrdinal;

  JobExecutionStatus(short executionStatusOrdinal) {
    this.executionStatusOrdinal = executionStatusOrdinal;
  }

  public short toExecutionStatus() {
    return executionStatusOrdinal;
  }

  public static JobExecutionStatus fromString(String input, StatusFinder finder) {
    return finder.from(input);
  }

  interface StatusFinder {
    JobExecutionStatus from(String inputStatus);
  }

  static class TezStatusFinder implements StatusFinder {
    @Override
    public JobExecutionStatus from(String inputStatus) {
      for (JobExecutionStatus status : values()) {
        if (status.name().equals(inputStatus)) {
          return status;
        }
      }
      return NOT_AVAILABLE;
    }
  }
}
{code}

> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
> Issue Type: Improvement
> Components: Beeline, HiveServer2
> Affects Versions: 2.1.1
> Reporter: anishek
> Assignee: anishek
> Priority: Minor
> Attachments: HIVE-15473.2.patch, HIVE-15473.3.patch, 
> HIVE-15473.4.patch, screen_shot_beeline.jpg
>
>
> Hive Cli allows showing progress bar for tez execution engine as shown in 
> https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif
> it would be great to have similar progress bar displayed when user is 
> connecting via beeline command line client as well. 





[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-12 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820437#comment-15820437
 ] 

anishek edited comment on HIVE-15473 at 1/12/17 8:27 AM:
-

We can't use TOperationState, because the state names are used to display the 
states in the progress bar. We need to allow states as 'java.lang.String', as 
this will allow the progress bar to be displayed for any other execution 
engine. The rendering of the progress bar does not care about the HiveServer 
state representations, but rather about the execution engine state 
representations. 

Since we also need to know on the client side when to stop querying for the 
progress bar, the internal execution engine states have to be mapped to 
JobExecutionStatus. For now, since the progress bar is only for tez, the 
matching happens via the 'fromString' method in JobExecutionStatus. Ideally 
this class should do the relevant mapping. An idea of how this could be 
achieved:

{code}

public enum JobExecutionStatus {
  SUBMITTED((short) 0),
  INITING((short) 1),
  RUNNING((short) 2),
  SUCCEEDED((short) 3),
  KILLED((short) 4),
  FAILED((short) 5),
  ERROR((short) 6),
  NOT_AVAILABLE((short) 7);

  private final short executionStatusOrdinal;

  JobExecutionStatus(short executionStatusOrdinal) {
    this.executionStatusOrdinal = executionStatusOrdinal;
  }

  public short toExecutionStatus() {
    return executionStatusOrdinal;
  }

  public static JobExecutionStatus fromString(String input, StatusFinder finder) {
    return finder.from(input);
  }

  interface StatusFinder {
    JobExecutionStatus from(String inputStatus);
  }

  static class TezStatusFinder implements StatusFinder {
    @Override
    public JobExecutionStatus from(String inputStatus) {
      for (JobExecutionStatus status : values()) {
        if (status.name().equals(inputStatus)) {
          return status;
        }
      }
      return NOT_AVAILABLE;
    }
  }
}
{code}


was (Author: anishek):
We cant use TOpeartionState as those state names are used to display the states 
in progress bar, we need to allow states as  'java.lang.String' as this will 
allow progress bar for any other execution engine to be displayed, The 
rendering of the progress bar does not care about the HiveServer State 
representations but rather for, the execution engine state representations. 



[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-11 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820134#comment-15820134
 ] 

Thejas M Nair edited comment on HIVE-15473 at 1/12/17 4:27 AM:
---

While reviewing this, one question I had was whether it makes sense to reuse 
GetOperationStatus thrift api for progress notification as well. I think it 
makes sense to follow the approach in this patch of creating a new thrift 
method for more detailed progress information, and keeping GetOperationStatus a 
lightweight call that can be used from jdbc/odbc to get notified about query 
completion.
(Making this statement so that I remember the reasoning in future!)



was (Author: thejas):
While reviewing this, one question I had was whether it makes sense to reuse 
GetOperationStatus thrift api for progress notification as well. I think it 
makes sense to follow the approach in this patch of creating a new thrift 
method for more detailed progress information, and keeping GetOperationStatus a 
lightweight call that can be used from jdbc/odbc to get notified about query 
completion.




[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-06 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15804361#comment-15804361
 ] 

anishek edited comment on HIVE-15473 at 1/6/17 11:43 AM:
-

I ran a couple of the tests that were failing in 

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2816/testReport

Specifically:
{code}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join13] (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] (batchId=95)
{code}
Both of them passed on my local machine. I ran them as 
{code}
mvn test -Dtest=TestSparkCliDriver -Dqfile=join_vc.q
{code}
from hive/itests/qtest-spark

after 
{code}
mvn clean install -DskipTests && cd itests && mvn clean install -DskipTests
{code}
Any idea why they might be failing on Jenkins?


was (Author: anishek):
Ran couple of tests that were failing in 

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2816/testReport

Specifically :
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join13] (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_vc] 
(batchId=95)

both of them passed on local machine. ran them as 

 mvn test -Dtest=TestSparkCliDriver -Dqfile=join_vc.q

from hive/itests/qtest-spark

after  mvn clean install -DskipTests && cd itests && mvn clean install 
-DskipTests

Any idea why they might be failing on jenkins ?

> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
> Issue Type: Improvement
> Components: Beeline, HiveServer2
> Affects Versions: 2.1.1
> Reporter: anishek
> Assignee: anishek
> Priority: Minor
> Attachments: HIVE-15473.2.patch, HIVE-15473.3.patch, 
> screen_shot_beeline.jpg
>
>
> Hive Cli allows showing progress bar for tez execution engine as shown in 
> https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif
> it would be great to have similar progress bar displayed when user is 
> connecting via beeline command line client as well. 





[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-05 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15803724#comment-15803724
 ] 

anishek edited comment on HIVE-15473 at 1/6/17 6:27 AM:


latest patch  HIVE-15473.patch.1


was (Author: anishek):
latest patch 



[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-05 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15803605#comment-15803605
 ] 

anishek edited comment on HIVE-15473 at 1/6/17 5:18 AM:


[~thejas] I was in the process of reverting the commit changes; I think all of 
them are complete as of the last check-in a few minutes ago. I will add the 
patch file soon as well.




was (Author: anishek):
[~thejas] was in the process of reverting the commit changes, i think all of 
them are complete as of last checkin a few mins ago. i will ass the patch file 
as well soon.





[jira] [Comment Edited] (HIVE-15473) Progress Bar on Beeline client

2017-01-05 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766646#comment-15766646
 ] 

anishek edited comment on HIVE-15473 at 1/5/17 9:23 AM:


Given:
Currently the Tez and Spark execution engines have the ability to show a 
progress bar. The progress bar information is nearly the same in both 
representations, with a few label differences.

There are two options for implementing this.
Changes common to both approaches: have an interface in hive-exec that returns 
a generic data structure and is implemented by the various execution engines 
(currently Tez and Spark), plus an additional API on the ThriftCliService to 
get the data structure based on the execution engine.
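
For illustration only, the common interface described above could look roughly 
like the sketch below. The names ProgressSnapshot, ProgressSource and 
FakeTezProgressSource are hypothetical and not taken from any patch on this 
issue, and the values are hard-coded stand-ins for what an engine would report.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical engine-neutral snapshot of query progress. */
final class ProgressSnapshot {
    final List<String> headers;
    final List<List<String>> rows;
    final double fractionComplete; // 0.0 to 1.0

    ProgressSnapshot(List<String> headers, List<List<String>> rows, double fractionComplete) {
        this.headers = Collections.unmodifiableList(headers);
        this.rows = Collections.unmodifiableList(rows);
        this.fractionComplete = fractionComplete;
    }
}

/** Hypothetical interface that each execution engine would implement. */
interface ProgressSource {
    String engineName();
    ProgressSnapshot snapshot();
}

/** Stand-in for a Tez-backed implementation; the values are made up. */
final class FakeTezProgressSource implements ProgressSource {
    @Override
    public String engineName() {
        return "tez";
    }

    @Override
    public ProgressSnapshot snapshot() {
        List<String> headers = Arrays.asList("VERTICES", "STATUS", "TOTAL", "COMPLETED");
        List<String> row = Arrays.asList("Map 1", "RUNNING", "10", "5");
        return new ProgressSnapshot(headers, Collections.singletonList(row), 0.5);
    }
}

final class ProgressSourceDemo {
    public static void main(String[] args) {
        ProgressSource source = new FakeTezProgressSource();
        ProgressSnapshot snap = source.snapshot();
        System.out.println(source.engineName() + " " + snap.headers + " " + snap.rows);
    }
}
```

The server-side API would then only need to look up the ProgressSource for the 
running operation and serialize its snapshot; how that is exposed over Thrift 
is left open here.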

First:

Have a generalized view printer on the beeline side with a well-defined 
format, which can be derived from the existing representation at 
https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif 
Create a data structure tied closely to the above view to hold the relevant 
information, so that the view printer can use it to fill in the required 
details.


Pros:
We don't need to know on the client side which execution engine the server 
used to process the query, as the representation is the same for both.
Faster development time, since we are going to assume the display format on 
the beeline side, and this will have a ripple effect on the design of the APIs 
on the server side as well.

Cons:
We will only be able to represent the progress bar for any future execution 
engine in the same format as for Tez/Spark, which might or might not fit the 
needs of other engines.
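
A rough sketch of what the first option implies on the beeline side: one 
printer with a fixed layout that is fed headers, rows and a completion 
fraction, with no knowledge of the engine. The class name 
FixedFormatProgressPrinter, the tab-separated layout and the bar width are 
assumptions made for this example and do not reproduce the actual CLI output.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical engine-agnostic printer with one fixed layout. */
final class FixedFormatProgressPrinter {
    private static final int BAR_WIDTH = 20;

    /** Renders a text bar such as "[=====     ] 50%" for a fraction in [0, 1]. */
    static String renderBar(double fraction) {
        // Clamp so out-of-range values from the server cannot break the layout.
        double clamped = Math.max(0.0, Math.min(1.0, fraction));
        int filled = (int) Math.floor(clamped * BAR_WIDTH);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < BAR_WIDTH; i++) {
            sb.append(i < filled ? '=' : ' ');
        }
        sb.append("] ").append((int) Math.floor(clamped * 100)).append('%');
        return sb.toString();
    }

    /** Renders the header line, one line per row, then the bar. */
    static String render(List<String> headers, List<List<String>> rows, double fraction) {
        StringBuilder out = new StringBuilder();
        out.append(String.join("\t", headers)).append('\n');
        for (List<String> row : rows) {
            out.append(String.join("\t", row)).append('\n');
        }
        out.append(renderBar(fraction));
        return out.toString();
    }
}

final class FixedFormatDemo {
    public static void main(String[] args) {
        List<String> headers = Arrays.asList("VERTICES", "STATUS");
        List<String> row = Arrays.asList("Map 1", "RUNNING");
        System.out.println(
            FixedFormatProgressPrinter.render(headers, Collections.singletonList(row), 0.5));
    }
}
```

Because the printer only sees headers and rows, a Spark-backed server could 
send "STAGES" instead of "VERTICES" without any client change, which is the 
main advantage listed above.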

Second:
Have a basic map of key-value pairs as the serialized data structure sent from 
the server to beeline.
Have specific view implementations on the beeline side based on the execution 
engine (which will be sent in the above map). The server and beeline have to 
share a common understanding of what the various keys in the map mean for each 
execution engine.

Pros:
Allows us a great deal of flexibility in how the progress view is implemented 
on the beeline side for any execution engine.

Cons:
Longer development time.
May be over-engineering, since we don't get support for a new execution engine 
every day.
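
A rough sketch of the second option, for comparison: the server sends a plain 
map, and beeline picks a view based on the engine name carried in that map. 
All key names ("engine", "completedVertices", "totalStages", and so on) and 
class names are hypothetical; they stand for the shared contract that the 
server and beeline would have to agree on.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-engine view on the beeline side. */
interface EngineProgressView {
    String render(Map<String, String> progress);
}

final class TezProgressView implements EngineProgressView {
    @Override
    public String render(Map<String, String> p) {
        return "Tez vertices: " + p.get("completedVertices") + "/" + p.get("totalVertices");
    }
}

final class SparkProgressView implements EngineProgressView {
    @Override
    public String render(Map<String, String> p) {
        return "Spark stages: " + p.get("completedStages") + "/" + p.get("totalStages");
    }
}

/** Chooses the view from the engine name sent in the map itself. */
final class ProgressViewRegistry {
    static final String ENGINE_KEY = "engine";
    private final Map<String, EngineProgressView> views = new HashMap<>();

    ProgressViewRegistry() {
        views.put("tez", new TezProgressView());
        views.put("spark", new SparkProgressView());
    }

    String render(Map<String, String> progress) {
        EngineProgressView view = views.get(progress.get(ENGINE_KEY));
        if (view == null) {
            // An engine the client does not know about cannot be rendered.
            return "progress unavailable for engine: " + progress.get(ENGINE_KEY);
        }
        return view.render(progress);
    }
}

final class MapBasedDemo {
    public static void main(String[] args) {
        Map<String, String> progress = new HashMap<>();
        progress.put("engine", "spark");
        progress.put("completedStages", "2");
        progress.put("totalStages", "7");
        System.out.println(new ProgressViewRegistry().render(progress));
    }
}
```

The fallback branch shows the cost mentioned above: every new engine needs a 
matching view and key contract on the client before it can display anything.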


Any preferences / suggestions?


was (Author: anishek):
Give:
Currently there is tez and spark execution engines which have the ability to 
show the progress bar. The Progress bar information is almost similar with few 
label differences in both the representations.

There are two options of implementing this:
common changes for both approaches: Have a interface in hive-exec which returns 
this generic data-structure which is implemented by various execution engines 
currently tez/spark. Additional api on the ThriftCliService to get the data 
structure based on the execution engine.

First:

Have a generalized view printer on the beeline side which has a well defined 
format which can be derived from the existing representation @   
https://issues.apache.org/jira/secure/attachment/12678767/ux-demo.gif  
Create a data-structure tied closely to the above view, to hold the relevant 
information such that the view printer above can use it to fill in the required 
details.


Pros:
We dont need to know on the client side what execution engine the server was 
using the process the query as the representations are same for both.
Faster development time since currently we are going to assume the display 
Format on the beeline side and hence this will have ripple effect on the design 
of API's on server side as well.
 
Cons:
We will be only able to represent the progress bar for any future execution 
engines in the  same format as for tez/Spark, which might / might not fit the 
needs of other engines?

Second:
Have a basic map of key value pairs as the serialized data structure from 
server to beeline
Have specific view implementations based on the execution engine ( which will 
be sent in the above map )  on the beeline side. The server / beeline have to 
have a common understanding of what various keys mean in the map per execution 
engine.

Pros:
Allows us a great deal of flexibility as to how the progress view has to be 
implemented on the beeline side for any execution engine.

Cons:
Longer development time.
May be over engineering for this since we dont get support for a new Execution 
engine everyday.


Any preferences / suggestions ?

> Progress Bar on Beeline client
> --
>
> Key: HIVE-15473
> URL: https://issues.apache.org/jira/browse/HIVE-15473
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline, HiveServer2
>Affects Versions: 2.1.1
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Attachments: screen_shot_beeline.jpg
>
>
> Hive Cli allows