[ 
https://issues.apache.org/jira/browse/IMPALA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102619#comment-17102619
 ] 

Sahil Takiar commented on IMPALA-9370:
--------------------------------------

[~tarmstrong] that's a fair comment. I need to think about this some more.

One follow-up from IMPALA-9199 is that we might want to reconsider which 
package each of the files is in. Right now the flow is: impala-server 
(src/service) -> query-driver (src/runtime) -> client-request-state 
(src/service) -> coordinator (src/runtime)

This seems confusing, since the flow alternates between src/service and 
src/runtime.

> Re-factor ImpalaServer, ClientRequestState, Coordinator protocol
> ----------------------------------------------------------------
>
>                 Key: IMPALA-9370
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9370
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> All of these classes need to be updated to support transparent query retries, 
> and each one could do with some re-factoring so that query retries don't 
> make this code even more complex. For now, I'm going to list out some ideas / 
> suggestions:
>  * Rename ImpalaServer to ImpalaService. I think ImpalaServer is a bit of a 
> misnomer because Impala isn't implementing its own server (it uses Thrift for 
> that); instead it is providing a "service" to end users. This name is 
> consistent with Thrift "service"s as well
>  * Split up ClientRequestState - I'm not sure I fully understand what 
> ClientRequestState is supposed to encapsulate - perhaps originally it captured 
> the state of the actual client request as well as some helper code, but it 
> seems to have evolved over time; it doesn't really look like a purely 
> "stateful" object any more (e.g. it manages admission control submission)
> One possible end state could be:
> ImpalaService <–> QueryDriver (has a ClientRequestState that is not exposed 
> externally) <–> QueryInstance <–> Coordinator
> The QueryDriver is responsible for E2E execution of a query, including all 
> stages such as parsing / planning of a query, submission to admission 
> control, and backend execution. A QueryInstance is a single instance of a 
> query; this is necessary for query retry support since a single query can be 
> run multiple times. The Coordinator remains mostly the same - it is purely 
> responsible for *backend* coordination / execution of a query.
> This provides an opportunity to move a lot of the execution specific logic 
> out of ImpalaServer and into QueryDriver. Currently, ImpalaServer is 
> responsible for submitting the query to the fe/ and then passing the result 
> to the ClientRequestState which submits it for admission control (and 
> eventually the Coordinator for execution).
> QueryDriver encapsulates the E2E execution of a query (starting from a query 
> string, and then returning the results of a query) (inspired by Hive's 
> IDriver interface - 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/IDriver.java]).
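
To make the proposed layering concrete, here is a minimal C++ sketch of the QueryDriver / QueryInstance / Coordinator split described above. It is only an illustration under stated assumptions, not Impala's actual code: every method name (Execute, Run, Exec, Plan, num_attempts), the max_attempts parameter, and the callback-based Coordinator are hypothetical, planning is reduced to a string, and admission control is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for backend coordination; not Impala's real Coordinator.
class Coordinator {
 public:
  // Callback that "executes" a plan and reports success for a given attempt.
  using ExecFn = std::function<bool(const std::string& plan, int attempt)>;
  explicit Coordinator(ExecFn fn) : exec_fn_(std::move(fn)) {}
  bool Exec(const std::string& plan, int attempt) const {
    return exec_fn_(plan, attempt);
  }

 private:
  ExecFn exec_fn_;
};

// One run of a query. A retried query produces several of these.
class QueryInstance {
 public:
  QueryInstance(int attempt, const Coordinator* coord)
      : attempt_(attempt), coord_(coord) {}
  bool Run(const std::string& plan) {
    succeeded_ = coord_->Exec(plan, attempt_);
    return succeeded_;
  }
  bool succeeded() const { return succeeded_; }

 private:
  int attempt_;
  const Coordinator* coord_;
  bool succeeded_ = false;
};

// Owns end-to-end execution of one query string, including retries.
class QueryDriver {
 public:
  QueryDriver(std::string sql, const Coordinator* coord, int max_attempts)
      : coord_(coord), max_attempts_(max_attempts) {
    assert(coord != nullptr && max_attempts > 0);
    state_.sql = std::move(sql);
  }

  // Plans once, then creates a new QueryInstance per attempt until one succeeds
  // or the (assumed) attempt limit is reached. Admission control is omitted.
  bool Execute() {
    const std::string plan = Plan(state_.sql);
    for (int attempt = 1; attempt <= max_attempts_; ++attempt) {
      instances_.emplace_back(attempt, coord_);
      if (instances_.back().Run(plan)) {
        state_.succeeded = true;
        break;
      }
    }
    return state_.succeeded;
  }

  std::size_t num_attempts() const { return instances_.size(); }

 private:
  // Kept private, mirroring "has a ClientRequestState that is not exposed".
  struct ClientRequestState {
    std::string sql;
    bool succeeded = false;
  };

  // Placeholder for parsing / planning; the real work happens in the fe/.
  static std::string Plan(const std::string& sql) { return "PLAN(" + sql + ")"; }

  ClientRequestState state_;
  const Coordinator* coord_;
  int max_attempts_;
  std::vector<QueryInstance> instances_;
};
```

In this sketch the service layer would only construct a QueryDriver and call Execute(); the retry loop and per-attempt state live entirely below it, which is the separation the description argues for.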



