[jira] [Commented] (HIVE-7926) long-lived daemons for query fragment execution, I/O and caching

Sergey Shelukhin (JIRA) Mon, 08 Sep 2014 16:14:34 -0700

    [ https://issues.apache.org/jira/browse/HIVE-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126274#comment-14126274 ]


Sergey Shelukhin commented on HIVE-7926:
----------------------------------------

This is not well-defined at this point. Initially it may be just scans and 
projections; filters, partial aggregates, etc. can be added gradually. The APIs 
will be similar to the current task interaction in Hive, but request-response 
instead of a DAG edge: a request for data goes in (with a gradually expanding 
set of features, as described above), and data is streamed out. Data can be 
streamed to a task running in a YARN container (for example, for a shuffle 
join, order by, etc.). In some cases (e.g. aggregates, limit, etc.), perhaps 
data can be streamed directly to query orchestration (AM) or even to the 
client, to avoid the overhead of an extra task.
A simple (although maybe imprecise) way to think about it is that we can 
replace the current map tasks with this.
Precise decisions about specific things will probably be made based on 
stats/heuristics, similar to the current decision to do a map join, for 
example.
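
To make the request-response idea above concrete, here is a rough sketch in 
Python. It is not the proposed API, which is not defined yet. All names 
(FragmentRequest, Daemon, choose_sink) and the row-count threshold are 
hypothetical and exist only for illustration. The sketch shows a request that 
asks for a scan with a projection and an optional filter, a daemon that 
streams rows back, and a toy heuristic for whether to stream to a task or 
directly to the AM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]


@dataclass
class FragmentRequest:
    """Hypothetical request: which data the caller wants from the daemon."""
    table: str
    columns: List[str]                                  # projection
    predicate: Optional[Callable[[Row], bool]] = None   # filter (a later feature)


class Daemon:
    """Stand-in for a long-lived daemon; tables are plain in-memory lists."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def execute(self, request: FragmentRequest) -> Iterator[Row]:
        """Scan the table, apply the optional filter, project, stream rows out."""
        for row in self._tables[request.table]:
            if request.predicate is not None and not request.predicate(row):
                continue
            yield {col: row[col] for col in request.columns}


def choose_sink(estimated_rows: int, threshold: int = 1000) -> str:
    """Toy stats-based decision (the threshold is an assumption).

    Small results (e.g. after a limit or an aggregate) go straight to the AM;
    larger ones go to a task in a YARN container.
    """
    return "am" if estimated_rows <= threshold else "task"


if __name__ == "__main__":
    daemon = Daemon({"t": [{"a": 1, "b": 2}, {"a": 3, "b": 4}]})
    request = FragmentRequest("t", ["a"], lambda r: r["b"] > 2)
    for row in daemon.execute(request):
        print(choose_sink(estimated_rows=1), row)
```

The generator keeps the "streaming data out" part explicit: the consumer, 
whether a task or the AM, pulls rows as they are produced rather than waiting 
for a materialized result.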

> long-lived daemons for query fragment execution, I/O and caching
> ----------------------------------------------------------------
>
>                 Key: HIVE-7926
>                 URL: https://issues.apache.org/jira/browse/HIVE-7926
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: LLAPdesigndocument.pdf
>
>
> We are proposing a new execution model for Hive that is a combination of 
> existing process-based tasks and long-lived daemons running on worker nodes. 
> These nodes can take care of efficient I/O, caching and query fragment 
> execution, while heavy lifting like most joins, ordering, etc. can be handled 
> by tasks.
> The proposed model is not a 2-system solution for small and large queries; 
> nor is it a separate execution engine like MR or Tez. It can be used by 
> any Hive execution engine, if support is added; in the future, even external 
> products (e.g. Pig) can use it.
> A document with the high-level design we are proposing will be attached 
> shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
