[ 
https://issues.apache.org/jira/browse/HIVE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-13770:
------------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: HIVE-12427

> Improve Thrift result set streaming when serializing thrift ResultSets in 
> tasks
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-13770
>                 URL: https://issues.apache.org/jira/browse/HIVE-13770
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Holman Lan
>
> When serializing the Thrift result set in the final task, i.e. when the 
> hive.server2.thrift.resultset.serialize.in.tasks property is set to true, HS2 
> does not start sending the results until the entire result set has been 
> written to HDFS.
> This is inefficient, and we should find a way for HS2 to start sending the 
> results as soon as a block of results becomes available. The advantage of 
> this is twofold. One, the client can start consuming the results much 
> sooner. Two, we can start reclaiming the storage space in HDFS used by a 
> particular result set block as soon as that block has been 
> successfully sent to the client.
> It's worth checking whether this is also the case when not serializing the 
> Thrift result set in the final task.
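
The streaming behaviour proposed in the description can be illustrated with a small producer/consumer sketch. This is not Hive code and does not reflect how HS2 is implemented. The names write_blocks, stream_blocks and run, the in-memory dict standing in for per-block files in HDFS, and the list standing in for the client are all hypothetical, chosen only to show the idea of sending each block as soon as it is available and reclaiming its storage after a successful send.

```python
import queue
import threading

SENTINEL = None  # marks the end of the result set


def write_blocks(rows, block_size, blocks, storage):
    """Producer (stands in for the final task): publish blocks as they fill."""
    for block_id, start in enumerate(range(0, len(rows), block_size)):
        # Hypothetical stand-in for writing one result set block to HDFS.
        storage[block_id] = rows[start:start + block_size]
        # Announce the block immediately instead of waiting for the full set.
        blocks.put(block_id)
    blocks.put(SENTINEL)


def stream_blocks(blocks, storage, send):
    """Consumer (stands in for HS2): send each block, then reclaim it."""
    while True:
        block_id = blocks.get()
        if block_id is SENTINEL:
            break
        send(storage[block_id])
        # Reclaim space only after the send returned without raising.
        del storage[block_id]


def run(rows, block_size):
    blocks = queue.Queue()
    storage = {}   # hypothetical stand-in for per-block files in HDFS
    received = []  # hypothetical stand-in for the client
    producer = threading.Thread(
        target=write_blocks, args=(rows, block_size, blocks, storage)
    )
    producer.start()
    stream_blocks(blocks, storage, received.extend)
    producer.join()
    return received, storage
```

Under these assumptions, run(list(range(10)), 3) delivers the rows to the client in order and leaves no blocks behind in storage, because each block is deleted once it has been sent.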



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
