[
https://issues.apache.org/jira/browse/HIVE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290014#comment-15290014
]
Vaibhav Gumashta commented on HIVE-13770:
-----------------------------------------
cc [~thejas] [~gopalv]
> Improve Thrift result set streaming when serializing thrift ResultSets in
> tasks
> -------------------------------------------------------------------------------
>
> Key: HIVE-13770
> URL: https://issues.apache.org/jira/browse/HIVE-13770
> Project: Hive
> Issue Type: Sub-task
> Reporter: Holman Lan
>
> When the Thrift result set is serialized in the final task (i.e. when the
> hive.server2.thrift.resultset.serialize.in.tasks property is set to true), HS2
> does not start sending results until the entire result set has been
> written to HDFS.
> This is inefficient, and we should find a way for HS2 to start sending
> results as soon as a block of results becomes available. The advantage of
> this is twofold. First, the client can start consuming results much
> sooner. Second, we can start reclaiming the HDFS storage space used by a
> particular result set block as soon as that block has been
> successfully sent to the client.
> It's worth checking whether this is also the case when the Thrift result set
> is not serialized in the final task.
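
For illustration only, the block-at-a-time behaviour requested above can be sketched as a producer/consumer pair. This is a minimal sketch under stated assumptions, not Hive code: write_blocks, stream_blocks, and run_demo are hypothetical names, an in-memory dict stands in for the result files on HDFS, a queue stands in for whatever mechanism would tell HS2 that a block is ready, and the send callback stands in for the Thrift response to the client.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the result set


def write_blocks(rows, block_size, store, ready):
    """Stand-in for the final task: write one block at a time and announce it."""
    for block_id, start in enumerate(range(0, len(rows), block_size)):
        store[block_id] = rows[start:start + block_size]  # "written to HDFS"
        ready.put(block_id)  # announce the block as soon as it is complete
    ready.put(_DONE)


def stream_blocks(store, ready, send):
    """Stand-in for HS2: send each block as it appears, then reclaim its space."""
    sent = 0
    while True:
        block_id = ready.get()  # waits until the next block is available
        if block_id is _DONE:
            return sent
        send(store[block_id])
        del store[block_id]  # reclaim storage only after the send returned
        sent += 1


def run_demo(rows, block_size):
    """Run writer and streamer concurrently; return what the client received."""
    store, ready, received = {}, queue.Queue(), []
    writer = threading.Thread(
        target=write_blocks, args=(rows, block_size, store, ready)
    )
    writer.start()
    sent = stream_blocks(store, ready, received.append)
    writer.join()
    return sent, received, store
```

Calling run_demo(list(range(10)), 4) delivers the rows in three blocks and leaves the stand-in store empty. The point of the sketch is the ordering: a block is sent as soon as it is announced, and its storage is released only after the send has returned.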