[
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872755#comment-15872755
]
Vaibhav Gumashta commented on HIVE-14901:
-----------------------------------------
[~norrisl] Thanks for the patch. My feedback:
# Address the case when the thrift serde is not used (basically, have an upper
bound check in the fetch call). This is to prevent OOMs on the server if a user
passes an abnormally large value; see the first sketch after this list.
# For passing the user value to the thrift serde, check
HiveSessionImpl#configureSession (second sketch after this list).
# Small nit: rename HIVE_SERVER2_RESULTSET_DEFAULT_FETCH_SIZE/
hive.server2.resultset.default.fetch.size to be consistent with the other
params. I think this won't be backward incompatible, as it hasn't gone into any
release yet (but a quick check on the released 2.1 line would be great).
# Log the MAX_BUFFERED_ROWS value in the serde at INFO level (also shown in the
first sketch below).
# You might want to think about whether these changes need additional tests,
and add them if they do.
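For items 1 and 4, a minimal sketch of the kind of clamp and logging I mean,
assuming illustrative names rather than the actual HS2/serde code:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: clamp the client-requested fetch size to the configured
// server-side maximum so an abnormally large value cannot OOM the server, and
// log the effective value at INFO. All names here are hypothetical.
public final class FetchSizeBounds {
  private static final Logger LOG = LoggerFactory.getLogger(FetchSizeBounds.class);

  private FetchSizeBounds() {}

  /**
   * Returns the effective fetch size: the client's requested value bounded
   * above by the server maximum; non-positive requests fall back to the
   * server default.
   */
  public static int effectiveFetchSize(int requested, int serverDefault, int serverMax) {
    int effective = requested > 0 ? Math.min(requested, serverMax) : serverDefault;
    LOG.info("Using MAX_BUFFERED_ROWS = {}", effective);
    return effective;
  }
}
{code}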
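For item 2, roughly what I have in mind on the session side; the conf key and
surrounding code are assumptions, not the actual HiveSessionImpl#configureSession
body:
{code:java}
import java.util.Map;

// Hypothetical sketch of resolving a per-session fetch-size override from the
// session conf map (the kind of map HiveSessionImpl#configureSession
// processes). The key name is an assumption based on the param in item 3.
class SessionFetchSize {
  static final String KEY = "hive.server2.resultset.default.fetch.size";

  static int resolve(Map<String, String> sessionConf, int serverDefault) {
    String value = sessionConf.get(KEY);
    if (value == null) {
      return serverDefault;
    }
    try {
      int parsed = Integer.parseInt(value.trim());
      return parsed > 0 ? parsed : serverDefault;
    } catch (NumberFormatException e) {
      return serverDefault; // ignore malformed overrides
    }
  }
}
{code}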
> HiveServer2: Use user supplied fetch size to determine #rows serialized in
> tasks
> --------------------------------------------------------------------------------
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, JDBC, ODBC
> Affects Versions: 2.1.0
> Reporter: Vaibhav Gumashta
> Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch,
> HIVE-14901.3.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide
> the max number of rows that we write in tasks. However, we should ideally use
> the user-supplied value (which can be extracted from the request parameter of
> ThriftCLIService.FetchResults) to decide how many rows to serialize into a
> blob in the tasks. We should, however, still use
> {{hive.server2.thrift.resultset.max.fetch.size}} as an upper bound on it, so
> that we don't go OOM in tasks and HS2.
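For context, a minimal sketch of the per-task buffering the description refers
to, with hypothetical names: the task accumulates rows and emits one serialized
blob per effective fetch size, where the effective size is the user value
capped by {{hive.server2.thrift.resultset.max.fetch.size}}.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of per-task row buffering: flush a serialized blob each
// time the buffer reaches the effective fetch size. Not the actual task code.
class RowBlobBuffer {
  private final int effectiveFetchSize; // min(user value, configured max)
  private final List<Object> rows = new ArrayList<>();

  RowBlobBuffer(int userFetchSize, int configuredMax) {
    this.effectiveFetchSize = Math.min(Math.max(userFetchSize, 1), configuredMax);
  }

  void add(Object row) {
    rows.add(row);
    if (rows.size() >= effectiveFetchSize) {
      flush();
    }
  }

  void flush() {
    if (rows.isEmpty()) {
      return;
    }
    // serialize `rows` into a single blob here (details elided)
    rows.clear();
  }
}
{code}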