[
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872755#comment-15872755
]
Vaibhav Gumashta commented on HIVE-14901:
-----------------------------------------
[~norrisl] Thanks for the patch. My feedback:
# Address the case when the thrift serde is not used (basically, have an upper
bound check in the fetch call). This is to prevent OOMs on the server if a user
passes an abnormally large value; see the first sketch after this list.
# For passing the user value to the thrift serde, check
HiveSessionImpl#configureSession (second sketch after this list).
# Small nit: rename HIVE_SERVER2_RESULTSET_DEFAULT_FETCH_SIZE/
hive.server2.resultset.default.fetch.size to be consistent with the other
params. I think this won't be backward incompatible, as it hasn't gone into any
release yet (but a quick check on the released 2.1 line would be great).
# Log the MAX_BUFFERED_ROWS value in the serde at INFO level (also shown in the
first sketch below).
# You might want to think about whether these changes need additional tests,
and add them if they do.
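For items 1 and 4, a minimal sketch of the kind of clamp and logging I mean,
assuming illustrative names rather than the actual HS2/serde code:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: clamp the client-requested fetch size to the configured
// server-side maximum so an abnormally large value cannot OOM the server, and
// log the effective value at INFO. All names here are hypothetical.
public final class FetchSizeBounds {
  private static final Logger LOG = LoggerFactory.getLogger(FetchSizeBounds.class);

  private FetchSizeBounds() {}

  /**
   * Returns the effective fetch size: the client's requested value bounded
   * above by the server maximum; non-positive requests fall back to the
   * server default.
   */
  public static int effectiveFetchSize(int requested, int serverDefault, int serverMax) {
    int effective = requested > 0 ? Math.min(requested, serverMax) : serverDefault;
    LOG.info("Using MAX_BUFFERED_ROWS = {}", effective);
    return effective;
  }
}
{code}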
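For item 2, roughly what I have in mind on the session side; the conf key and
surrounding code are assumptions, not the actual HiveSessionImpl#configureSession
body:
{code:java}
import java.util.Map;

// Hypothetical sketch of resolving a per-session fetch-size override from the
// session conf map (the kind of map HiveSessionImpl#configureSession
// processes). The key name is an assumption based on the param in item 3.
class SessionFetchSize {
  static final String KEY = "hive.server2.resultset.default.fetch.size";

  static int resolve(Map<String, String> sessionConf, int serverDefault) {
    String value = sessionConf.get(KEY);
    if (value == null) {
      return serverDefault;
    }
    try {
      int parsed = Integer.parseInt(value.trim());
      return parsed > 0 ? parsed : serverDefault;
    } catch (NumberFormatException e) {
      return serverDefault; // ignore malformed overrides
    }
  }
}
{code}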
> HiveServer2: Use user supplied fetch size to determine #rows serialized in
> tasks
> --------------------------------------------------------------------------------
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, JDBC, ODBC
> Affects Versions: 2.1.0
> Reporter: Vaibhav Gumashta
> Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch,
> HIVE-14901.3.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide
> the max number of rows that we write in tasks. However, we should ideally use
> the user-supplied value (which can be extracted from the request parameter of
> ThriftCLIService.FetchResults) to decide how many rows to serialize into a
> blob in the tasks. We should, however, still use
> {{hive.server2.thrift.resultset.max.fetch.size}} as an upper bound on it, so
> that we don't go OOM in tasks and HS2.
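For context, a minimal sketch of the per-task buffering the description refers
to, with hypothetical names: the task accumulates rows and emits one serialized
blob per effective fetch size, where the effective size is the user value
capped by {{hive.server2.thrift.resultset.max.fetch.size}}.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of per-task row buffering: flush a serialized blob each
// time the buffer reaches the effective fetch size. Not the actual task code.
class RowBlobBuffer {
  private final int effectiveFetchSize; // min(user value, configured max)
  private final List<Object> rows = new ArrayList<>();

  RowBlobBuffer(int userFetchSize, int configuredMax) {
    this.effectiveFetchSize = Math.min(Math.max(userFetchSize, 1), configuredMax);
  }

  void add(Object row) {
    rows.add(row);
    if (rows.size() >= effectiveFetchSize) {
      flush();
    }
  }

  void flush() {
    if (rows.isEmpty()) {
      return;
    }
    // serialize `rows` into a single blob here (details elided)
    rows.clear();
  }
}
{code}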