[jira] [Commented] (HIVE-14876) make the number of rows to fetch from various HS2 clients/servers configurable

Vaibhav Gumashta (JIRA) Thu, 06 Oct 2016 01:03:03 -0700

    [ https://issues.apache.org/jira/browse/HIVE-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551259#comment-15551259 ]


Vaibhav Gumashta commented on HIVE-14876:
-----------------------------------------

The following describes how the fetch RPC flows from the JDBC driver to HS2, 
and clears up the confusion over 
{{hive.server2.thrift.resultset.max.fetch.size}}.

When we create a new connection, we use a default fetch size unless the end 
user specifies one in the connection string. In {{HiveConnection}}:
{code}
private int fetchSize = HiveStatement.DEFAULT_FETCH_SIZE;
{code}

If, however, a user specifies the fetch size in the connection string, for 
example {{jdbc:hive2://localhost:10000/default;fetchSize=10000}}, we override 
the default with the user-supplied value. In {{HiveConnection}}:
{code}
    if (sessConfMap.containsKey(JdbcConnectionParams.FETCH_SIZE)) {
      fetchSize = Integer.parseInt(sessConfMap.get(JdbcConnectionParams.FETCH_SIZE));
    }
{code}
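
For illustration, the default-versus-override behaviour can be sketched as a standalone program. This is not the driver's code: the class {{FetchSizeSketch}}, its helper methods, and the placeholder default of 1000 are assumptions made for the example (the real default is whatever {{HiveStatement.DEFAULT_FETCH_SIZE}} is set to), and the URL parsing is deliberately simplified to the {{;key=value}} session variables that follow the database name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FetchSizeSketch {
    // Placeholder default, assumed for this example only; the driver uses
    // HiveStatement.DEFAULT_FETCH_SIZE.
    static final int ASSUMED_DEFAULT_FETCH_SIZE = 1000;

    // Collects the ";key=value" session variables from a JDBC URL.
    // Simplified: anything after '?' or '#' is ignored.
    static Map<String, String> parseSessionVars(String url) {
        int end = url.length();
        int q = url.indexOf('?');
        if (q >= 0) end = Math.min(end, q);
        int h = url.indexOf('#');
        if (h >= 0) end = Math.min(end, h);

        Map<String, String> vars = new LinkedHashMap<>();
        String[] parts = url.substring(0, end).split(";");
        // parts[0] is "jdbc:hive2://host:port/db"; the rest are key=value pairs.
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                vars.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
            }
        }
        return vars;
    }

    // Returns the user-supplied fetchSize if present, otherwise the default.
    static int resolveFetchSize(String url) {
        String value = parseSessionVars(url).get("fetchSize");
        return value != null ? Integer.parseInt(value) : ASSUMED_DEFAULT_FETCH_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(resolveFetchSize(
            "jdbc:hive2://localhost:10000/default;fetchSize=10000"));
        System.out.println(resolveFetchSize(
            "jdbc:hive2://localhost:10000/default"));
    }
}
```

With the first URL the sketch resolves the fetch size to 10000; with the second it falls back to the placeholder default.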

When we run {{HiveStatement.execute}}, we pass the fetch size to the result 
set that it builds. In {{HiveStatement.execute}}:
{code}
    resultSet = new HiveQueryResultSet.Builder(this).setClient(client)
        .setSessionHandle(sessHandle).setStmtHandle(stmtHandle)
        .setMaxRows(maxRows).setFetchSize(fetchSize)
        .setScrollable(isScrollableResultset)
        .build();
{code}

Finally, when we issue a fetch RPC request, we send this value as part of the 
request. In {{HiveQueryResultSet.next}}:
{code}
TFetchResultsReq fetchReq = new TFetchResultsReq(stmtHandle,
    orientation, fetchSize);
{code}

On the server side, the fetch request hits {{ThriftCLIService.FetchResults}}:
{code}
RowSet rowSet = cliService.fetchResults(
          new OperationHandle(req.getOperationHandle()),
          FetchOrientation.getFetchOrientation(req.getOrientation()),
          req.getMaxRows(),
          FetchType.getFetchType(req.getFetchType()));
{code}
The request eventually reaches {{SQLOperation.getNextRowSet}}, which receives 
the fetch size specified in the RPC ({{req.getMaxRows()}} above) as its 
parameter.
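
To show why the fetch size matters, here is a minimal sketch of the round trip described above, with an in-memory stand-in for the server. {{BatchedFetchSketch}}, {{FakeServer}}, {{makeRows}} and {{fetchAll}} are hypothetical names invented for the example, not Hive classes; in this sketch the client simply stops when a request returns no rows, which is an assumption and not necessarily how the driver detects the end of the result set.

```java
import java.util.ArrayList;
import java.util.List;

class BatchedFetchSketch {
    // Stand-in for the server side: each call returns at most maxRows rows,
    // continuing from where the previous call stopped.
    static final class FakeServer {
        private final List<String> rows;
        private int cursor = 0;
        int requestsServed = 0;

        FakeServer(List<String> rows) {
            this.rows = rows;
        }

        List<String> fetchResults(int maxRows) {
            requestsServed++;
            int end = Math.min(cursor + maxRows, rows.size());
            List<String> batch = new ArrayList<>(rows.subList(cursor, end));
            cursor = end;
            return batch;
        }
    }

    // Builds n dummy rows for the demonstration.
    static List<String> makeRows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add("row-" + i);
        }
        return rows;
    }

    // Client side: keeps asking for fetchSize rows until a batch comes back empty.
    static List<String> fetchAll(FakeServer server, int fetchSize) {
        List<String> all = new ArrayList<>();
        List<String> batch;
        do {
            batch = server.fetchResults(fetchSize);
            all.addAll(batch);
        } while (!batch.isEmpty());
        return all;
    }

    public static void main(String[] args) {
        for (int fetchSize : new int[] {1000, 10000}) {
            FakeServer server = new FakeServer(makeRows(2500));
            int fetched = fetchAll(server, fetchSize).size();
            System.out.println("fetchSize=" + fetchSize + " rows=" + fetched
                + " requests=" + server.requestsServed);
        }
    }
}
```

In this sketch, 2,500 rows take four requests at a fetch size of 1000 (three carrying rows plus one empty) and two requests at a fetch size of 10000, which is the kind of saving a larger driver-side fetch size is meant to buy.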

Apologies for the confusion regarding 
{{hive.server2.thrift.resultset.max.fetch.size}}: it is used only when 
ThriftJDBCSerde writes result sets in tasks, where it determines how many rows 
are serialized into each blob. I have created HIVE-14901 to resolve the 
confusion and will have a patch out soon. Meanwhile, to increase the default 
fetch size for the code path that doesn't use ThriftJDBCSerde, we should bump 
the value of {{HiveStatement.DEFAULT_FETCH_SIZE}} on the driver side.

cc [~ziyangz]: you might want to follow the discussion here.

> make the number of rows to fetch from various HS2 clients/servers configurable
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-14876
>                 URL: https://issues.apache.org/jira/browse/HIVE-14876
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14876.patch
>
>
> Right now, it's hardcoded to a variety of values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
