[
https://issues.apache.org/jira/browse/HIVE-23526?focusedWorklogId=441269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-441269
]
ASF GitHub Bot logged work on HIVE-23526:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Jun/20 13:13
Start Date: 04/Jun/20 13:13
Worklog Time Spent: 10m
Work Description: belugabehr edited a comment on pull request #1029:
URL: https://github.com/apache/hive/pull/1029#issuecomment-638838456
@dengzhhu653 By default (Hive 2.3 and earlier), beeline buffers all of the
results before displaying them to the user.
With a query like `select * from a limit 500000`, if there are that many
rows in the table, beeline has to buffer all 500,000 rows and fit them into
512MB of memory (alongside everything else beeline stores). And let's be
honest, no human is going to look through that many rows manually.
You're better off disabling buffering and streaming the results out to a
CSV file for further processing, with something like:
`beeline -u "jdbc:hive2://..." -e "select * from a limit 500000" --outputformat=csv2 --incremental=true > results.csv`
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=82903124#HiveServer2Clients-Separated-ValueOutputFormats
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions
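To make the difference between buffered and incremental output concrete, here is a minimal Java sketch. It is not Beeline's code: `RowOutputSketch`, `writeBuffered`, `writeIncremental`, and `fakeRows` are hypothetical names, and the iterator only stands in for a JDBC `ResultSet`. The point is that the buffered variant holds every row in memory before writing, while the incremental one holds a single row at a time.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.IntStream;

/** Illustrative only: contrasts buffered and incremental output; not Beeline's actual code. */
final class RowOutputSketch {

    /** Buffered: collects every row before writing, so memory grows with the row count. */
    static int writeBuffered(Iterator<String[]> rows, PrintWriter out) {
        List<String[]> buffer = new ArrayList<>();
        while (rows.hasNext()) {
            buffer.add(rows.next());
        }
        for (String[] row : buffer) {
            out.print(String.join(",", row));
            out.print('\n');
        }
        out.flush();
        return buffer.size();
    }

    /** Incremental: writes each row as it arrives and keeps only one row in memory. */
    static int writeIncremental(Iterator<String[]> rows, PrintWriter out) {
        int count = 0;
        while (rows.hasNext()) {
            out.print(String.join(",", rows.next()));
            out.print('\n');
            count++;
        }
        out.flush();
        return count;
    }

    /** Hypothetical row source standing in for a JDBC ResultSet. */
    static Iterator<String[]> fakeRows(int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> new String[] {Integer.toString(i), "name" + i})
                .iterator();
    }

    public static void main(String[] args) {
        StringWriter sink = new StringWriter();
        int written = writeIncremental(fakeRows(3), new PrintWriter(sink));
        System.out.print(sink);
        System.out.println("rows written: " + written);
    }
}
```

Both variants produce the same CSV text; only the peak memory use differs, which is why the incremental form is the safer choice for very large result sets.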
This is not a problem with Thrift. Maybe Thrift should better handle the
OOM Exception, but that will have to be addressed in that project, not Hive.
As long as the OOM Exception is being propagated up, and the Statement is
closed (which I believe it is) then Beeline is handling this appropriately.
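As a rough illustration of the pattern described here (the error propagates to the caller and the statement is still closed), here is a small Java sketch. It is an assumption-laden stand-in, not the Beeline or Hive JDBC code: `CloseOnErrorSketch`, `FakeStatement`, `fetchAll`, and `run` are hypothetical names, and the `OutOfMemoryError` is simulated.

```java
/** Illustrative only: a hypothetical stand-in for a JDBC Statement; not Beeline's code. */
final class CloseOnErrorSketch {

    static final class FakeStatement implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Simulates result handling that fails because the rows do not fit in memory. */
    static void fetchAll(FakeStatement stmt) {
        throw new OutOfMemoryError("simulated: result set too large to buffer");
    }

    /** Runs the fetch, lets try-with-resources close the statement, then reports the error. */
    static String run(FakeStatement stmt) {
        try {
            try (FakeStatement s = stmt) {
                fetchAll(s);
            }
            return "ok";
        } catch (OutOfMemoryError e) {
            // The statement was already closed when control reaches this handler.
            return "Error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        FakeStatement stmt = new FakeStatement();
        System.out.println(run(stmt));
        System.out.println("closed: " + stmt.closed);
    }
}
```

The sketch uses try-with-resources so that `close()` runs before the outer handler sees the error; whether the real method is structured this way is not something this example claims.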
I just visually traced the trunk code and it looks like the OOM is being
handled correctly. Details should be printed with verbose logging enabled. I
think this method is a bit spaghetti and needs some TLC, but it should be
working.
I am not in favor of this change.
Issue Time Tracking
-------------------
Worklog Id: (was: 441269)
Time Spent: 4h 40m (was: 4.5h)
> Beeline may throw the misleading exception
> ------------------------------------------
>
> Key: HIVE-23526
> URL: https://issues.apache.org/jira/browse/HIVE-23526
> Project: Hive
> Issue Type: Improvement
> Components: Beeline
> Environment: Hive 1.2.2
> Reporter: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
> Attachments: HIVE-23526.2.patch, HIVE-23526.3.patch,
> HIVE-23526.patch, outofsequence.log
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> Sometimes an 'out of sequence response' message appears in beeline, for
> example:
> Error: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response (state=08S01,code=0)
> java.sql.SQLException: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response
> at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:198)
> at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:217)
> at org.apache.hive.beeline.Commands.execute(Commands.java:891)
> at org.apache.hive.beeline.Commands.sql(Commands.java:713)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:976)
> at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:816)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:774)
> at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:487)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:470)
> and there is no other useful message to help figure it out, even with
> --verbose. This makes the problem puzzling, as beeline does not have a
> concurrency problem on the underlying Thrift transport.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)