belugabehr commented on pull request #1029:
URL: https://github.com/apache/hive/pull/1029#issuecomment-638838456


   @dengzhhu653 By default (before Hive 2.3), beeline buffers all of the 
results before displaying them to the user.
   
   With a query like `select * from a limit 500000`, if there are that many 
rows in the table, it will have to buffer 500,000 rows and fit them all into 
512MB of memory (plus all the other stuff beeline stores).  And let's be 
honest, no human is going to look through that many rows manually.  
   
   You're better off disabling buffering, and streaming out the results to a 
CSV for further processing:
   
   Something like:
   `beeline -u "jdbc:hive2://..." -e "select * from a limit 500000" 
--outputformat=csv2 --incremental=true > results.csv`
   
   
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=82903124#HiveServer2Clients-Separated-ValueOutputFormats
   
   
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions
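
   To make the memory difference concrete, here is a rough sketch of buffered 
versus incremental output. This is not Beeline's actual code: the class and 
method names are hypothetical, rows are plain string arrays, and no CSV quoting 
is done. It only illustrates that the buffered path holds every row in memory 
before writing anything, while the incremental path holds one row at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative only: hypothetical names, not Beeline's actual classes. */
public class IncrementalCsvSketch {

    /** Buffered mode: every row is collected in memory before any line is emitted. */
    public static long writeBuffered(Iterator<String[]> rows, Consumer<String> lineSink) {
        List<String[]> buffer = new ArrayList<>();
        while (rows.hasNext()) {
            buffer.add(rows.next()); // memory use grows with the size of the result set
        }
        for (String[] row : buffer) {
            lineSink.accept(String.join(",", row));
        }
        return buffer.size();
    }

    /** Incremental mode: each row is emitted as soon as it is fetched. */
    public static long writeIncremental(Iterator<String[]> rows, Consumer<String> lineSink) {
        long count = 0;
        while (rows.hasNext()) {
            lineSink.accept(String.join(",", rows.next())); // only the current row is held
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Tiny in-memory stand-in for a result set (an assumption for the demo).
        List<String[]> sample = List.of(new String[] {"1", "a"}, new String[] {"2", "b"});
        long written = writeIncremental(sample.iterator(), System.out::println);
        System.out.println(written + " rows written");
    }
}
```

   Both methods produce the same lines; the only difference is how many rows 
are alive on the heap at once, which is what matters for a 500,000 row result.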
   
   This is not a problem with Thrift.  Maybe Thrift should better handle the 
OOM Exception, but that will have to be addressed in that project, not Hive.  
As long as the OOM Exception is being propagated up, and the Statement is 
closed (which I believe it is) then Beeline is handling this appropriately.
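
   As a minimal illustration of what "propagated up, and the Statement is 
closed" means in Java terms, the sketch below uses a hypothetical 
`FakeStatement` in place of a real JDBC `Statement` and throws a simulated 
`OutOfMemoryError` by hand. It is not taken from the Beeline source; it only 
shows that try-with-resources closes the resource even when an `Error` passes 
through, and that the caller still sees the error.

```java
/** Illustrative only: FakeStatement is a hypothetical stand-in for a JDBC Statement. */
public class CloseOnErrorSketch {

    /** Records whether close() was called. */
    static class FakeStatement implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /**
     * Simulates a fetch that fails with an OutOfMemoryError.
     * Returns {errorPropagated, statementClosed}.
     */
    public static boolean[] run() {
        FakeStatement stmt = new FakeStatement();
        boolean propagated = false;
        try {
            try (FakeStatement s = stmt) {
                // Simulated failure; a real OOM would be raised by the JVM itself.
                throw new OutOfMemoryError("simulated: result set too large to buffer");
            }
        } catch (OutOfMemoryError e) {
            // The caller sees the error only after close() has already run.
            propagated = true;
        }
        return new boolean[] {propagated, stmt.closed};
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("propagated=" + result[0] + ", closed=" + result[1]);
    }
}
```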
   
   I just visually traced the trunk code and it looks like the OOM is being 
handled correctly. Details should be printed with verbose logging enabled.  I 
think this method is a bit spaghetti and needs some TLC, but it should be 
working.
   
   I am not in favor of this change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


