[jira] [Commented] (IMPALA-3336) psycopg2 as used by qgen prefetches rows, can consume too much memory

Michael Brown (JIRA) Fri, 24 May 2019 13:19:53 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16847870#comment-16847870
 ]


Michael Brown commented on IMPALA-3336:
---------------------------------------

I haven't looked at this in a long time, but if it helps anyone, I came across 
this today http://initd.org/psycopg/docs/faq.html#best-practices , which says:
bq. *When should I save and re-use a cursor as opposed to creating a new one as 
needed?*
bq. Cursors are lightweight objects and creating lots of them should not pose 
any kind of problem. But note that cursors used to fetch result sets will cache 
the data and use memory in proportion to the result set size. Our suggestion is 
to almost always create a new cursor and dispose old ones as soon as the data 
is not required anymore (call close() on them.) The only exception are tight 
loops where one usually use the same cursor for a whole bunch of INSERTs or 
UPDATEs. 
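I *believe* the RQG (random query generator) keeps its cursors persistent, which, if so, would mean their cached result sets are held rather than released.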
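
For anyone picking this up, here is a minimal sketch of the "new cursor per query, close it when done" pattern the FAQ recommends. It is not the RQG's code. The helper name fetch_in_batches and the batch_size default are made up for illustration, and sqlite3 stands in for psycopg2 so the snippet runs without a PostgreSQL server. With psycopg2 the same DB-API calls apply, but a default (client-side) cursor still transfers the whole result set on execute(), so to bound memory you would most likely also want a named, server-side cursor (conn.cursor(name=...)).

```python
import sqlite3
from contextlib import closing


def fetch_in_batches(conn, sql, params=(), batch_size=2000):
    """Yield rows from a short-lived cursor, batch_size rows at a time.

    Hypothetical helper: works with any DB-API 2.0 connection.
    """
    # closing() calls cursor.close() once the generator finishes or is
    # closed, so the cursor does not outlive the query that created it.
    with closing(conn.cursor()) as cur:
        cur.execute(sql, params)
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            for row in rows:
                yield row


if __name__ == "__main__":
    # sqlite3 is only a stand-in here; psycopg2 uses %s placeholders, not ?.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])
    total = sum(1 for _ in fetch_in_batches(conn, "SELECT x FROM t", batch_size=3))
    print("rows fetched:", total)
    conn.close()
```

The generator keeps at most one batch in Python at a time, and the cursor is closed as soon as the caller is done iterating. Whether that is enough to avoid the OOM kills described in this issue is untested.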

> psycopg2 as used by qgen prefetches rows, can consume too much memory
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-3336
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3336
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 2.6.0
>            Reporter: Michael Brown
>            Priority: Minor
>
> When I was experimenting with the query generator on EC2, I started having 
> problems with the runner process being killed via kernel oom_killer. In 
> checking in with Taras, he let me know this was happening to him as well. I 
> need to figure out why that is happening so that the query generator can run.
> {noformat}
> 16:07:51 16:07:52 Query execution thread 
> run_query_internal_i80swzzi4opv3q0p:hiveserver2[307]:Fetching up to 2000 
> result rows
> 16:07:52 16:07:52 Query execution thread 
> run_query_internal_i80swzzi4opv3q0p:hiveserver2[307]:Fetching up to 2000 
> result rows
> 16:07:52 16:07:52 Query execution thread 
> run_query_internal_i80swzzi4opv3q0p:hiveserver2[307]:Fetching up to 2000 
> result rows
> 16:08:28 /tmp/hudson351019564031822959.sh: line 11:  4588 Killed              
>     ./tests/comparison/leopard/job.py
> 16:08:28 Build step 'Execute shell' marked build as failure
> 16:08:31 Finished: FAILURE
> {noformat}



