[
https://issues.apache.org/jira/browse/NIFI-8605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448505#comment-17448505
]
Wang Qingming commented on NIFI-8605:
-------------------------------------
By default, the PostgreSQL and MySQL JDBC drivers read the entire result set
into memory at once and then iterate over the ResultSet object.
We can make the JDBC driver stream the results instead of buffering them all.
The pseudo code is as follows:
// PostgreSQL: streaming requires autoCommit=false plus a positive fetch size
if (isPostgresql()) {
    con.setAutoCommit(false);
    st.setFetchSize(10000);
}
// MySQL: Connector/J streams row by row only when the fetch size is Integer.MIN_VALUE
if (isMysql()) {
    st.setFetchSize(Integer.MIN_VALUE);
}
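The driver-specific logic above can be factored into a small helper. This is a minimal sketch, not the actual NiFi patch: `streamingFetchSize` is a hypothetical name, and detecting the driver by JDBC URL prefix is an assumption (NiFi could equally inspect the driver class or database product name).

```java
// Sketch: choose the Statement fetch size that makes each driver stream
// results instead of buffering the whole result set in memory.
public class FetchSizeHelper {

    // Hypothetical helper: returns the fetch size to apply for streaming,
    // or null when the driver is not recognized (caller keeps its default).
    static Integer streamingFetchSize(String jdbcUrl, int desiredFetchSize) {
        if (jdbcUrl.startsWith("jdbc:postgresql:")) {
            // PostgreSQL honors a positive fetch size, but only when the
            // connection also has autoCommit disabled (cursor-based fetching).
            return desiredFetchSize;
        }
        if (jdbcUrl.startsWith("jdbc:mysql:")) {
            // MySQL Connector/J streams row by row only for this sentinel value.
            return Integer.MIN_VALUE;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(streamingFetchSize("jdbc:postgresql://host/db", 1000));
        System.out.println(streamingFetchSize("jdbc:mysql://host/db", 1000));
    }
}
```

For PostgreSQL the caller must still invoke `con.setAutoCommit(false)` before executing the query, or the driver silently ignores the fetch size and buffers everything.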
> ExecuteSQLRecord processor consumes a large heap volume when used with the
> PostgreSQL JDBC driver
> --------------------------------------------------------------------------------------------
>
> Key: NIFI-8605
> URL: https://issues.apache.org/jira/browse/NIFI-8605
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.13.2
> Reporter: Vibhath Arunapriya Ileperuma
> Assignee: Vibhath Arunapriya Ileperuma
> Priority: Major
> Labels: beginner
> Attachments: GC.LOG
>
>
> I'm using the ExecuteSQLRecord processor to query PostgreSQL. A 'select'
> query I'm using can return more than 60 million rows. I have set the fetch
> size to 1000 to avoid fetching all the data into memory at once.
> But when the processor is started, the heap starts to grow very fast. I have
> configured NiFi with a 50GB heap, and even that amount is filled within
> minutes. When the heap is full, the garbage collector tries to clean it,
> blocking other threads.
> It seems NiFi loads all the data into memory even though the fetch size is
> set to 1000. I have attached NiFi's GC log to this ticket for reference.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)