[
https://issues.apache.org/jira/browse/NIFI-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027039#comment-17027039
]
Shawn Weeks commented on NIFI-7086:
-----------------------------------
Did you increase your fetch size in the DBCP Connection Pool? The default is
really low in the Oracle JDBC driver and makes fetches take forever. Try adding
a property to your connection pool called defaultRowPrefetch and set it to
1000. That should make a huge difference. I think newer versions of NiFi will
allow setting the fetch size directly. Once you've done that, you can look into
how to split the data into manageable chunks, because fetching 1 billion rows is
going to take forever no matter what you do. The solution I used is a
combination of a distributed map cache and date calculations to break the
fetches into 30-day chunks: the idea is to query the database for date ranges
and then use those ranges to fetch the data in parts. If you get on Slack and
reach out to me I can try and walk you through it; I'm usually on there.
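For context: defaultRowPrefetch is an Oracle JDBC driver connection property (the driver's default is only 10 rows per network round trip), and NiFi's DBCPConnectionPool passes dynamic properties through to the driver. A minimal sketch of what that property looks like at the plain JDBC level; the class name, user, and password are placeholders, and no actual connection is made here:

```java
import java.util.Properties;

public class PrefetchExample {
    // Build the connection properties a DBCPConnectionPool would pass to the
    // Oracle JDBC driver. Bumping defaultRowPrefetch from the driver default
    // of 10 to 1000 cuts the number of round trips per result set ~100x.
    public static Properties oracleProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("defaultRowPrefetch", "1000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = oracleProps("scott", "tiger");
        // These props would be handed to DriverManager.getConnection(url, props)
        System.out.println("defaultRowPrefetch = "
                + props.getProperty("defaultRowPrefetch"));
    }
}
```

In NiFi itself you don't write this code: you add defaultRowPrefetch as a dynamic property on the DBCPConnectionPool controller service and it is forwarded to the driver the same way.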
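The 30-day chunking described above comes down to simple date arithmetic: split the table's overall date range into fixed-size windows and fetch each window as its own query. The class and method names below are hypothetical (in NiFi the windows would typically become flowfile attributes feeding ExecuteSQL rather than Java code), but the windowing logic is the same:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class DateChunks {
    // Split [start, end) into consecutive windows of at most `days` days.
    // Each window [w0, w1) can be fetched independently, e.g. with
    //   SELECT ... WHERE ts >= :w0 AND ts < :w1
    public static List<LocalDate[]> chunks(LocalDate start, LocalDate end, int days) {
        List<LocalDate[]> out = new ArrayList<>();
        LocalDate cur = start;
        while (cur.isBefore(end)) {
            LocalDate next = cur.plusDays(days);
            if (next.isAfter(end)) {
                next = end; // clamp the last window to the overall end date
            }
            out.add(new LocalDate[] {cur, next});
            cur = next;
        }
        return out;
    }

    public static void main(String[] args) {
        for (LocalDate[] w : chunks(LocalDate.of(2020, 1, 1),
                                    LocalDate.of(2020, 3, 15), 30)) {
            System.out.println(w[0] + " .. " + w[1]);
        }
    }
}
```

The distributed map cache then only has to remember the last completed window, so the flow can resume from where it left off instead of re-fetching everything.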
> Oracle DB read is slow (for me it's a bug)
> ------------------------------------------
>
> Key: NIFI-7086
> URL: https://issues.apache.org/jira/browse/NIFI-7086
> Project: Apache NiFi
> Issue Type: Bug
> Environment: nifi 1.8.0
> Reporter: naveen kumar saharan
> Priority: Critical
>
> I am not able to fetch from an Oracle DB table of 1 billion records. It is
> taking too much time (17 hours).
> I tried creating queries based on dates using ExecuteSQL ->
> GenerateTableFetch -> ExecuteSQL for parallel execution.
> Small tables also perform slowly compared to a Python database table fetch
> program, around 20 times slower. This is very disappointing.
> QueryDatabaseTable runs only on the primary node; if I increase the thread
> count it gives duplicate data.
> Then what is the use of concurrent threads?
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)