[jira] [Commented] (NIFI-7086) oracle db read is slow (for me its bug)
[ https://issues.apache.org/jira/browse/NIFI-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027202#comment-17027202 ]

Andy LoPresto commented on NIFI-7086:
-------------------------------------

Hi Naveen,

I'm sorry to hear you are having performance issues with this processor. Please feel free to reach out to the us...@nifi.apache.org mailing list to discuss your configuration, hardware & platform specifications, etc. with a broad community of NiFi developers and users.

However, we have seen much higher performance for this processor in many scenarios, and this does not qualify as a critical issue. I am lowering the severity of this issue, and unless we can determine a code or configuration reason for this problem, I will close the issue in 7 days. Thank you.

> oracle db read is slow (for me its bug)
> ---------------------------------------
>
> Key: NIFI-7086
> URL: https://issues.apache.org/jira/browse/NIFI-7086
> Project: Apache NiFi
> Issue Type: Bug
> Environment: nifi 1.8.0
> Reporter: naveen kumar saharan
> Priority: Critical
>
> I am not able to fetch oracle db for 1 billion record table. It is taking too much time (17 hours).
> I tried creating queries based on dates using executesql -> generatetablefetch -> executesql to parallel execution
> small tables also performs slow as compared to python database table fetch program around 20 times slower. This is very disappointing.
> querydatabasetable runs only on primary node with , if i increase thread it give duplicate data.
> Then what is the use of concurrent thread?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (NIFI-7086) oracle db read is slow (for me its bug)
[ https://issues.apache.org/jira/browse/NIFI-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027039#comment-17027039 ]

Shawn Weeks commented on NIFI-7086:
-----------------------------------

Did you increase your fetch size in the DBCP Connection Pool? The default is really low in the Oracle JDBC driver and makes fetches take forever. Try adding a property called defaultRowPrefetch to your connection pool and set it to 1000. That should make a huge difference. I think in newer versions of NiFi we will allow setting the fetch size.

Once you've done that, you can look into how to split the data into manageable chunks, because fetching 1 billion rows is going to take forever no matter what you do. The solution I used is a combination of a distributed map cache and date calculations to break the fetches into 30-day chunks, the idea being that you query the database for the ranges and then use those ranges to fetch the data in parts. If you get on Slack and reach out to me, I can try to walk you through it; I'm usually on there.
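The two suggestions in Shawn's comment (a larger row prefetch and 30-day date ranges) can be illustrated with a small sketch. This is not the flow Shawn built and it does not touch NiFi or the distributed map cache; it only shows the arithmetic behind the fetch size and one possible way to cut a date span into 30-day windows. The table name `EVENTS`, the column `CREATED_AT`, and all function names are hypothetical, and the "10 rows per fetch" figure assumes the Oracle JDBC driver's documented default row prefetch.

```python
from datetime import date, timedelta


def round_trips(total_rows: int, rows_per_fetch: int) -> int:
    """Round trips needed to stream total_rows at a given fetch size (ceiling division)."""
    return (total_rows + rows_per_fetch - 1) // rows_per_fetch


def date_chunks(start: date, end: date, days: int = 30):
    """Yield half-open [lo, hi) date ranges that together cover [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + timedelta(days=days), end)
        yield lo, hi
        lo = hi


def chunk_query(table: str, column: str, lo: date, hi: date) -> str:
    """Build one range query. Table and column are hypothetical placeholders;
    a real flow would pass the bounds as bind parameters or attributes."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} >= DATE '{lo.isoformat()}' "
        f"AND {column} < DATE '{hi.isoformat()}'"
    )


if __name__ == "__main__":
    # 1 billion rows: assumed default prefetch of 10 vs. the suggested 1000.
    print(round_trips(1_000_000_000, 10))
    print(round_trips(1_000_000_000, 1000))

    # One query per 30-day window; each could be fetched independently.
    for lo, hi in date_chunks(date(2019, 1, 1), date(2019, 4, 1)):
        print(chunk_query("EVENTS", "CREATED_AT", lo, hi))
```

Half-open ranges are used so that a row falling exactly on a boundary is returned by one chunk only, which avoids the duplicate rows the reporter saw when running the same query on several threads.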