Re: How I put the cluster down.
Hi,

I'd suggest a couple of things. Have you configured backpressure controls on connections? NiFi 1.0.0 adds 10,000 objects/1 GB by default, IIRC. This can help avoid overwhelming components in a flow.

Next, a 2-core CPU is really inadequate for a high-throughput system; see if you can get something better. It seems there's a lot going on in your cluster. A full NiFi node with many flows does a lot of housekeeping in the background and needs some power.

Andrew
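As a rough illustration of the backpressure idea in the reply above: a connection stops accepting new flowfiles once either an object-count threshold or a data-size threshold is reached, so a fast producer cannot flood a slow consumer. This is only a toy model, not NiFi's implementation; the Connection class and its method names are hypothetical.

```python
from collections import deque


class Connection:
    """Toy connection queue with two backpressure thresholds (hypothetical names)."""

    def __init__(self, max_objects, max_bytes):
        self.max_objects = max_objects
        self.max_bytes = max_bytes
        self.items = deque()
        self.queued_bytes = 0

    def is_full(self):
        # Backpressure applies as soon as either threshold is reached.
        return (len(self.items) >= self.max_objects
                or self.queued_bytes >= self.max_bytes)

    def offer(self, payload):
        """Enqueue payload and return True, or return False when backpressure applies."""
        if self.is_full():
            return False
        self.items.append(payload)
        self.queued_bytes += len(payload)
        return True

    def poll(self):
        """Remove and return the oldest payload, releasing its share of the thresholds."""
        payload = self.items.popleft()
        self.queued_bytes -= len(payload)
        return payload


# A producer that generates 1 KB payloads as fast as it can, as in the
# GenerateFlowFile test described in this thread, is throttled by the queue.
conn = Connection(max_objects=3, max_bytes=10_000)
accepted = sum(conn.offer(b"x" * 1024) for _ in range(10))
```

With thresholds in place the producer simply gets refused once the queue is full, instead of piling up work that the rest of the flow has to carry.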
Re: nifi is running out of memory
Thanks James. I am looking into the permission issue and will update the thread. I will also make the changes as per your recommendation.
Re: nifi is running out of memory
From the screenshot and the error message, I interpret the sequence of events to be something like this:

1.) ListS3 succeeds and generates flowfiles with attributes referencing S3 objects, but no content (0 bytes).
2.) FetchS3Object fails to pull the S3 object content with an Access Denied error, but the failed flowfiles are routed on to PutS3Object (35,179 files / 0 bytes in the "putconnector" queue).
3.) PutS3Object is succeeding, writing the 0-byte content from ListS3.

I recommend a couple of things for FetchS3Object:

* Only allow the "success" relationship to continue to PutS3Object. Separate the "failure" relationship to either loop back to FetchS3Object or go to a LogAttribute processor, or other handling path.
* It looks like the permissions aren't working; you might want to double-check the access keys or try a sample file with the AWS CLI.

Thanks,

James

On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr wrote:

> This is how my nifi flow looks like.
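The sequence James describes can be modelled in a few lines. The sketch below is a simulation under stated assumptions, not NiFi code: FlowFile, fetch and route are hypothetical names, and a dictionary stands in for the objects the credentials are allowed to read. It shows why sending the "failure" relationship on to the put step produces 0-byte files, and how separating it avoids that.

```python
from dataclasses import dataclass


@dataclass
class FlowFile:
    name: str
    content: bytes = b""  # a listing step emits attributes only, no content


def fetch(flowfile, readable_objects):
    """Return (relationship, flowfile); 'failure' leaves the content empty."""
    if flowfile.name not in readable_objects:
        return "failure", flowfile  # e.g. Access Denied
    return "success", FlowFile(flowfile.name, readable_objects[flowfile.name])


def route(listed, readable_objects, failure_goes_to_put):
    """Split fetched flowfiles into those written by the put step and those set aside."""
    written, failed = [], []
    for flowfile in listed:
        relationship, out = fetch(flowfile, readable_objects)
        if relationship == "success" or failure_goes_to_put:
            written.append(out)
        else:
            failed.append(out)  # loop back, log, or handle separately
    return written, failed


listed = [FlowFile("a.gz"), FlowFile("b.gz")]
readable = {"a.gz": b"data"}  # b.gz is denied

# Misconfigured flow: failures continue to the put step as empty files.
written_bad, _ = route(listed, readable, failure_goes_to_put=True)
# Recommended flow: only successes are written, failures are kept apart.
written_ok, failed_ok = route(listed, readable, failure_goes_to_put=False)
```

In the misconfigured case the second file is written with empty content, which matches the 0-byte objects reported in the destination bucket.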
Re: nifi is running out of memory
Thanks Bryan, Joe, Adam and Pierre. I went past this issue by switching to 0.7.1. Now it is able to list the files from the buckets and create those files in another bucket. But the write is not happening and I am getting a permission issue (I have attached it below for reference). Could this be the settings of the buckets, or does it have more to do with the access key? All the files which are created in the new bucket are 0 bytes.

Thanks
Rai

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=x] Failed to retrieve S3 Object for StandardFlowFileRecord[uuid=y,claim=,offset=0,name=x.gz,size=0]; routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: xxx), S3 Extended Request ID: lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.aws.s3.FetchS3Object
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 0F34E71C0697B1D8)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
    at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Re: nifi is running out of memory
Quick remark: the fix has also been merged in master and will be in release 1.1.0.

Pierre

2016-10-28 15:22 GMT+02:00 Gop Krr:

> Thanks Adam. I will try 0.7.1 and update the community on the outcome. If it works then I can create a patch for 1.x
> Thanks
> Rai
>
> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar wrote:
>
>> Hey All,
>>
>> I believe OP is running into a bug fixed here:
>> https://issues.apache.org/jira/browse/NIFI-2631
>>
>> Basically, ListS3 attempts to commit all the files it finds (potentially 100k+) at once, rather than in batches. NIFI-2631 addresses the issue. Looks like the fix is out in 0.7.1 but not yet in a 1.x release.
>>
>> Cheers,
>> Adam
>>
>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt wrote:
>>> Looking at this line [1] makes me think the FetchS3 processor is properly streaming the bytes directly to the content repository.
>>>
>>> Looking at the screenshot showing nothing out of the ListS3 processor makes me think the bucket has so many things in it that the processor or associated library isn't handling that well and is just listing everything with no mechanism of max buffer size. Krish please try with the largest heap you can and let us know what you see.
>>>
>>> [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>>>
>>> On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt wrote:
>>>> moving dev to bcc
>>>>
>>>> Yes I believe the issue here is that FetchS3 doesn't do chunked transfers and so is loading all into memory. I've not verified this in the code yet but it seems quite likely. Krish if you can verify that going with a larger heap gets you in the game can you please file a JIRA.
>>>>
>>>> Thanks
>>>> Joe
>>>>
>>>> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende wrote:
>>>>> Hello,
>>>>>
>>>>> Are you running with all of the default settings?
>>>>>
>>>>> If so you would probably want to try increasing the memory settings in conf/bootstrap.conf.
>>>>>
>>>>> They default to 512mb, you may want to try bumping it up to 1024mb.
>>>>>
>>>>> -Bryan
>>>>>
>>>>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I have a very simple data flow, where I need to move S3 data from one bucket in one account to another bucket under another account. I have attached my processor configuration.
>>>>>>
>>>>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2] org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>>>>>>
>>>>>> I am very new to NiFi and trying to get a few of the use cases going. I need help from the community.
>>>>>>
>>>>>> Thanks again
>>>>>>
>>>>>> Rai
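For reference, the heap change Bryan suggests would look roughly like the fragment below. This is a sketch under an assumption: in a stock conf/bootstrap.conf the JVM heap is, as far as I know, set through the java.arg.2 and java.arg.3 entries, but the argument numbers may differ in your installation, so check your own file before editing. NiFi needs a restart for the change to take effect.

```
# conf/bootstrap.conf, JVM memory settings (argument numbers are an assumption)
# Initial heap, raised from the 512 MB default mentioned in this thread
java.arg.2=-Xms1024m
# Maximum heap
java.arg.3=-Xmx1024m
```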
Re: nifi is running out of memory
Thanks Adam. I will try 0.7.1 and update the community on the outcome. If it works then I can create a patch for 1.x.

Thanks
Rai
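The difference Adam points to under NIFI-2631, committing every listed object at once versus committing in batches, can be sketched as follows. This is not the actual ListS3 code; batched, list_and_commit, and the batch size of 100 are hypothetical and only illustrate the pattern of keeping at most one batch in memory at a time.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def list_and_commit(keys, batch_size, commit):
    """Hand keys to commit() one batch at a time and return the total count."""
    total = 0
    for batch in batched(keys, batch_size):
        commit(batch)  # only this batch is held in memory
        total += len(batch)
    return total


# The listing is a generator, so the full set of keys is never materialised.
commits = []
total = list_and_commit((f"key-{i}" for i in range(250)), 100, commits.append)
```

With a bucket of 100k+ objects, the memory needed at any moment is bounded by the batch size rather than by the size of the bucket.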
Re: How I put the cluster down.
Hello Witt,
before anything else, thanks for your help.
Fortunately I put down only the NiFi cluster, otherwise I would already be on vacation :)

After I posted this problem I kept torturing the staging NiFi and discovered that when CPU load gets very high, nodes lose connection and everything starts going in the wrong direction. Also, the web GUI becomes unresponsive, so you have no way to stop workflows.

You can reproduce this issue by starting some workflows composed of:
1) GenerateFlowFile ( 1 KB size, Timer driven, 0 sec run schedule )
2) ReplaceText ( just to force the use of regexp )
3) HashContent ( auto terminate both relationships )

Currently my staging cluster is composed of 2 virtual hosts configured as:
2 Core CPU ( Intel(R) Xeon(R) CPU E7- 2870 @ 2.40GHz )
2 GB RAM
18 GB HD

The problem arises when the CPU load goes over 8, which basically means when you start 8 of the above workflows.

I noticed NiFi attempts to reduce the load, but this does not work very well and does not avoid the general failure.

Here you can see the errors which started to show under stress:

https://drive.google.com/drive/folders/0B7NTMIqrCjESN0JURnRtZWp5Tms?usp=sharing

The first question is: is there a way to keep the load under some critical value? Is there some "how to" which helps me configure NiFi? Currently it is using the factory settings and no customization has been performed except LDAP login.

AP
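To put the numbers from the message above in perspective: a load average of 8 on a 2-core host means roughly four runnable tasks per core. The sketch below computes that ratio; the function names and the limit of 1.0 per core are assumptions (a common rule of thumb), not anything NiFi defines.

```python
import os


def load_per_core(load_average, cores):
    """Normalise a load average by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def is_overloaded(load_average, cores, limit=1.0):
    """True when the per-core load exceeds the limit (1.0 is a rule of thumb)."""
    return load_per_core(load_average, cores) > limit


def current_load_per_core():
    """Per-core 1-minute load of this host (Unix only, since os.getloadavg is used)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)


# The staging hosts described in this thread: load 8 on 2 cores.
staging_ratio = load_per_core(8, 2)
```

A ratio this far above 1 suggests the hosts are oversubscribed, which is consistent with the lost heartbeats and the unresponsive web GUI reported under stress.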
Re: How I put the cluster down.
Alessio

You have two clusters here potentially: the NiFi cluster and the Hadoop cluster. Which one went down?

If NiFi went down I'd suspect memory exhaustion issues, because other resource exhaustion issues like a full file system, exhausted file handles, pegged CPU, etc. tend not to cause it to restart. If it is memory related you'll probably see something in nifi-app.log. Try going with a larger heap, which can be controlled in conf/bootstrap.conf.

Thanks
Joe
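One way to follow Joe's pointer is to scan nifi-app.log for out-of-memory errors. The helper below is a minimal sketch: find_memory_errors is a hypothetical name, the marker string is the standard Java error class name, and the sample line is the one posted in the "nifi is running out of memory" thread.

```python
def find_memory_errors(lines):
    """Return (line_number, text) pairs for log lines reporting an OutOfMemoryError."""
    marker = "java.lang.OutOfMemoryError"
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]


sample_log = [
    "2016-10-27 20:09:50,101 INFO [main] org.apache.nifi.NiFi Controller initialization took a while\n",
    "2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2] org.apache.nifi.NiFi "
    "An Unknown Error Occurred in Thread Thread[Flow Service Tasks Thread-2,5,main]: "
    "java.lang.OutOfMemoryError: Java heap space\n",
]
hits = find_memory_errors(sample_log)
```

Against a real installation the same function can be fed an open file, for example `with open("logs/nifi-app.log") as log: find_memory_errors(log)`, where the path is an assumption about where the log lives.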
How I put the cluster down.
Hello all,
yesterday, by mistake, I basically executed " ls -R / " using the ListHDFS processor and the whole cluster went down ( not just a node ).

Something like this also happened when I was playing with some DO WHILE / WHILE DO patterns. I have only the NiFi logs and they show the heartbeat has been lost. About the CPU load and network traffic I have no info. Any pointers about where I should look for the root of the problem?

Today I'm trying to repeat the problems I got with DO/WHILE; nothing bad is happening, although CPU load is fairly high and network traffic increased up to 282 Kb/sec.

Of course I can redo the "ls -R /" on production, however I would like to avoid it since there are already some ingestion flows running.

AP