Re: How I put the cluster down.

2016-10-28 Thread Andrew Grande
Hi,

I'd suggest a couple of things. Have you configured backpressure controls on
connections? NiFi 1.0.0 enables them by default (10,000 flowfiles / 1 GB,
IIRC). This can help avoid overwhelming components in a flow.
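
For reference, those thresholds are set per connection (the connection's
Settings tab in the UI) and are serialized into flow.xml.gz; a fragment
looks roughly like this (values illustrative):

    <connection>
      ...
      <backPressureObjectThreshold>10000</backPressureObjectThreshold>
      <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
    </connection>

Once either threshold is reached, NiFi stops scheduling the upstream
component until the queue drains below it.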

Next, a 2-core CPU is really inadequate for a high-throughput system; see
if you can get something better. It seems there's a lot going on in your
cluster. A full NiFi node with many flows does a lot of housekeeping in the
background and needs some power.

Andrew

On Fri, Oct 28, 2016, 8:36 AM Alessio Palma 
wrote:

> Hello Witt,
> before anything else, thanks for your help.
> Fortunately I put down only the NiFi cluster, otherwise I would already be
> on vacation :)
>
> After I posted this problem I kept torturing the staging NiFi and
> discovered that when the CPU load gets very high, nodes lose connection and
> everything starts going in a bad direction. The web GUI also becomes
> unresponsive, so there is no option to stop workflows.
>
> You can reproduce this issue by starting some workflows composed of:
> 1) GenerateFlowFile (1 KB size, timer driven, 0 sec run schedule)
> 2) ReplaceText (just to force the use of regexp)
> 3) HashContent (auto-terminate both relationships)
>
> Currently my staging cluster is composed of 2 virtual hosts, each configured as:
> 2-core CPU (Intel(R) Xeon(R) CPU E7-2870 @ 2.40GHz)
> 2 GB RAM
> 18 GB HD
>
> The problem arises when the CPU load goes over 8, which basically means
> when you start 8 of the above WFs.
>
> I noticed NiFi attempts to reduce the load, but this does not work well
> and does not prevent the general failure.
>
> Here you can see the errors which started to show under stress:
>
> https://drive.google.com/drive/folders/0B7NTMIqrCjESN0JURnRtZWp5Tms?usp=sharing
>
>
> The first question is: is there a way to keep the load under some critical
> values? Is there some "how to" that would help me configure NiFi?
> Currently it is using factory settings and no customization has been
> performed except LDAP login.
>
> AP
>
>
>
> On 28/10/2016 13:24, Joe Witt wrote:
> > Alessio
> >
> > You have two clusters here potentially.  The NiFi cluster and the
> > Hadoop cluster.  Which one went down?
> >
> > If NiFi went down I'd suspect memory exhaustion issues because other
> > resource exhaustion issues like full file system, exhausted file
> > handles, pegged CPU, etc. tend not to cause it to restart.  If memory
> > related you'll probably see something in the nifi-app.log.  Try going
> > with a larger heap as can be controlled in conf/bootstrap.conf.
> >
> > Thanks
> > Joe
> >
> > On Fri, Oct 28, 2016 at 5:55 AM, Alessio Palma
> >  wrote:
> >> Hello all,
> >> yesterday, by mistake, I basically executed "ls -R /" using the
> >> ListHDFS processor and the whole cluster went down (not just a node).
> >>
> >> Something like this also happened when I was playing with some DO WHILE
> >> / WHILE DO patterns. I have only the nifi logs, and they show that the
> >> heartbeat was lost. About the CPU load and network traffic I have no
> >> info. Any pointers on where I should look for the problem's root cause?
> >>
> >> Today I'm trying to reproduce the problems I got with DO/WHILE; nothing bad
> >> is happening, although the CPU load is quite high and network traffic
> >> has increased up to 282 KB/sec.
> >>
> >> Of course I could redo the "ls -R /" in production, however I'd like to
> >> avoid it since there are already some ingestion flows running.
> >>
> >> AP
> >
>


Re: nifi is running out of memory

2016-10-28 Thread Gop Krr
Thanks, James. I am looking into the permission issue and will update the
thread. I will also make the changes per your recommendation.

On Fri, Oct 28, 2016 at 10:23 AM, James Wing  wrote:

> From the screenshot and the error message, I interpret the sequence of
> events to be something like this:
>
> 1.) ListS3 succeeds and generates flowfiles with attributes referencing S3
> objects, but no content (0 bytes)
> 2.) FetchS3Object fails to pull the S3 object content with an Access
> Denied error, but the failed flowfiles are routed on to PutS3Object (35,179
> files / 0 bytes in the "putconnector" queue)
> 3.) PutS3Object is succeeding, writing the 0 byte content from ListS3
>
> I recommend a couple of things for FetchS3Object:
>
> * Only allow the "success" relationship to continue to PutS3Object.
> Separate the "failure" relationship to either loop back to FetchS3Object or
> go to a LogAttribute processor, or another handling path.
> * It looks like the permissions aren't working; you might want to
> double-check the access keys or try a sample file with the AWS CLI.
>
> Thanks,
>
> James
>
>
> On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr  wrote:
>
>> This is what my nifi flow looks like.
>>
>> On Fri, Oct 28, 2016 at 9:57 AM, Gop Krr  wrote:
>>
>>> Thanks Bryan, Joe, Adam and Pierre. I went past this issue by switching
>>> to 0.7.1. Now it is able to list the files from the buckets and create
>>> those files in the other bucket. But the write is not happening and I am
>>> getting a permission issue (I have attached it below for reference). Could
>>> this be the bucket settings, or does it have more to do with the access
>>> key? All the files created in the new bucket are 0 bytes.
>>> Thanks
>>> Rai
>>>
>>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
>>> o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=x] Failed
>>> to retrieve S3 Object for StandardFlowFileRecord[uuid=yy
>>> yyy,claim=,offset=0,name=x.gz,size=0]; routing to failure:
>>> com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied
>>> (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
>>> Request ID: xxx), S3 Extended Request ID: lu8tAqRxu+ouinnVvJleHkUUyK6J6r
>>> IQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=
>>>

Re: nifi is running out of memory

2016-10-28 Thread James Wing
From the screenshot and the error message, I interpret the sequence of
events to be something like this:

1.) ListS3 succeeds and generates flowfiles with attributes referencing S3
objects, but no content (0 bytes)
2.) FetchS3Object fails to pull the S3 object content with an Access Denied
error, but the failed flowfiles are routed on to PutS3Object (35,179 files
/ 0 bytes in the "putconnector" queue)
3.) PutS3Object is succeeding, writing the 0 byte content from ListS3

I recommend a couple of things for FetchS3Object:

* Only allow the "success" relationship to continue to PutS3Object.
Separate the "failure" relationship to either loop back to FetchS3Object or
go to a LogAttribute processor, or another handling path.
* It looks like the permissions aren't working; you might want to
double-check the access keys or try a sample file with the AWS CLI.

Thanks,

James


On Fri, Oct 28, 2016 at 10:01 AM, Gop Krr  wrote:

> This is what my nifi flow looks like.
>
> On Fri, Oct 28, 2016 at 9:57 AM, Gop Krr  wrote:
>
>> Thanks Bryan, Joe, Adam and Pierre. I went past this issue by switching
>> to 0.7.1. Now it is able to list the files from the buckets and create
>> those files in the other bucket. But the write is not happening and I am
>> getting a permission issue (I have attached it below for reference). Could
>> this be the bucket settings, or does it have more to do with the access
>> key? All the files created in the new bucket are 0 bytes.
>> Thanks
>> Rai
>>
>> 2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
>> o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=x] Failed
>> to retrieve S3 Object for StandardFlowFileRecord[uuid=yy
>> yyy,claim=,offset=0,name=x.gz,size=0]; routing to failure:
>> com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied
>> (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request
>> ID: xxx), S3 Extended Request ID: lu8tAqRxu+ouinnVvJleHkUUyK6J6r
>> IQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=
>>
>>
>> On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard <
>> 

Re: nifi is running out of memory

2016-10-28 Thread Gop Krr
Thanks Bryan, Joe, Adam and Pierre. I went past this issue by switching to
0.7.1. Now it is able to list the files from the buckets and create those
files in the other bucket. But the write is not happening and I am getting a
permission issue (I have attached it below for reference). Could this be
the bucket settings, or does it have more to do with the access key? All
the files created in the new bucket are 0 bytes.
Thanks
Rai

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=x] Failed to
retrieve S3 Object for
StandardFlowFileRecord[uuid=y,claim=,offset=0,name=x.gz,size=0];
routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception:
Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
AccessDenied; Request ID: xxx), S3 Extended Request ID:
lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
o.a.nifi.processors.aws.s3.FetchS3Object

com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service:
Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID:
0F34E71C0697B1D8)

at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]

On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard  wrote:

> Quick remark: the fix has also been merged in master and will be in
> release 1.1.0.
>
> Pierre
>
> 2016-10-28 15:22 GMT+02:00 Gop Krr :
>
>> Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
>> it works then I can create a patch for 1.x
>> Thanks
>> Rai
>>
>> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar  wrote:
>>
>>> Hey All,
>>>
>>> I believe OP is running into a bug fixed here:
>>> https://issues.apache.org/jira/browse/NIFI-2631
>>>
>>> Basically, ListS3 attempts to commit all the files it finds
>>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
>>> a 1.x release.
>>>
>>> Cheers,
>>> Adam
>>>
>>>
>>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt  wrote:
>>> > Looking at this line [1] makes me think the FetchS3 processor is
>>> > properly streaming the bytes directly to the content repository.
>>> >
>>> > Looking at the screenshot showing nothing out of the ListS3 processor
>>> > makes me think the bucket has so many things in it that the processor
>>> > or associated library isn't handling that well and is just listing
>>> > everything with no mechanism of max buffer size.  Krish please try
>>> > with the largest heap you can and let us 

Re: nifi is running out of memory

2016-10-28 Thread Pierre Villard
Quick remark: the fix has also been merged in master and will be in release
1.1.0.

Pierre

2016-10-28 15:22 GMT+02:00 Gop Krr :

> Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
> it works then I can create a patch for 1.x
> Thanks
> Rai
>
> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar  wrote:
>
>> Hey All,
>>
>> I believe OP is running into a bug fixed here:
>> https://issues.apache.org/jira/browse/NIFI-2631
>>
>> Basically, ListS3 attempts to commit all the files it finds
>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
>> a 1.x release.
>>
>> Cheers,
>> Adam
>>
>>
>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt  wrote:
>> > Looking at this line [1] makes me think the FetchS3 processor is
>> > properly streaming the bytes directly to the content repository.
>> >
>> > Looking at the screenshot showing nothing out of the ListS3 processor
>> > makes me think the bucket has so many things in it that the processor
>> > or associated library isn't handling that well and is just listing
>> > everything with no mechanism of max buffer size.  Krish please try
>> > with the largest heap you can and let us know what you see.
>> >
>> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>> >
>> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt  wrote:
>> >> moving dev to bcc
>> >>
>> >> Yes I believe the issue here is that FetchS3 doesn't do chunked
>> >> transfers and so is loading all into memory.  I've not verified this
>> >> in the code yet, but it seems quite likely.  Krish, if you can verify
>> >> that going with a larger heap gets you in the game, can you please file
>> >> a JIRA.
>> >>
>> >> Thanks
>> >> Joe
>> >>
>> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende  wrote:
>> >>> Hello,
>> >>>
>> >>> Are you running with all of the default settings?
>> >>>
>> >>> If so you would probably want to try increasing the memory settings in
>> >>> conf/bootstrap.conf.
>> >>>
>> >>> They default to 512mb, you may want to try bumping it up to 1024mb.
>> >>>
>> >>> -Bryan
>> >>>
>> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr  wrote:
>> 
>>  Hi All,
>> 
>>  I have a very simple data flow, where I need to move s3 data from one
>> bucket
>>  in one account to another bucket under another account. I have
>> attached my
>>  processor configuration.
>> 
>> 
>>  2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>>  org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
>> Service
>>  Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>> 
>>  I am very new to NiFi and trying to get a few of the use cases going.
>> I need
>>  help from the community.
>> 
>>  Thanks again
>> 
>>  Rai
>> 
>> 
>> 
>> >>>
>>
>
>


Re: nifi is running out of memory

2016-10-28 Thread Gop Krr
Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
it works then I can create a patch for 1.x
Thanks
Rai

On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar  wrote:

> Hey All,
>
> I believe OP is running into a bug fixed here:
> https://issues.apache.org/jira/browse/NIFI-2631
>
> Basically, ListS3 attempts to commit all the files it finds
> (potentially 100k+) at once, rather than in batches. NIFI-2631
> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
> a 1.x release.
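>
> The shape of the fix is periodic session commits. A minimal sketch of the
> idea (illustrative only, not the actual NIFI-2631 patch; the batch size
> and surrounding processor code are assumptions):
>
>     // Inside onTrigger(context, session), with `listing` being the
>     // iterable of S3 object summaries returned by the client:
>     final int BATCH_SIZE = 1000;  // assumed; any bounded value works
>     int count = 0;
>     for (S3ObjectSummary summary : listing) {
>         FlowFile flowFile = session.create();
>         flowFile = session.putAttribute(flowFile, "filename", summary.getKey());
>         session.transfer(flowFile, REL_SUCCESS);
>         if (++count % BATCH_SIZE == 0) {
>             session.commit();  // release this batch downstream
>         }
>     }
>     session.commit();  // commit the remainder
>
> That keeps the working set bounded no matter how many keys the bucket
> listing returns.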
>
> Cheers,
> Adam
>
>
> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt  wrote:
> > Looking at this line [1] makes me think the FetchS3 processor is
> > properly streaming the bytes directly to the content repository.
> >
> > Looking at the screenshot showing nothing out of the ListS3 processor
> > makes me think the bucket has so many things in it that the processor
> > or associated library isn't handling that well and is just listing
> > everything with no mechanism of max buffer size.  Krish please try
> > with the largest heap you can and let us know what you see.
> >
> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
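> >
> > For what it's worth, the referenced line is the streaming path;
> > paraphrased (not the exact source), it amounts to:
> >
> >     // Stream the S3 object body straight into the content repository
> >     // instead of buffering it on the heap.
> >     try (S3Object s3Object = client.getObject(request)) {
> >         flowFile = session.importFrom(s3Object.getObjectContent(), flowFile);
> >     }
> >
> > so the fetch itself shouldn't need heap proportional to object size.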
> >
> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt  wrote:
> >> moving dev to bcc
> >>
> >> Yes I believe the issue here is that FetchS3 doesn't do chunked
> >> transfers and so is loading all into memory.  I've not verified this
> >> in the code yet, but it seems quite likely.  Krish, if you can verify
> >> that going with a larger heap gets you in the game, can you please file
> >> a JIRA.
> >>
> >> Thanks
> >> Joe
> >>
> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende  wrote:
> >>> Hello,
> >>>
> >>> Are you running with all of the default settings?
> >>>
> >>> If so you would probably want to try increasing the memory settings in
> >>> conf/bootstrap.conf.
> >>>
> >>> They default to 512mb, you may want to try bumping it up to 1024mb.
> >>>
> >>> -Bryan
> >>>
> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr  wrote:
> 
>  Hi All,
> 
>  I have a very simple data flow, where I need to move s3 data from one
> bucket
>  in one account to another bucket under another account. I have
> attached my
>  processor configuration.
> 
> 
>  2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>  org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
> Service
>  Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
> 
>  I am very new to NiFi and trying to get a few of the use cases going. I
> need
>  help from the community.
> 
>  Thanks again
> 
>  Rai
> 
> 
> 
> >>>
>


Re: How I put the cluster down.

2016-10-28 Thread Alessio Palma
Hello Witt,
before anything else, thanks for your help.
Fortunately I put down only the NiFi cluster, otherwise I would already be
on vacation :)

After I posted this problem I kept torturing the staging NiFi and
discovered that when the CPU load gets very high, nodes lose connection and
everything starts going in a bad direction. The web GUI also becomes
unresponsive, so there is no option to stop workflows.

You can reproduce this issue by starting some workflows composed of:
1) GenerateFlowFile (1 KB size, timer driven, 0 sec run schedule)
2) ReplaceText (just to force the use of regexp)
3) HashContent (auto-terminate both relationships)

Currently my staging cluster is composed of 2 virtual hosts, each configured as:
2-core CPU (Intel(R) Xeon(R) CPU E7-2870 @ 2.40GHz)
2 GB RAM
18 GB HD

The problem arises when the CPU load goes over 8, which basically means
when you start 8 of the above WFs.

I noticed NiFi attempts to reduce the load, but this does not work well
and does not prevent the general failure.

Here you can see the errors which started to show under stress:
https://drive.google.com/drive/folders/0B7NTMIqrCjESN0JURnRtZWp5Tms?usp=sharing


The first question is: is there a way to keep the load under some critical
values? Is there some "how to" that would help me configure NiFi?
Currently it is using factory settings and no customization has been
performed except LDAP login.

AP



On 28/10/2016 13:24, Joe Witt wrote:
> Alessio
> 
> You have two clusters here potentially.  The NiFi cluster and the
> Hadoop cluster.  Which one went down?
> 
> If NiFi went down I'd suspect memory exhaustion issues because other
> resource exhaustion issues like full file system, exhausted file
> handles, pegged CPU, etc. tend not to cause it to restart.  If memory
> related you'll probably see something in the nifi-app.log.  Try going
> with a larger heap as can be controlled in conf/bootstrap.conf.
> 
> Thanks
> Joe
> 
> On Fri, Oct 28, 2016 at 5:55 AM, Alessio Palma
>  wrote:
>> Hello all,
>> yesterday, by mistake, I basically executed "ls -R /" using the
>> ListHDFS processor and the whole cluster went down (not just a node).
>>
>> Something like this also happened when I was playing with some DO WHILE
>> / WHILE DO patterns. I have only the nifi logs, and they show that the
>> heartbeat was lost. About the CPU load and network traffic I have no
>> info. Any pointers on where I should look for the problem's root cause?
>>
>> Today I'm trying to reproduce the problems I got with DO/WHILE; nothing bad
>> is happening, although the CPU load is quite high and network traffic
>> has increased up to 282 KB/sec.
>>
>> Of course I could redo the "ls -R /" in production, however I'd like to
>> avoid it since there are already some ingestion flows running.
>>
>> AP
> 


Re: How I put the cluster down.

2016-10-28 Thread Joe Witt
Alessio

You have two clusters here potentially.  The NiFi cluster and the
Hadoop cluster.  Which one went down?

If NiFi went down I'd suspect memory exhaustion issues because other
resource exhaustion issues like full file system, exhausted file
handles, pegged CPU, etc. tend not to cause it to restart.  If memory
related you'll probably see something in the nifi-app.log.  Try going
with a larger heap as can be controlled in conf/bootstrap.conf.
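
The relevant lines in conf/bootstrap.conf are the JVM memory arguments.
They default to a 512 MB heap; for example (sizes illustrative, and keep
-Xmx within the RAM actually available on the node):

    # JVM memory settings
    java.arg.2=-Xms512m
    java.arg.3=-Xmx512m

Raising -Xmx (e.g. to 1g or 2g) and restarting NiFi is the usual first
step when nifi-app.log shows OutOfMemoryError.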

Thanks
Joe

On Fri, Oct 28, 2016 at 5:55 AM, Alessio Palma
 wrote:
> Hello all,
> yesterday, by mistake, I basically executed "ls -R /" using the
> ListHDFS processor and the whole cluster went down (not just a node).
>
> Something like this also happened when I was playing with some DO WHILE
> / WHILE DO patterns. I have only the nifi logs, and they show that the
> heartbeat was lost. About the CPU load and network traffic I have no
> info. Any pointers on where I should look for the problem's root cause?
>
> Today I'm trying to reproduce the problems I got with DO/WHILE; nothing bad
> is happening, although the CPU load is quite high and network traffic
> has increased up to 282 KB/sec.
>
> Of course I could redo the "ls -R /" in production, however I'd like to
> avoid it since there are already some ingestion flows running.
>
> AP


How I put the cluster down.

2016-10-28 Thread Alessio Palma
Hello all,
yesterday, by mistake, I basically executed "ls -R /" using the
ListHDFS processor and the whole cluster went down (not just a node).

Something like this also happened when I was playing with some DO WHILE
/ WHILE DO patterns. I have only the nifi logs, and they show that the
heartbeat was lost. About the CPU load and network traffic I have no
info. Any pointers on where I should look for the problem's root cause?

Today I'm trying to reproduce the problems I got with DO/WHILE; nothing bad
is happening, although the CPU load is quite high and network traffic
has increased up to 282 KB/sec.

Of course I could redo the "ls -R /" in production, however I'd like to
avoid it since there are already some ingestion flows running.

AP