Absolutely. Thank you for looking into this, Aldrin.

I do indeed have NiFi configured as a service. I've stopped and started it
dozens of times over the course of my workflow development these recent
months. It has always started up like a champ before. On this particular
occasion I did this:
service nifi stop
as user nifi. It shut down, and the logs presented no errors.
I then did this:
service nifi start
as user nifi. The bootstrap log contained the INFO messages I shared with
you above.

Our data flow has not taxed NiFi much at all. No data was processing
through at the time. We had recently done two bulk ingests of large data
directories. The content repo had indicated 46% full, but after I let it
sit overnight it had dropped back down to a typical level of 3-6%. As I
learned yesterday, my archive retention of 12 hours explains why I was
seeing the content repo hold on to all that capacity after all my
100,000 files had processed through late yesterday.

Early this morning I modified my conf/nifi.properties to drop my archive
retention to 1 hour from 12 hours. This was when I tried and failed to
restart.
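
For anyone checking the same setting, here is a minimal Python sketch that reads the retention property from properties-style text and converts it to hours. The property name is the one quoted later in this thread; the sample text, the helper names read_property and retention_to_hours, and the limited set of units handled are assumptions made for illustration, not NiFi's own parsing.

```python
import re

# Property name as quoted in this thread.
KEY = "nifi.content.repository.archive.max.retention.period"

# Hypothetical excerpt standing in for conf/nifi.properties.
SAMPLE = """\
# content repository settings (illustrative excerpt)
nifi.content.repository.archive.max.retention.period=1 hour
"""


def read_property(text, key):
    """Return the value of `key` from properties-style text, or None."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def retention_to_hours(value):
    """Convert '12 hours' or '1 day' to hours (only hours and days handled)."""
    match = re.fullmatch(r"(\d+)\s*(hours?|days?)", value.strip().lower())
    if match is None:
        raise ValueError("unsupported retention value: %r" % value)
    amount, unit = int(match.group(1)), match.group(2)
    return amount * 24 if unit.startswith("day") else amount


value = read_property(SAMPLE, KEY)
print(value, "=", retention_to_hours(value), "hour(s)")
```

Pointing read_property at the real file contents instead of SAMPLE would confirm which value was actually in place at restart.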

We've since rebooted the host and NiFi came right up. With my new archive
retention value in place, I tried processing about 16,000 files through.
They flew through, but I have noticed a warning that I believe is caused by
my change to archive retention: "WARNING The rate of the dataflow is
exceeding the provenance recording rate. Slowing down flow to accommodate."

What else can I tell you? I suppose it would help to mention that my three
major repos - content, flowfile, provenance - are on separate local disk
devices.

My workflow load peaks when I try to process approximately 100,000 files
totaling 50 GB through the flow. The content repo maxes out at 46% of our
50GB capacity. The provenance and flowfile repos never peak into the double
digits. I do some custom parsing and custom logging in
InvokeScriptedProcessor processors. I employ HandleHttpRequest and
HandleHttpResponse processors.
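
The peak-load figures above work out as follows. This is a back-of-the-envelope Python sketch using only the numbers stated in this message and assuming decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
# Figures taken from the message above.
files = 100_000          # files in one bulk run
total_gb = 50            # total size of those files
repo_capacity_gb = 50    # content repo capacity
peak_percent = 46        # highest content repo usage observed

# Average file size, assuming 1 GB = 1000 MB.
avg_file_mb = total_gb * 1000 / files

# Disk space the content repo held at its peak.
peak_repo_gb = repo_capacity_gb * peak_percent / 100

print("average file size: %.1f MB" % avg_file_mb)
print("content repo at peak: %.1f GB" % peak_repo_gb)
```

So the files average about half a megabyte each, and the content repo peaked at roughly 23 GB.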

I've not yet watched memory usage on the box as I run, but I'll try to use
a 'watch -n [#] free -m' later to see what happens. My NiFi instance runs
with JVM memory parameters in bootstrap.conf of -Xms4096m and -Xmx8192m.
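
To compare those heap flags against what 'free -m' reports, a small Python sketch like the one below can pull the -Xms and -Xmx values out of bootstrap.conf-style text and express them in MB. The sample lines, including the java.arg numbering, and the helper name heap_settings_mb are assumptions for illustration; the real file may be laid out differently.

```python
import re

# Multipliers to convert a JVM size suffix to megabytes.
UNITS_TO_MB = {"k": 1 / 1024, "m": 1, "g": 1024}

# Hypothetical excerpt standing in for conf/bootstrap.conf.
SAMPLE = """\
# JVM memory settings (illustrative excerpt)
java.arg.2=-Xms4096m
java.arg.3=-Xmx8192m
"""


def heap_settings_mb(text):
    """Return a dict such as {'Xms': 4096, 'Xmx': 8192} with sizes in MB."""
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Ignore commented-out settings.
        if line.startswith("#"):
            continue
        match = re.search(r"-(Xms|Xmx)(\d+)([kKmMgG])", line)
        if match:
            flag, amount, unit = match.groups()
            found[flag] = int(amount) * UNITS_TO_MB[unit.lower()]
    return found


heap = heap_settings_mb(SAMPLE)
print("initial heap (MB):", heap.get("Xms"))
print("maximum heap (MB):", heap.get("Xmx"))
```

The maximum heap value is the one to hold up against the total and free columns of 'free -m' while a large batch runs.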

Jim

On Thu, May 25, 2017 at 10:38 AM, Aldrin Piri <[email protected]> wrote:

> If you happen to remember, could you get more specific into your sequence
> of operations?  Is nifi installed as a service? If so, was it restarted?
> Did you just issue a nifi.sh restart?
>
> Do you have any CM tooling (Puppet, Chef, Salt, etc) that is managing this
> process/system?
>
> Could you tell us what the bootstrap log says prior to those lines in
> terms of shutting down?
>
> Would you be able to describe the load exerted on the system by the flow?
> A bit of an amorphous question, but is/was the system heavily taxed running
> NiFi?
>
> The section you hit _should_ only be hit if NiFi (the flow process and not
> the bootstrap) terminates for some reason (e.g. - Hit an out of memory
> case).  I have a few notions as to how the right confluence of events could
> have gotten you otherwise, so any additional details would be great to vet
> their possible culpability.
>
> Thanks!
>
> On Thu, May 25, 2017 at 10:10 AM, James McMahon <[email protected]>
> wrote:
>
>> I did inspect the log more closely. It offers little additional insight.
>> Here is what it says (unable to export, had to transcribe myself):
>>
>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Status
>> File no longer exists. Will not restart NiFi
>> [date] [time],### INFO [main] o.a.n.b.NotificationServiceManager
>> Successfully loaded the following 0 services: [ ]
>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
>> Registered no Notification Services for Notification Type NIFI_STARTED
>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
>> Registered no Notification Services for Notification Type NIFI_STOPPED
>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi
>> Registered no Notification Services for Notification Type NIFI_DIED
>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.Command Apache
>> NiFi is not running
>>
>> My hope is that we can figure out what happens to this status file, and
>> how I can prevent it from nonexistence.
>>
>> Jim
>>
>> On Thu, May 25, 2017 at 9:37 AM, Joe Witt <[email protected]> wrote:
>>
>>> I don't think rebooting the system had anything to do with NiFi's
>>> ability to start up.  But I'm not sure I understand that particular
>>> part of logic in the code in terms of the case it was defending
>>> against.
>>>
>>> On Thu, May 25, 2017 at 9:34 AM, James McMahon <[email protected]>
>>> wrote:
>>> > Will do Joe. I'll dig for that now.
>>> >
>>> > Infrastructure Group did reboot the box, which had been up and running
>>> > for nearly two months. NiFi did indeed come up following the reboot. I
>>> > still want to try and get you this log information so that I can learn
>>> > what triggers such a situation, and whether there is a more refined way
>>> > to solve it than full system reboot. There are other things running on
>>> > the resource and I should try to minimize impact to them by fully
>>> > rebooting.
>>> >
>>> > Let me see about that log content. Thank you again.
>>> >
>>> > On Thu, May 25, 2017 at 9:25 AM, Joe Witt <[email protected]> wrote:
>>> >>
>>> >> Jim,
>>> >>
>>> >> The code relevant to that log output is here [1].  Can you share the
>>> >> bootstrap output before/after that output?
>>> >>
>>> >> [1]
>>> >> https://github.com/apache/nifi/blob/rel/nifi-0.7.1/nifi-bootstrap/src/main/java/org/apache/nifi/bootstrap/RunNiFi.java
>>> >>
>>> >> Thanks
>>> >> Joe
>>> >>
>>> >> On Thu, May 25, 2017 at 9:11 AM, James McMahon <[email protected]>
>>> >> wrote:
>>> >> > Am running NiFi 0.7.x. Have been running with great stability for a
>>> >> > long period of time. Tried this morning to make this change in my
>>> >> > nifi.properties conf file:
>>> >> >
>>> >> > nifi.content.repository.archive.max.retention.period=1 hour
>>> >> >
>>> >> > Reduced from the default of 12 hours. Relatively simple change,
>>> >> > requires a nifi restart to take effect.
>>> >> >
>>> >> > My restart attempt throws no errors to the nifi app log, but in the
>>> >> > bootstrap log I do see this:
>>> >> > org.apache.nifi.bootstrap.RunNiFi Status file no longer exists.
>>> >> > Will not restart NiFi
>>> >> >
>>> >> > I've done some digging and all I could find is rebooting the box in
>>> >> > hopes of resolving. Am reaching out to the infrastructure group that
>>> >> > owns the server now, asking them to do so. Would like to also in
>>> >> > parallel understand why this happened, and where, exactly, this
>>> >> > status file should be?
>>> >> >
>>> >> > Can I resolve this by manually recreating such a status file with
>>> >> > certain permissions and ownership?
>>> >> >
>>> >> > Thanks in advance for your help.  -Jim
>>> >> >
>>> >> >
>>> >
>>> >
>>>
>>
>>
>
