I would increase the index threads from the default of 2 (I think) to something a bit larger. You can also tune the other properties, such as shard size. They should be described in the admin docs.
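
As a rough sketch, assuming the persistent provenance repository is in use, the relevant entries in conf/nifi.properties would look something like the following. The values are illustrative only, not recommendations, and the property names and defaults should be checked against the admin guide for your NiFi version:

```
# Illustrative values only; verify names and defaults in the admin guide.
# Number of threads used to index provenance events.
nifi.provenance.repository.index.threads=4
# Size of each provenance index shard (500 MB is, I believe, the shipped default).
nifi.provenance.repository.index.shard.size=500 MB
```

As with the archive retention change, a NiFi restart is needed before edits to nifi.properties take effect.
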
On Thu, May 25, 2017 at 11:12 AM, James McMahon <[email protected]> wrote:
> Thank you Joe. Do you advise, then, that we tune some parameters now or
> is it acceptable to allow NiFi to ..... self-regulate .... as it appears
> to be doing? If you suggest tuning, which ones should I look at -
> index.threads? I notice that at present I have that set to a robust 1.
>
> That improvement with 1.2.0 sounds like it will make a big difference.
> Sadly as you recall it may be some time before 1.2.x is available to me.
>
> On Thu, May 25, 2017 at 11:05 AM, Joe Witt <[email protected]> wrote:
>>
>> jim
>>
>> that provenance warning is not related to archive/retention. It is
>> provenance telling you it can only index events so fast and at present
>> it is falling behind so will slow the flow to ensure things don't get
>> too far out of balance. However, there are configuration properties
>> that let you give provenance indexing more threads. Also, we created
>> a new provenance implementation available in NiFi 1.2.0 which is
>> multiple times faster with immediate indexing.
>>
>> Thanks
>>
>> On Thu, May 25, 2017 at 11:03 AM, James McMahon <[email protected]>
>> wrote:
>> > Absolutely. Thank you for looking into this Aldrin.
>> >
>> > I do indeed have NiFi configured as a service. I've stopped and
>> > started it dozens of times through the life of my workflow
>> > development these recent months. It's always previously started up
>> > like a champ. On this particular occasion I did this:
>> > service nifi stop
>> > as user nifi. It shut down, and the logs presented no errors.
>> > I then did this:
>> > service nifi start
>> > as user nifi. The bootstrap log contained the INFO messages I shared
>> > with you above.
>> >
>> > Our data flow has not taxed NiFi much at all. There was no data
>> > processing through at the time. We had recently done two bulk ingests
>> > of large data directories.
>> > The content repo had indicated 46% full, but after I let it sit
>> > overnight it had dropped back down to a typical level of 3-6%. As I
>> > learned yesterday, with my archive retention set to 12 hours it
>> > explained why I was seeing the content repo hold on to all that
>> > capacity after all my 100,000 files had processed through late
>> > yesterday.
>> >
>> > Early this morning I modified my conf/nifi.properties to drop my
>> > archive retention to 1 day from 12 days. This was when I tried and
>> > failed to restart.
>> >
>> > We've since rebooted the host and NiFi came right up. With my new
>> > archive retention value in place, I tried processing about 16,000
>> > files through. They flew through, but I have noticed a Warning that I
>> > believe is caused by my change to archive retention: WARNING The rate
>> > of the dataflow is exceeding the provenance recording rate. Slowing
>> > down flow to accommodate.
>> >
>> > What else can I tell you? I suppose it would help to mention that my
>> > three major repos - content, flowfile, provenance - are on separate
>> > local disk devices.
>> >
>> > My workflow load peaks when I try to process approximately 100,000
>> > files totaling 50 GB through the flow. The content repo maxes out at
>> > 46% of our 50GB capacity. The provenance and flowfile repos never
>> > peak into the double digits. I do some custom parsing and custom
>> > logging in InvokeScriptedProcessors. I employ HandleHttpResponse and
>> > HandleHttpRequests processors.
>> >
>> > I've not yet watched memory usage on the box as I run, but I'll try
>> > to use a 'watch -n [#] free -m' later to see what happens. My nifi
>> > instance runs with JVM memory parms in bootstrap.conf of -Xms4096m
>> > and -Xmx8192m.
>> >
>> > Jim
>> >
>> > On Thu, May 25, 2017 at 10:38 AM, Aldrin Piri <[email protected]>
>> > wrote:
>> >>
>> >> If you happen to remember, could you get more specific into your
>> >> sequence of operations? Is nifi installed as a service? If so, was
>> >> it restarted? Did you just issue a nifi.sh restart?
>> >>
>> >> Do you have any CM tooling (Puppet, Chef, Salt, etc) that is
>> >> managing this process/system?
>> >>
>> >> Could you tell us what the bootstrap log says prior to those lines
>> >> in terms of shutting down?
>> >>
>> >> Would you be able to describe the load exerted on the system by the
>> >> flow? A bit of an amorphous question, but is/was the system heavily
>> >> taxed running NiFi?
>> >>
>> >> The section you hit _should_ only be hit if NiFi (the flow process
>> >> and not the bootstrap) terminates for some reason (e.g. - Hit an out
>> >> of memory case). I have a few notions as to how the right confluence
>> >> of events could have gotten you otherwise, so any additional details
>> >> would be great to vet their possible culpability.
>> >>
>> >> Thanks!
>> >>
>> >> On Thu, May 25, 2017 at 10:10 AM, James McMahon <[email protected]>
>> >> wrote:
>> >>>
>> >>> I did inspect the log more closely. It offers little additional
>> >>> insight. Here is what it says (unable to export, had to transcribe
>> >>> myself):
>> >>>
>> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Status File no longer exists. Will not restart NiFi
>> >>> [date] [time],### INFO [main] o.a.n.b.NotificationServiceManager Successfully loaded the following 0 services: [ ]
>> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Registered no Notification Services for Notification Type NIFI_STARTED
>> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Registered no Notification Services for Notification Type NIFI_STOPPED
>> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.RunNiFi Registered no Notification Services for Notification Type NIFI_DIED
>> >>> [date] [time],### INFO [main] org.apache.nifi.bootstrap.Command Apache NiFi is not running
>> >>>
>> >>> My hope is that we can figure out what happens to this status file,
>> >>> and how I can prevent it from nonexistence.
>> >>>
>> >>> Jim
>> >>>
>> >>> On Thu, May 25, 2017 at 9:37 AM, Joe Witt <[email protected]> wrote:
>> >>>>
>> >>>> I don't think rebooting the system had anything to do with NiFi's
>> >>>> ability to start up. But I'm not sure I understand that particular
>> >>>> part of logic in the code in terms of the case it was defending
>> >>>> against.
>> >>>>
>> >>>> On Thu, May 25, 2017 at 9:34 AM, James McMahon <[email protected]>
>> >>>> wrote:
>> >>>> > Will do Joe. I'll dig for that now.
>> >>>> >
>> >>>> > Infrastructure Group did reboot the box, which had been up and
>> >>>> > running for nearly two months. NiFi did indeed come up following
>> >>>> > the reboot. I still want to try and get you this log information
>> >>>> > so that I can learn what triggers such a situation, and whether
>> >>>> > there is a more refined way to solve it than full system reboot.
>> >>>> > There are other things running on the resource and I should try
>> >>>> > to minimize impact to them from fully rebooting.
>> >>>> >
>> >>>> > Let me see about that log content. Thank you again.
>> >>>> >
>> >>>> > On Thu, May 25, 2017 at 9:25 AM, Joe Witt <[email protected]>
>> >>>> > wrote:
>> >>>> >>
>> >>>> >> Jim,
>> >>>> >>
>> >>>> >> The code relevant to that log output is here [1]. Can you share
>> >>>> >> the bootstrap output before/after that output?
>> >>>> >>
>> >>>> >> [1]
>> >>>> >> https://github.com/apache/nifi/blob/rel/nifi-0.7.1/nifi-bootstrap/src/main/java/org/apache/nifi/bootstrap/RunNiFi.java
>> >>>> >>
>> >>>> >> Thanks
>> >>>> >> Joe
>> >>>> >>
>> >>>> >> On Thu, May 25, 2017 at 9:11 AM, James McMahon
>> >>>> >> <[email protected]> wrote:
>> >>>> >> > Am running NiFi 0.7.x. Have been running with great stability
>> >>>> >> > for a long period of time. Tried this morning to make this
>> >>>> >> > change in my nifi.properties conf file:
>> >>>> >> >
>> >>>> >> > nifi.content.repository.archive.max.retention.period=1 hour
>> >>>> >> >
>> >>>> >> > Reduced from the default of 12 hours. Relatively simple
>> >>>> >> > change, requires a nifi restart to take effect.
>> >>>> >> >
>> >>>> >> > My restart attempt throws no errors to the nifi app log, but
>> >>>> >> > in the bootstrap log I do see this:
>> >>>> >> > org.apache.nifi.bootstrap.RunNiFi Status file no longer exists. Will not restart NiFi
>> >>>> >> >
>> >>>> >> > I've done some digging and all I could find is rebooting the
>> >>>> >> > box in hopes of resolving. Am reaching out to the
>> >>>> >> > infrastructure group that owns the server now, asking them to
>> >>>> >> > do so. Would like to also in parallel understand why this
>> >>>> >> > happened, and where, exactly, this status file should be?
>> >>>> >> >
>> >>>> >> > Can I resolve this by manually recreating such a status file
>> >>>> >> > with certain permissions and ownership?
>> >>>> >> >
>> >>>> >> > Thanks in advance for your help. -Jim
