Mark - We have only the default settings, which I suspect give us a heap size of just 512 MB. We will certainly be raising the limit as we analyze our usage at scale to see how much memory we actually need.
-Tim

> On May 23, 2018, at 10:17 AM, Mark Payne <[email protected]> wrote:
>
> Tim,
>
> Typically when I see that issue, it's due to OutOfMemory Errors or constant
> garbage collection. How large is your heap?
> FWIW, an alternative is to use "kill -3 <nifi pid>". The -3 will cause Java to
> perform a thread dump. So you can do "cat run/nifi.pid | xargs kill -3"
>
>> On May 23, 2018, at 11:13 AM, Tim Dean <[email protected]> wrote:
>>
>> Joe -
>>
>> I am currently in this state where new provenance events are not showing up
>> in the UI (I have not yet made the configuration changes you suggested below)
>>
>> When I try to run the dump command you suggested, I get a timeout error:
>>
>> Java home: /usr/lib/jvm/default-java/jre
>> NiFi home: /opt/nifi-1.5.0
>>
>> Bootstrap Config File: /opt/nifi-1.5.0/conf/bootstrap.conf
>>
>> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>>     at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>>     at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>>     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>>     at java.io.InputStreamReader.read(InputStreamReader.java:184)
>>     at java.io.BufferedReader.fill(BufferedReader.java:161)
>>     at java.io.BufferedReader.readLine(BufferedReader.java:324)
>>     at java.io.BufferedReader.readLine(BufferedReader.java:389)
>>     at org.apache.nifi.bootstrap.RunNiFi.dump(RunNiFi.java:707)
>>     at org.apache.nifi.bootstrap.RunNiFi.main(RunNiFi.java:233)
>>
>> The system does seem to be running, although a little slow right now. I’m
>> not sure if the error above is expected given that the system is running a
>> bit slowly, or if there is something more fundamentally wrong with my
>> system. I did restart the NiFi service and it seems to clear out the
>> problem. Provenance events are once again showing up in the user interface.
>> So even though NiFi seemed to be running (flow files were being processed,
>> the user interface was slow but functioning) it appears that provenance
>> reporting/indexing and whatever is used by the dump utility were not
>> functioning.
>>
>> We’re in the process of assessing our memory use and adjusting configuration
>> as needed, so some of these problems may go away once we’ve tuned the
>> system. Other than that, are there specific tools I should be using or
>> logging I should be monitoring to track down problems with provenance
>> reporting?
>>
>> Thanks for all your help.
>>
>> -Tim
>>
>>> On May 22, 2018, at 12:38 PM, Joe Witt <[email protected]> wrote:
>>>
>>> I agree - good point :)
>>>
>>> It is possible indexing was stuck with the older implementation. Can
>>> you run 'bin/nifi.sh dump' and share the logs/nifi-bootstrap.log
>>> file if it is in that state/behavior again?
>>>
>>> Thanks
>>>
>>> On Tue, May 22, 2018 at 1:33 PM, Tim Dean <[email protected]> wrote:
>>>> Thanks Joe - I’ll try those changes and report back with the results.
>>>>
>>>> Just out of curiosity, if my problem is happening because I am generating
>>>> more than 1 GB of provenance data, wouldn’t I expect to see the older
>>>> provenance data being deleted, leaving the newer provenance data intact?
>>>> It seems to me that my old data is still there and my new data is not.
>>>>
>>>> -Tim
>>>>
>>>>> On May 22, 2018, at 12:15 PM, Joe Witt <[email protected]> wrote:
>>>>>
>>>>> Tim
>>>>>
>>>>> Got ya. So yeah keep in mind you'll only have at most 1GB of prov
>>>>> data and for at most 24 hours with that configuration. Also, as James
>>>>> mentioned the default searching for provenance can be too restrictive
>>>>> and you have to pay close attention to time stamps relative to the
>>>>> system doing the query, etc. In general though it should work just
>>>>> fine.
>>>>>
>>>>> 1) definitely use the newer provenance. We need to change the default
>>>>> as the new one is very fast and very stable.
>>>>>
>>>>> To do this change
>>>>>
>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
>>>>> to
>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
>>>>>
>>>>> 2) Change retention period and size values such as
>>>>>
>>>>> nifi.provenance.repository.max.storage.time=72 hours
>>>>> nifi.provenance.repository.max.storage.size=50 GB
>>>>>
>>>>> There are some other tweaks you can do in terms of
>>>>> threads/sharding/etc. that help with performance but the above are
>>>>> good to do now regardless of performance.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Tue, May 22, 2018 at 10:50 AM, Tim Dean <[email protected]> wrote:
>>>>>> Thanks Joe:
>>>>>>
>>>>>> I have not yet made any changes to the configuration. We are just
>>>>>> beginning the process of running our flow at scale and figuring out how
>>>>>> to best optimize the configuration, and I plan to make changes as needed
>>>>>> once we can get the flow functionally correct. Right now I’m having
>>>>>> difficulty doing that because of the lack of provenance events.
>>>>>>
>>>>>> Here are the provenance-related properties I have in my nifi.properties
>>>>>> file:
>>>>>>
>>>>>> # Provenance Repository Properties
>>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
>>>>>> nifi.provenance.repository.debug.frequency=1_000_000
>>>>>> nifi.provenance.repository.encryption.key.provider.implementation=
>>>>>> nifi.provenance.repository.encryption.key.provider.location=
>>>>>> nifi.provenance.repository.encryption.key.id=
>>>>>> nifi.provenance.repository.encryption.key=
>>>>>>
>>>>>> # Persistent Provenance Repository Properties
>>>>>> nifi.provenance.repository.directory.default=./provenance_repository
>>>>>> nifi.provenance.repository.max.storage.time=24 hours
>>>>>> nifi.provenance.repository.max.storage.size=1 GB
>>>>>> nifi.provenance.repository.rollover.time=30 secs
>>>>>> nifi.provenance.repository.rollover.size=100 MB
>>>>>> nifi.provenance.repository.query.threads=2
>>>>>> nifi.provenance.repository.index.threads=2
>>>>>> nifi.provenance.repository.compress.on.rollover=true
>>>>>> nifi.provenance.repository.always.sync=false
>>>>>> nifi.provenance.repository.journal.count=16
>>>>>> # Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
>>>>>> # EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
>>>>>> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
>>>>>> # FlowFile Attributes that should be indexed and made searchable. Some examples to consider are filename, uuid, mime.type
>>>>>> nifi.provenance.repository.indexed.attributes=
>>>>>> # Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
>>>>>> # but should provide better performance
>>>>>> nifi.provenance.repository.index.shard.size=500 MB
>>>>>> # Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
>>>>>> # the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
>>>>>> nifi.provenance.repository.max.attribute.length=65536
>>>>>> nifi.provenance.repository.concurrent.merge.threads=2
>>>>>> nifi.provenance.repository.warm.cache.frequency=1 hour
>>>>>>
>>>>>> # Volatile Provenance Respository Properties
>>>>>> nifi.provenance.repository.buffer.size=100000
>>>>>>
>>>>>> Thanks for any help you can provide on this
>>>>>>
>>>>>> -Tim
>>>>>>
>>>>>> On May 21, 2018, at 11:23 PM, Joe Witt <[email protected]> wrote:
>>>>>>
>>>>>> Tim,
>>>>>>
>>>>>> The default configuration for provenance event retention is
>>>>>> potentially a factor.
>>>>>>
>>>>>> Did you make any changes to those? Can you share relevant segments
>>>>>> from the nifi.properties file?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Mon, May 21, 2018 at 8:32 PM, Tim Dean <[email protected]> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am having a hard time troubleshooting a NiFi flow to see where things
>>>>>> are failing. I am trying to look at the provenance repository for a
>>>>>> variety of processors, but for some reason nothing more recent seems to
>>>>>> be appearing there. For example:
>>>>>>
>>>>>> 1) At approximately 10:30 this morning I started a flow and observed it
>>>>>>    for a couple of hours before disabling it to look into a few
>>>>>>    unexpected results.
>>>>>> 2) By right-clicking individual processors and selecting “View data
>>>>>>    provenance” I can see the NiFi Data Provenance view.
>>>>>> 3) For each processor I investigate I can see anywhere from 10 to 100
>>>>>>    provenance events that came in during the hours I was running my flow.
>>>>>> 4) A few hours later I restart the flow. Data once again flows through
>>>>>>    and after a while I stop my flow again.
>>>>>> 5) Now I again right-click on the processors and select “View data
>>>>>>    provenance”. No new provenance events seem to show up in the NiFi
>>>>>>    Data Provenance view.
>>>>>>
>>>>>> I have checked my search filter to make sure I am not accidentally
>>>>>> filtering out events. I have looked at the external systems that this
>>>>>> flow touches and confirmed that data is/was flowing through these
>>>>>> processors. But for some reason I can see no provenance records in the
>>>>>> UI.
>>>>>>
>>>>>> I am using NiFi version 1.5
>>>>>>
>>>>>> I have not (yet) changed any of the default settings for NiFi and how its
>>>>>> provenance repository is configured
>>>>>>
>>>>>> Any advice on where my provenance events are going or what I might be
>>>>>> doing that causes the provenance system to go silent on me?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> -Tim
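
For anyone wanting to check a nifi.properties file against the two changes Joe suggests in this thread (switching to WriteAheadProvenanceRepository and raising the retention time and size), here is a minimal sketch in Python. It is not part of NiFi. The function names parse_properties and provenance_warnings and the SAMPLE text are hypothetical, and the values it treats as "unchanged defaults" are simply the ones quoted in Tim's configuration above, not values read from NiFi itself.

```python
# Hypothetical helper, not part of NiFi: flags the provenance settings
# discussed in this thread if they still have the values Tim posted.

OLD_IMPL = "org.apache.nifi.provenance.PersistentProvenanceRepository"
NEW_IMPL = "org.apache.nifi.provenance.WriteAheadProvenanceRepository"

PREFIX = "nifi.provenance.repository."


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def provenance_warnings(props):
    """Return one warning string per setting still at the value quoted in the thread."""
    warnings = []
    if props.get(PREFIX + "implementation") == OLD_IMPL:
        warnings.append("implementation is the older repository; consider " + NEW_IMPL)
    if props.get(PREFIX + "max.storage.time") == "24 hours":
        warnings.append("max.storage.time is 24 hours; older events are dropped after a day")
    if props.get(PREFIX + "max.storage.size") == "1 GB":
        warnings.append("max.storage.size is 1 GB; at most 1 GB of provenance data is kept")
    return warnings


# Sample input copied from the configuration quoted in this thread.
SAMPLE = """
# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=1 GB
"""

if __name__ == "__main__":
    for warning in provenance_warnings(parse_properties(SAMPLE)):
        print("WARNING:", warning)
```

To run it against a real installation, read the file's text (for example conf/nifi.properties under the NiFi home directory) and pass it to parse_properties in place of SAMPLE. The parser only handles plain key=value lines, which is enough for the properties shown in this thread.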
