Mark -

We are running with only the default settings, which I suspect gives us a heap 
size of just 512 MB. We will certainly be raising the limit as we analyze our 
usage at scale to see how much memory we actually need.
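
For reference, a minimal sketch of how the configured heap could be confirmed rather than guessed. It assumes the stock layout, where the JVM heap flags are set in conf/bootstrap.conf as java.arg.N=-Xms... and java.arg.N=-Xmx... entries; the function name heap_args is made up for illustration.

```shell
#!/bin/sh
# Print the JVM heap flags found in a NiFi bootstrap.conf.
# Assumption: heap flags appear as java.arg.<N>=-Xms... / -Xmx... lines.
heap_args() {
    conf_file="$1"
    # Keep only uncommented java.arg entries whose value starts with -Xms or -Xmx,
    # then strip the "java.arg.N=" key so only the flag itself is printed.
    grep -E '^java\.arg\.[0-9]+=-Xm[sx]' "$conf_file" | sed 's/^[^=]*=//'
}
```

With the bootstrap file reported further down in this thread, that would be called as: heap_args /opt/nifi-1.5.0/conf/bootstrap.conf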

-Tim


> On May 23, 2018, at 10:17 AM, Mark Payne <[email protected]> wrote:
> 
> Tim,
> 
> Typically when I see that issue, it's due to OutOfMemory errors or constant 
> garbage collection. How large is your heap?
> FWIW, an alternative to the dump command is "kill -3 <nifi pid>". The -3 will 
> cause Java to perform a thread dump, so you can do "cat run/nifi.pid | xargs kill -3".
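
A slightly more defensive sketch of the kill -3 approach above. It assumes the PID file lives at run/nifi.pid under the NiFi home directory and that the JVM writes the dump to its standard output, which in a stock install should end up in logs/nifi-bootstrap.log. The helper names read_pid and thread_dump are hypothetical.

```shell
#!/bin/sh
# Read a PID from a file and print it only if it is a plain non-negative integer.
read_pid() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1
    pid=$(tr -d '[:space:]' < "$pid_file")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: refuse to signal anything
    esac
    printf '%s\n' "$pid"
}

# Ask the JVM for a thread dump by sending signal 3 (SIGQUIT).
# Only point this at a Java process: for most other programs SIGQUIT is fatal.
thread_dump() {
    pid=$(read_pid "$1") || { echo "no usable PID in $1" >&2; return 1; }
    kill -3 "$pid"
}
```

Usage would be along the lines of: thread_dump /opt/nifi-1.5.0/run/nifi.pid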
> 
> 
> 
>> On May 23, 2018, at 11:13 AM, Tim Dean <[email protected]> wrote:
>> 
>> Joe -
>> 
>> I am currently in this state where new provenance events are not showing up 
>> in the UI. (I have not yet made the configuration changes you suggested below.)
>> 
>> When I try to run the dump command you suggested, I get a timeout error:
>> 
>> Java home: /usr/lib/jvm/default-java/jre
>> NiFi home: /opt/nifi-1.5.0
>> 
>> Bootstrap Config File: /opt/nifi-1.5.0/conf/bootstrap.conf
>> 
>> Exception in thread "main" java.net.SocketTimeoutException: Read timed out
>> at java.net.SocketInputStream.socketRead0(Native Method)
>> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>> at java.net.SocketInputStream.read(SocketInputStream.java:171)
>> at java.net.SocketInputStream.read(SocketInputStream.java:141)
>> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>> at java.io.InputStreamReader.read(InputStreamReader.java:184)
>> at java.io.BufferedReader.fill(BufferedReader.java:161)
>> at java.io.BufferedReader.readLine(BufferedReader.java:324)
>> at java.io.BufferedReader.readLine(BufferedReader.java:389)
>> at org.apache.nifi.bootstrap.RunNiFi.dump(RunNiFi.java:707)
>> at org.apache.nifi.bootstrap.RunNiFi.main(RunNiFi.java:233)
>> 
>> 
>> The system does seem to be running, although a little slow right now. I’m 
>> not sure if the error above is expected given that the system is running a 
>> bit slowly, or if there is something more fundamentally wrong with my 
>> system. I did restart the NiFi service and it seems to clear out the 
>> problem. Provenance events are once again showing up in the user interface. 
>> So even though NiFi seemed to be running (flow files were being processed, 
>> the user interface was slow but functioning) it appears that provenance 
>> reporting/indexing and whatever is used by the dump utility were not 
>> functioning.
>> 
>> We’re in the process of assessing our memory use and adjusting configuration 
>> as needed, so some of these problems may go away once we’ve tuned the 
>> system. Other than that, are there specific tools I should be using or 
>> logging I should be monitoring to track down problems with provenance 
>> reporting?
>> 
>> Thanks for all your help.
>> 
>> -Tim
>> 
>> 
>> 
>>> On May 22, 2018, at 12:38 PM, Joe Witt <[email protected]> wrote:
>>> 
>>> I agree - good point :)
>>> 
>>> It is possible indexing was stuck with the older implementation.  Can
>>> you run   'bin/nifi.sh dump' and share the logs/nifi-bootstrap.log
>>> file if it is in that state/behavior again?
>>> 
>>> Thanks
>>> 
>>> On Tue, May 22, 2018 at 1:33 PM, Tim Dean <[email protected]> wrote:
>>>> Thanks Joe - I’ll try those changes and report back with the results.
>>>> 
>>>> Just out of curiosity, if my problem is happening because I am generating 
>>>> more than 1 GB of provenance data, wouldn’t I expect to see the older 
>>>> provenance data being deleted, leaving the newer provenance data intact? 
>>>> It seems to me that my old data is still there and my new data is not.
>>>> 
>>>> -Tim
>>>> 
>>>> 
>>>>> On May 22, 2018, at 12:15 PM, Joe Witt <[email protected]> wrote:
>>>>> 
>>>>> Tim
>>>>> 
>>>>> Got ya.  So yeah keep in mind you'll only have at most 1GB of prov
>>>>> data and for at most 24 hours with that configuration.  Also, as James
>>>>> mentioned the default searching for provenance can be too restrictive
>>>>> and you have to pay close attention to time stamps relative to the
>>>>> system doing the query/etc..  In general though it should work just
>>>>> fine.
>>>>> 
>>>>> 1) definitely use the newer provenance.  We need to change the default
>>>>> as the new one is very fast and very stable.
>>>>> 
>>>>> To do this change
>>>>> 
>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
>>>>> to
>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
>>>>> 
>>>>> 2) Change retention period and size values such as
>>>>> 
>>>>> nifi.provenance.repository.max.storage.time=72 hours
>>>>> nifi.provenance.repository.max.storage.size=50 GB
>>>>> 
>>>>> There are some other tweaks you can do in terms of
>>>>> threads/sharding/etc.. that help with performance but the above are
>>>>> good to do now regardless of performance.
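
The two changes above can be sketched as a small script. This is only an illustration: it assumes the properties live in conf/nifi.properties, the function name apply_provenance_settings is made up, and it prints an edited copy to stdout instead of modifying the file in place.

```shell
#!/bin/sh
# Print a copy of nifi.properties with the provenance settings suggested above.
# The original file is left untouched; redirect the output to review it first.
apply_provenance_settings() {
    props_file="$1"
    sed \
        -e 's|^nifi\.provenance\.repository\.implementation=.*|nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository|' \
        -e 's|^nifi\.provenance\.repository\.max\.storage\.time=.*|nifi.provenance.repository.max.storage.time=72 hours|' \
        -e 's|^nifi\.provenance\.repository\.max\.storage\.size=.*|nifi.provenance.repository.max.storage.size=50 GB|' \
        "$props_file"
}
```

One way to use it: run apply_provenance_settings conf/nifi.properties > /tmp/nifi.properties.new, diff the result against the original, then copy it into place and restart NiFi so the new values take effect.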
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> On Tue, May 22, 2018 at 10:50 AM, Tim Dean <[email protected]> wrote:
>>>>>> Thanks Joe:
>>>>>> 
>>>>>> I have not yet made any changes to the configuration. We are just beginning
>>>>>> the process of running our flow at scale and figuring out how to best
>>>>>> optimize the configuration, and I plan to make changes as needed once we can
>>>>>> get the flow functionally correct. Right now I’m having difficulty doing
>>>>>> that because of the lack of provenance events.
>>>>>> 
>>>>>> Here are the provenance-related properties I have in my nifi.properties file:
>>>>>> 
>>>>>> # Provenance Repository Properties
>>>>>> nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
>>>>>> nifi.provenance.repository.debug.frequency=1_000_000
>>>>>> nifi.provenance.repository.encryption.key.provider.implementation=
>>>>>> nifi.provenance.repository.encryption.key.provider.location=
>>>>>> nifi.provenance.repository.encryption.key.id=
>>>>>> nifi.provenance.repository.encryption.key=
>>>>>> 
>>>>>> # Persistent Provenance Repository Properties
>>>>>> nifi.provenance.repository.directory.default=./provenance_repository
>>>>>> nifi.provenance.repository.max.storage.time=24 hours
>>>>>> nifi.provenance.repository.max.storage.size=1 GB
>>>>>> nifi.provenance.repository.rollover.time=30 secs
>>>>>> nifi.provenance.repository.rollover.size=100 MB
>>>>>> nifi.provenance.repository.query.threads=2
>>>>>> nifi.provenance.repository.index.threads=2
>>>>>> nifi.provenance.repository.compress.on.rollover=true
>>>>>> nifi.provenance.repository.always.sync=false
>>>>>> nifi.provenance.repository.journal.count=16
>>>>>> # Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
>>>>>> # EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
>>>>>> nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
>>>>>> # FlowFile Attributes that should be indexed and made searchable.  Some examples to consider are filename, uuid, mime.type
>>>>>> nifi.provenance.repository.indexed.attributes=
>>>>>> # Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
>>>>>> # but should provide better performance
>>>>>> nifi.provenance.repository.index.shard.size=500 MB
>>>>>> # Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
>>>>>> # the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
>>>>>> nifi.provenance.repository.max.attribute.length=65536
>>>>>> nifi.provenance.repository.concurrent.merge.threads=2
>>>>>> nifi.provenance.repository.warm.cache.frequency=1 hour
>>>>>> 
>>>>>> # Volatile Provenance Respository Properties
>>>>>> nifi.provenance.repository.buffer.size=100000
>>>>>> 
>>>>>> 
>>>>>> Thanks for any help you can provide on this
>>>>>> 
>>>>>> -Tim
>>>>>> 
>>>>>> On May 21, 2018, at 11:23 PM, Joe Witt <[email protected]> wrote:
>>>>>> 
>>>>>> Tim,
>>>>>> 
>>>>>> The default configuration for provenance event retention is
>>>>>> potentially a factor.
>>>>>> 
>>>>>> Did you make any changes to those?  Can you share relevant segments
>>>>>> from the nifi.properties file?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> On Mon, May 21, 2018 at 8:32 PM, Tim Dean <[email protected]> wrote:
>>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> I am having a hard time troubleshooting a NiFi flow to see where things are
>>>>>> failing. I am trying to look at the provenance repository for a variety of
>>>>>> processors, but for some reason nothing more recent seems to be appearing
>>>>>> there. For example:
>>>>>> 
>>>>>> 1) At approximately 10:30 this morning I started a flow and observed it for a
>>>>>>    couple of hours before disabling it to look into a few unexpected results.
>>>>>> 2) By right-clicking individual processors and selecting “View data provenance”
>>>>>>    I can see the NiFi Data Provenance view.
>>>>>> 3) For each processor I investigate I can see anywhere from 10 to 100
>>>>>>    provenance events that came in during the hours I was running my flow.
>>>>>> 4) A few hours later I restart the flow. Data once again flows through and
>>>>>>    after a while I stop my flow again.
>>>>>> 5) Now I again right-click on the processors and select “View data provenance”.
>>>>>>    No new provenance events seem to show up in the NiFi Data Provenance view.
>>>>>> 
>>>>>> 
>>>>>> I have checked my search filter to make sure I am not accidentally filtering
>>>>>> out events. I have looked at the external systems that this flow touches and
>>>>>> confirmed that data is/was flowing through these processors. But for some
>>>>>> reason I can see no new provenance records in the UI.
>>>>>> 
>>>>>> I am using NiFi version 1.5
>>>>>> 
>>>>>> I have not (yet) changed any of the default settings for NiFi and how its
>>>>>> provenance repository is configured
>>>>>> 
>>>>>> Any advice on where my provenance events are going or what I might be doing
>>>>>> that causes the provenance system to go silent on me?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> -Tim
>>>>>> 
>>>>>> 
>>>> 
>> 
> 
