Re: Why are my journal files so large on node 2 of cluster?

2024-03-11 Thread Mark Payne
David,

Makes sense. Large values should not be added as attributes. Attributes are 
designed for small String values: think 100-200 characters, generally. A couple 
of KB can be fine as well, but it can significantly reduce performance. If the 
intent is to “stash the content” so that you can change it and then perform 
enrichment, you should take a look at the ForkEnrichment / JoinEnrichment 
processors.
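
At a high level, the pattern looks something like this (a rough sketch; the 
lookup processor in the enrichment branch is just a placeholder for whatever 
does your DB query):

    (FlowFile with the original content)
                  |
            ForkEnrichment
             /          \
        original      enrichment
            |              |
            |         DB lookup (e.g., LookupRecord or ExecuteSQL)
             \            /
            JoinEnrichment  (correlates the two copies and combines them,
                             e.g. with the Wrapper or SQL join strategy)
                  |
    (enriched FlowFile, with the original content never copied into an attribute)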

Thanks
-Mark

On Mar 11, 2024, at 2:05 PM, David Early wrote:

Mark,

Yes, it was the flowfile repository.

Of all your points, large attributes are most likely our issue.  One of our 
folks was caching the flowfile content (which can occasionally be large) in an 
attribute ahead of a DB lookup (which overwrites the content) and then 
reinstating the content after merging with the DB lookup results.

The attribute was not removed after the merge. We have added a couple of steps 
to remove the attribute this morning, but even its brief presence may be enough 
to cause the spikes.

I have since attached a very large disk and I can see the occasional spikes:

[image: image.png]
At 22% on a 512G disk, that is over 110G.  What isn't clear is why it is not 
consistently spiking.

We have made some changes to how long the attribute lives and will monitor 
over the next couple of days, but we will likely need to cache the contents 
somewhere and retrieve them later unless someone knows of a better solution 
here.

Thanks for the guidance

Dave


On Fri, Mar 8, 2024 at 7:05 AM Mark Payne <marka...@hotmail.com> wrote:
Dave,

When you say that the journal files are huge, I presume you mean the FlowFile 
repository?

There are generally 4 things that can cause this:
- OutOfMemoryError causing the FlowFile repo not to properly checkpoint
- Out of Disk Space causing the FlowFile repo not to properly checkpoint
- Out of open file handles causing the FlowFile repo not to properly checkpoint
- Creating a lot of huge attributes on your FlowFiles.

The first 3 situations can be identified by looking for errors in the logs.
For the fourth one, you need to understand whether or not you’re creating huge 
FlowFile attributes. Generally, attributes should be very small - 100-200 
characters or less, ideally. It’s possible that you have a flow that creates 
huge attributes but the flow is only running on the Primary Node, and Node 2 is 
your Primary Node, which would cause this to occur only on this node.

Thanks
-Mark


> On Mar 7, 2024, at 9:24 PM, David Early via users <users@nifi.apache.org> wrote:
>
> I have a massive issue: I have a 2 node cluster (using 5 external zookeepers 
> on other boxes), and for some reason on node 2 I have MASSIVE journal files.
>
> I am round-robining data between the nodes, but for some reason node 2 just 
> fills up.  This is the second time this has happened this week.
>
> What should I do?  nifi.properties is the same on both systems (except for 
> local hostnames).
>
> Any ideas of what might be causing one node to overload?
>
> Dave
>
>



--
David Early, Ph.D.
david.ea...@grokstream.com
720-470-7460 Cell



Re: Why are my journal files so large on node 2 of cluster?

2024-03-11 Thread David Early via users
Mark,

Yes, it was the flowfile repository.

Of all your points, large attributes are most likely our issue.  One of
our folks was caching the flowfile content (which can occasionally be large) in
an attribute ahead of a DB lookup (which overwrites the content) and
then reinstating the content after merging with the DB lookup results.

The attribute was not removed after the merge. We have added a couple of
steps to remove the attribute this morning, but even its brief presence may
be enough to cause the spikes.

I have since attached a very large disk and I can see the
occasional spikes:

[image: image.png]
At 22% on a 512G disk, that is over 110G.  What isn't clear is why it is
not consistently spiking.

We have made some changes to how long the attribute lives and will
monitor over the next couple of days, but we will likely need to cache the
contents somewhere and retrieve them later unless someone knows of a better
solution here.
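
One way to do the caching (a rough sketch, assuming the stock
PutDistributedMapCache / FetchDistributedMapCache processors backed by a
DistributedMapCacheClientService; ${correlation.id} is just a placeholder for
whatever key we end up using) might be:

    PutDistributedMapCache   (Cache Entry Identifier = ${correlation.id};
                              stashes the current content in the cache)
              |
    ... DB lookup and merge (content gets overwritten here) ...
              |
    FetchDistributedMapCache (same identifier; restores the stashed content)

That would keep the large payload out of the attribute map and out of the
FlowFile repository entirely.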

Thanks for the guidance

Dave


On Fri, Mar 8, 2024 at 7:05 AM Mark Payne wrote:

> Dave,
>
> When you say that the journal files are huge, I presume you mean the
> FlowFile repository?
>
> There are generally 4 things that can cause this:
> - OutOfMemoryError causing the FlowFile repo not to properly checkpoint
> - Out of Disk Space causing the FlowFile repo not to properly checkpoint
> - Out of open file handles causing the FlowFile repo not to properly
> checkpoint
> - Creating a lot of huge attributes on your FlowFiles.
>
> The first 3 situations can be identified by looking for errors in the logs.
> For the fourth one, you need to understand whether or not you’re creating
> huge FlowFile attributes. Generally, attributes should be very small -
> 100-200 characters or less, ideally. It’s possible that you have a flow
> that creates huge attributes but the flow is only running on the Primary
> Node, and Node 2 is your Primary Node, which would cause this to occur only
> on this node.
>
> Thanks
> -Mark
>
>
> > On Mar 7, 2024, at 9:24 PM, David Early via users wrote:
> >
> > I have a massive issue: I have a 2 node cluster (using 5 external
> zookeepers on other boxes), and for some reason on node 2 I have MASSIVE
> journal files.
> >
> > I am round-robining data between the nodes, but for some reason node 2
> just fills up.  This is the second time this has happened this week.
> >
> > What should I do?  nifi.properties is the same on both systems (except
> for local hostnames).
> >
> > Any ideas of what might be causing one node to overload?
> >
> > Dave
> >
> >
>
>

-- 
David Early, Ph.D.
david.ea...@grokstream.com
720-470-7460 Cell


Re: Why are my journal files so large on node 2 of cluster?

2024-03-08 Thread Mark Payne
Dave,

When you say that the journal files are huge, I presume you mean the FlowFile 
repository?

There are generally 4 things that can cause this:
- OutOfMemoryError causing the FlowFile repo not to properly checkpoint
- Out of Disk Space causing the FlowFile repo not to properly checkpoint
- Out of open file handles causing the FlowFile repo not to properly checkpoint
- Creating a lot of huge attributes on your FlowFiles.

The first 3 situations can be identified by looking for errors in the logs.
For the fourth one, you need to understand whether or not you’re creating huge 
FlowFile attributes. Generally, attributes should be very small - 100-200 
characters or less, ideally. It’s possible that you have a flow that creates 
huge attributes but the flow is only running on the Primary Node, and Node 2 is 
your Primary Node, which would cause this to occur only on this node.
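
As a quick way to check for the first three, something like this minimal Python
sketch should do (it assumes the default logback setup writing to
logs/nifi-app.log; adjust the path for your install):

    # Scan nifi-app.log for the three log-detectable causes listed above.
    from pathlib import Path

    signatures = {
        "OutOfMemoryError": "heap exhaustion",
        "No space left on device": "disk full",
        "Too many open files": "file handle exhaustion",
    }

    log_path = Path("logs/nifi-app.log")  # point this at your NiFi logs directory
    with log_path.open(errors="replace") as log:
        for line in log:
            for signature, cause in signatures.items():
                if signature in line:
                    print(f"[{cause}] {line.rstrip()}")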

Thanks
-Mark


> On Mar 7, 2024, at 9:24 PM, David Early via users  
> wrote:
> 
> I have a massive issue: I have a 2 node cluster (using 5 external zookeepers 
> on other boxes), and for some reason on node 2 I have MASSIVE journal files.  
> 
> I am round-robining data between the nodes, but for some reason node 2 just 
> fills up.  This is the second time this has happened this week.
> 
> What should I do?  nifi.properties is the same on both systems (except for 
> local hostnames).
> 
> Any ideas of what might be causing one node to overload?
> 
> Dave
> 
> 



Why are my journal files so large on node 2 of cluster?

2024-03-07 Thread David Early via users
I have a massive issue: I have a 2 node cluster (using 5 external
zookeepers on other boxes), and for some reason on node 2 I have MASSIVE
journal files.

I am round-robining data between the nodes, but for some reason node 2
just fills up.  This is the second time this has happened this week.

What should I do?  nifi.properties is the same on both systems (except for
local hostnames).

Any ideas of what might be causing one node to overload?

Dave