Hey Phil,
NiFi will not spread the content of a single file over multiple partitions. It
will write the content of FlowFile 1 to content repo 1, then write the next
FlowFile to repo 2, and so on. So it does round-robin across the repos, but it
never spreads a single FlowFile across multiple repos.
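
To make that concrete, here is a toy model of the placement. It is only a sketch of the behavior described above, not NiFi's actual code; the function name assign_repos and the FlowFile names are made up for illustration, and the repo names are the ones from your mail.

```python
from itertools import cycle


def assign_repos(flowfile_names, repo_names):
    """Toy model: each FlowFile's content lands in exactly one repo.

    Repos are picked round-robin, one whole FlowFile at a time.
    This is an illustration only, not NiFi's implementation.
    """
    repos = cycle(repo_names)
    return {name: next(repos) for name in flowfile_names}


# Hypothetical FlowFiles spread over the three repos from the question.
placement = assign_repos(
    ["ff1", "ff2", "ff3", "ff4"],
    ["cont1", "cont2", "cont3"],
)

for flowfile, repo in placement.items():
    print(flowfile, "->", repo)
```

Each FlowFile maps to a single repo, so reading one FlowFile back only ever touches one volume, while a stream of FlowFiles is spread across all three.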
Thanks
-Mark
> On Dec 11, 2023, at 8:45 PM, Phillip Lord wrote:
>
>
> Hello NiFi comrades,
>
> Here's my scenario...
> Let's say I have a NiFi cluster running on EC2 instances with attached EBS
> volumes serving as their repos. They've split up their content-repos into
> three content-repos per node (cont1, cont2, cont3), each being a dedicated
> EBS volume. My understanding is that the content-claims for a single file
> can potentially span across more than one of these repos (correct me if I've
> lost my mind over the years).
> For instance, if you have a 1 MB file, and let's say your
> max.content.claim.size is 100KB, that's roughly ten 100KB claims potentially
> split up across the 3 EBS volumes. So if NiFi is trying to move that file to
> S3 or something, for instance... it needs to be read from each of the volumes.
>
> Whereas if it was a single EBS volume for the cont-repo... it would read from
> the single volume, which I would think would be more performant? Or does
> spreading out any IO contention across volumes provide more of a benefit?
> I know there are different levels of EBS volumes... but not factoring that in
> for right now.
>
> Appreciate any insight... trying to determine the best configuration.
>
> Thanks,
> Phil
>
>