Ali,

Yes, that is correct. See the "Apache NiFi In Depth" section of the NiFi
docs [1].
I also still use this article for general Apache NiFi best practices when
handling large volumes of data [2]. It was written for Apache NiFi releases
before 1.0, but it still applies.
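
For illustration only, here is a rough nifi.properties sketch of the layout
discussed in this thread (one FlowFile volume, two Provenance volumes, eight
Content volumes). The /mnt/... paths and the suffixes after "directory."
(prov1, content1, and so on) are placeholders I made up, so substitute your
own mount points. As I understand it, NiFi spreads writes across every
directory listed for a repository, but please check the admin guide for your
version before relying on this.

```properties
# Hypothetical mount points; adjust to your environment.

# FlowFile repository: a single 100GB volume
nifi.flowfile.repository.directory=/mnt/nifi_flowfile/flowfile_repository

# Provenance repository: two 500GB volumes
nifi.provenance.repository.directory.prov1=/mnt/nifi_prov1/provenance_repository
nifi.provenance.repository.directory.prov2=/mnt/nifi_prov2/provenance_repository

# Content repository: eight 500GB volumes
nifi.content.repository.directory.content1=/mnt/nifi_content1/content_repository
nifi.content.repository.directory.content2=/mnt/nifi_content2/content_repository
nifi.content.repository.directory.content3=/mnt/nifi_content3/content_repository
nifi.content.repository.directory.content4=/mnt/nifi_content4/content_repository
nifi.content.repository.directory.content5=/mnt/nifi_content5/content_repository
nifi.content.repository.directory.content6=/mnt/nifi_content6/content_repository
nifi.content.repository.directory.content7=/mnt/nifi_content7/content_repository
nifi.content.repository.directory.content8=/mnt/nifi_content8/content_repository
```

With the single-mount alternative, each repository would instead have one
directory entry, all pointing at paths on the same 5.1TB volume.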

[1]
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#repositories
[2]
https://community.hortonworks.com/content/kbentry/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html


On Wed, May 17, 2017 at 9:27 AM Ali Nazemian <[email protected]> wrote:

> Hi Joe,
>
> Can you please explain why we would still see a performance increase from
> using multiple volumes for each repository? So, practically, using
> different volumes for FlowFile, Provenance and Content would overcome the
> space collision situation. Based on the example mentioned, 100GB FlowFile,
> 1TB Provenance and 4TB Content repositories should still have less
> throughput than 100GB FlowFile, 2x500GB Provenance and 8x500GB Content
> repositories in practice in a fully virtualized environment.
>
> Regards,
> Ali
>
> On Wed, May 17, 2017 at 10:06 PM, Joe Skora <[email protected]> wrote:
>
>> Ali,
>>
>> If you can separate the repositories onto separate physical spindles I
>> would expect a performance benefit, but if they are all on virtualized
>> storage I'd expect less performance benefit from separate volumes.  But,
>> even on virtualized storage, separate volumes can help reduce space
>> collision problems, preventing runaway system logs or the provenance
>> repository, for instance, from filling the disk and running the content
>> repository out of space.
>>
>> Regards,
>> Joe S
>>
>> On Wed, May 17, 2017 at 5:00 AM, Ali Nazemian <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> I was wondering whether there is any throughput benefit to having
>>> multiple disk mount points for FlowFile, Provenance and Content, versus
>>> using a single mount point for all of them, if we are using a fully
>>> virtualized deployment with shared storage. Suppose we have got 500TB
>>> disks in the shared storage. Which one do you suggest: 100GB for FlowFile,
>>> 2x500GB for Provenance and 8x500GB for the Content repository, or a single
>>> mount point of 5.1TB for the entire instance? In other words, would it be
>>> better for NiFi to keep track of load among the disk mount points, or to
>>> delegate it entirely to the shared storage?
>>>
>>> Regards,
>>> Ali
>>>
>>
>>
>
>
> --
> A.Nazemian
>
