Hi Joe,
Yeah, that's right. Thank you very much for your help.
Cheers,
Ali
On May 18, 2017 1:03 AM, "Joe Skora" wrote:
> I think the notes on multiple locations for a repository are based on
> independent disks not shared storage. That's why I don't think it will
> help in
I think the notes on multiple locations for a repository are based on
independent disks not shared storage. That's why I don't think it will
help in a shared storage environment.
Yes, I can see a potential performance loss if NiFi is given multiple
locations for a repository if the underlying
Hi Joe,
I understand the situation of using DAS and it is a recommended option for
a production environment, but in the case of having a shared storage like
SAN or NAS, I am not sure how we would see even slightly more throughput from
having multiple disk volumes for the content repo.
At the storage
Hi Juan,
Thank you very much, I have already seen those documents. So it is
completely clear to me for a Direct Attached Storage scenario, but I am
investigating the situation of a fully virtualized platform with a shared
storage.
Cheers,
Ali
On Thu, May 18, 2017 at 12:00 AM, Joe Skora
What I meant is that in general, multiple disks have a higher potential
maximum throughput than a single disk. For example, if a single 1TB disk
capable of 160MB/s is split into 4x 250GB volumes the total combined
bandwidth of the volumes is still 160MB/s, but if data is distributed
across four
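
The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the 160MB/s figure comes from the message above, while the helper name and the assumption that each independent disk sustains that same rate are hypothetical and ignore controller, network and virtualization limits.

```python
# Illustrative upper-bound arithmetic only, not a benchmark.
# Assumption: every physical disk sustains the same sequential rate.

def combined_throughput_mb_s(per_disk_mb_s: float, physical_disks: int) -> float:
    """Best-case aggregate throughput across independent physical disks."""
    return per_disk_mb_s * physical_disks

# One 1TB disk split into 4x 250GB volumes: all volumes share one device,
# so the volume count does not raise the ceiling.
one_disk_four_volumes = combined_throughput_mb_s(160, physical_disks=1)

# Four independent disks with one volume each (assumed 160MB/s apiece).
four_independent_disks = combined_throughput_mb_s(160, physical_disks=4)

print(one_disk_four_volumes, four_independent_disks)
```

On shared storage such as a SAN or NAS, separate volumes may map to the same underlying devices, in which case the first case is the closer model.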
Ali,
Yes, that is correct. See the "Apache NiFi In Depth" section of the NiFi
docs. [1]
I also still use this article for general Apache NiFi best practices when
handling high volumes of data. [2] It is tailored for Apache NiFi pre-1.0
releases but still applies.
[1]
Hi Joe,
Can you please explain why we would still see a performance increase from
using multiple volumes for each repository? So, practically, using different
volumes for FlowFile, Provenance and Content would avoid the space collision
situation. Based on the mentioned example so
Ali,
If you can separate the repositories onto separate physical spindles I
would expect a performance benefit, but if they are all on virtualized
storage I'd expect less performance benefit from separate volumes. But,
even on virtualized storage, separate volumes can help reduce space
collision
Hi all,
I was wondering whether there is any performance or throughput benefit to
having multiple disk mount points for FlowFile, Provenance and Content, rather
than a single mount point for all of them, if we are using a fully virtualized
deployment with a shared storage. Suppose we have got 500TB disks in the