Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Ali Nazemian
Hi Joe, Yeah, that's right. Thank you very much for your help. Cheers, Ali On May 18, 2017 1:03 AM, "Joe Skora" wrote: > I think the notes on multiple locations for a repository are based on > independent disks not shared storage. That's why I don't think it will > help in

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Joe Skora
I think the notes on multiple locations for a repository are based on independent disks, not shared storage. That's why I don't think it will help in a shared storage environment. Yes, I can see a potential performance loss if NiFi is given multiple locations for a repository if the underlying

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Ali Nazemian
Hi Joe, I understand the case for DAS, and that it is the recommended option for a production environment. But with shared storage such as SAN or NAS, I am not sure how we would see even slightly more throughput from having multiple disk volumes for the content repo. At the storage

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Ali Nazemian
Hi Juan, Thank you very much, I have already seen those documents. The Direct Attached Storage scenario is completely clear to me, but I am investigating the case of a fully virtualized platform with shared storage. Cheers, Ali On Thu, May 18, 2017 at 12:00 AM, Joe Skora

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Joe Skora
What I meant is that, in general, multiple disks have a higher potential maximum throughput than a single disk. For example, if a single 1TB disk capable of 160MB/s is split into 4x 250GB volumes, the total combined bandwidth of the volumes is still 160MB/s, but if data is distributed across four
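
The arithmetic in this message can be sketched in a few lines of Python. This is only an illustration: the message is cut off in the archive, so the four-disk case below is an assumption (four independent physical disks, each rated at the same 160MB/s, with perfectly even distribution), and the function and disk names are hypothetical. Real storage rarely scales this cleanly.

```python
def max_aggregate_throughput(volume_to_disk, disk_mb_per_s):
    """Idealized upper bound: volumes that share a physical disk share its bandwidth."""
    distinct_disks = set(volume_to_disk.values())
    return sum(disk_mb_per_s[disk] for disk in distinct_disks)


# One 1TB disk (160MB/s) split into four 250GB volumes: all volumes share one disk.
split_single_disk = max_aggregate_throughput(
    {"vol1": "diskA", "vol2": "diskA", "vol3": "diskA", "vol4": "diskA"},
    {"diskA": 160},
)

# ASSUMPTION for illustration: four independent disks, each rated 160MB/s.
four_independent_disks = max_aggregate_throughput(
    {"vol1": "diskA", "vol2": "diskB", "vol3": "diskC", "vol4": "diskD"},
    {"diskA": 160, "diskB": 160, "diskC": 160, "diskD": 160},
)

print(split_single_disk, four_independent_disks)
```

On shared storage, several volumes may map to the same underlying devices, which is the first case: adding volumes does not add bandwidth.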

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Juan Sequeiros
Ali, Yes, that is correct. See the section "Apache NiFi In Depth" of the NiFi docs. [1] And I still use this article for general Apache NiFi best practices when handling high volumes of data. [2] It is tailored for pre-1.0 releases of Apache NiFi but still applies. [1]

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Ali Nazemian
Hi Joe, Can you please explain why we would still see a performance increase from using multiple volumes for each repository? So practically, using different volumes for FlowFile, Provenance and Content would avoid the space collision situation. Based on the mentioned example so

Re: Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Joe Skora
Ali, If you can separate the repositories onto separate physical spindles I would expect a performance benefit, but if they are all on virtualized storage I'd expect less performance benefit from separate volumes. But, even on virtualized storage, separate volumes can help reduce space collision
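
As a rough sketch of what separate volumes per repository look like in practice, the Python snippet below renders the corresponding `nifi.properties` entries. The three property names are the standard NiFi repository directory settings; the mount paths and the helper name are hypothetical and only for illustration.

```python
# Standard nifi.properties keys for the three repositories.
REPO_PROPERTIES = {
    "flowfile": "nifi.flowfile.repository.directory",
    "content": "nifi.content.repository.directory.default",
    "provenance": "nifi.provenance.repository.directory.default",
}


def render_repo_properties(mounts):
    """Return nifi.properties lines placing each repository on its own mount."""
    return "\n".join(
        f"{REPO_PROPERTIES[repo]}={path}" for repo, path in mounts.items()
    )


# Hypothetical mount points, one volume per repository.
example = render_repo_properties(
    {
        "flowfile": "/mnt/nifi_flowfile/flowfile_repository",
        "content": "/mnt/nifi_content/content_repository",
        "provenance": "/mnt/nifi_provenance/provenance_repository",
    }
)
print(example)
```

With one volume per repository, a full content repository cannot exhaust the space that the FlowFile or Provenance repository needs, regardless of whether the volumes share physical disks.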

Fully virtualized NiFi cluster with shared storage

2017-05-17 Thread Ali Nazemian
Hi all, I was wondering whether there is any throughput benefit from having multiple disk mount points for FlowFile, Provenance and Content, as opposed to using a single mount point for all of them, if we are using a fully virtualized deployment with shared storage. Suppose we have got 500TB disks in the