Ali,

You have a lot of nice resources to work with there.  Personally, I'd
recommend the series of RAID-1 configurations, provided you keep in mind
that each RAID-1 volume can only tolerate the loss of a single disk.  As
long as the disks are monitored and failed ones are replaced quickly, this
works well in practice.  If there could be lapses in monitoring or delays
in replacement, it is perhaps safer to go with more redundancy or an
alternative RAID type.

I'd put the OS, the app install with the user and audit DB stuff, and the
application logs on one physical RAID volume.  Have a dedicated physical
volume for the flowfile repository.  It will not be able to use all the
space, but it could certainly benefit from having no other contention.
This would actually be a great place for SSDs.  Then split the remaining
volumes between content and provenance as you have.  You get to make the
overall performance versus retention decision.  Frankly, you have a great
system to work with and I suspect you're going to see excellent results
anyway.
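
As a rough sketch of how that layout could be expressed in nifi.properties:
each repository directory points at its own physical volume, and extra
content or provenance volumes are added by repeating the directory property
with a different suffix.  The /mnt/... mount points and the suffix names
(content1, prov1, ...) are hypothetical placeholders, not values from your
system, so adjust them to your actual mounts.

```properties
# Hypothetical mount points: one per physical RAID volume.
nifi.flowfile.repository.directory=/mnt/flowfile/flowfile_repository

# One entry per content volume; add more suffixes for further volumes.
nifi.content.repository.directory.content1=/mnt/content1/content_repository
nifi.content.repository.directory.content2=/mnt/content2/content_repository

# One entry per provenance volume.
nifi.provenance.repository.directory.prov1=/mnt/prov1/provenance_repository
nifi.provenance.repository.directory.prov2=/mnt/prov2/provenance_repository
```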

Conservatively speaking, expect around 50MB/s of throughput per volume in
the content repository, so if you end up with 8 of them you could achieve
upwards of 400MB/s sustained.  You'll then also want to make sure you have
a good 10G-based network setup.  Or you could dial back on the speed side
of the tradeoff and instead increase retention or disk loss tolerance.
Lots of ways to play the game.
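
To make the arithmetic behind that estimate concrete, here is a minimal
Python sketch.  It assumes throughput scales linearly with the number of
content volumes and ignores network protocol overhead, both of which are
simplifications; the function names are just illustrative.

```python
# Conservative per-volume figure from the estimate above (MB/s).
PER_VOLUME_MB_S = 50


def aggregate_throughput_mb_s(volumes, per_volume_mb_s=PER_VOLUME_MB_S):
    """Sustained MB/s, assuming linear scaling across content volumes."""
    return volumes * per_volume_mb_s


def required_gbit_s(mb_per_s):
    """Convert MB/s to Gbit/s using decimal units (1 MB/s = 8 Mbit/s)."""
    return mb_per_s * 8 / 1000


total = aggregate_throughput_mb_s(8)
print(f"{total} MB/s is about {required_gbit_s(total):.1f} Gbit/s")
```

With 8 content volumes this comes to 400MB/s, or about 3.2Gbit/s, which is
more than a 1G link can carry and is why a 10G-based network makes sense.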

There are no published SSD vs HDD performance benchmarks that I am aware
of, though producing some would be a good idea.  Having a hybrid of SSDs
and HDDs could offer a really solid performance/retention/cost tradeoff.
For example, SSDs for the OS/logs/provenance/flowfile with HDDs for the
content would be quite nice.  At that rate, to take full advantage of the
system you'd need very strong network infrastructure between NiFi and any
systems it is interfacing with, and your flows would need to be well tuned
for GC/memory efficiency.

Thanks
Joe

On Thu, Oct 13, 2016 at 2:50 AM, Ali Nazemian <alinazem...@gmail.com> wrote:

> Dear Nifi Users/ developers,
> Hi,
>
> I was wondering is there any benchmark about the question that is it
> better to dedicate disk control to Nifi or using RAID for this purpose? For
> example, which of these scenarios is recommended from the performance point
> of view?
> Scenario 1:
> 24 disk in total
> 2 disk- raid 1 for OS and fileflow repo
> 2 disk- raid 1 for provenance repo1
> 2 disk- raid 1 for provenance repo2
> 2 disk- raid 1 for content repo1
> 2 disk- raid 1 for content repo2
> 2 disk- raid 1 for content repo3
> 2 disk- raid 1 for content repo4
> 2 disk- raid 1 for content repo5
> 2 disk- raid 1 for content repo6
> 2 disk- raid 1 for content repo7
> 2 disk- raid 1 for content repo8
> 2 disk- raid 1 for content repo9
>
>
> Scenario 2:
> 24 disk in total
> 2 disk- raid 1 for OS and fileflow repo
> 4 disk- raid 10 for provenance repo1
> 18 disk- raid 10 for content repo1
>
> Moreover, is there any benchmark for SSD vs HDD performance for Nifi?
> Thank you very much.
>
> Best regards,
> Ali
>
