if they are physically separate the diff should be quite noticeable.

On Thu, Sep 13, 2018, 7:36 PM Phil H <gippyp...@gmail.com> wrote:

> Potentially. We're looking to see how the multiple disks help before
> committing to spending money on new hardware :)
>
> On Fri, 14 Sep 2018 at 10:48, Joe Witt <joe.w...@gmail.com> wrote:
>
> > phil,
> >
> > as you add dirs it will just start using them.  if you want to no longer
> > use the current dir it might be more involved.
> >
> > does that help?
> >
> > thanks
> >
> > On Thu, Sep 13, 2018, 4:36 PM Phil H <gippyp...@gmail.com> wrote:
> >
> > > Follow up question - how do I transition to this new structure? Should I
> > > shut down NiFi and move the contents of the legacy single directories into
> > > one of the new ones? For example:
> > >
> > > mv /usr/nifi/content_repository /nifi/repos/content-1
> > >
> > > TIA
> > > Phil
> > >
> > >
> > > On Wed, 12 Sep 2018 at 06:15, Mark Payne <marka...@hotmail.com> wrote:
> > >
> > > > Phil,
> > > >
> > > > For the content repository, you can configure the directory by changing
> > > > the value of the "nifi.content.repository.directory.default" property in
> > > > nifi.properties. The suffix here, "default", is the name of this
> > > > "container". You can have multiple containers by adding extra
> > > > properties. So, for example, you could set:
> > > >
> > > > nifi.content.repository.directory.content1=/nifi/repos/content-1
> > > > nifi.content.repository.directory.content2=/nifi/repos/content-2
> > > > nifi.content.repository.directory.content3=/nifi/repos/content-3
> > > > nifi.content.repository.directory.content4=/nifi/repos/content-4
> > > >
> > > > Similarly, the Provenance Repo property is named
> > > > "nifi.provenance.repository.directory.default"
> > > > and can have any number of "containers":
> > > >
> > > > nifi.provenance.repository.directory.prov1=/nifi/repos/prov-1
> > > > nifi.provenance.repository.directory.prov2=/nifi/repos/prov-2
> > > > nifi.provenance.repository.directory.prov3=/nifi/repos/prov-3
> > > > nifi.provenance.repository.directory.prov4=/nifi/repos/prov-4
> > > >
> > > > When NiFi writes to these, it does a Round Robin so that if you're writing
> > > > to 4 FlowFiles' content simultaneously with different threads, you're able
> > > > to get the full throughput of each disk. (So if you have 4 disks for your
> > > > content repo, each capable of writing 100 MB/sec, then your effective
> > > > write rate to the content repo is 400 MB/sec.) Similarly with the
> > > > Provenance Repository.
> > > >
> > > > Doing this will also allow you to hold a larger 'archive' of content and
> > > > provenance data, because it will span the archive across all of the
> > > > listed directories as well.
> > > >
> > > > Thanks
> > > > -Mark
> > > >
> > > >
> > > >
> > > > > On Sep 11, 2018, at 3:35 PM, Phil H <gippyp...@gmail.com> wrote:
> > > > >
> > > > > Thanks Mark, this is great advice.
> > > > >
> > > > > Disk access is certainly an issue with the current set up. I will
> > > > > certainly shoot for NVMe disks in the build. How does NiFi get configured
> > > > > to span its repositories across multiple physical disks?
> > > > >
> > > > > Thanks,
> > > > > Phil
> > > > >
> > > > > On Wed, 12 Sep 2018 at 01:32, Mark Payne <marka...@hotmail.com> wrote:
> > > > >
> > > > >> Phil,
> > > > >>
> > > > >> As Sivaprasanna mentioned, your bottleneck will certainly depend on your
> > > > >> flow. There's nothing inherent about NiFi or the JVM, AFAIK, that would
> > > > >> limit you. I've seen NiFi run on VMs containing 4-8 cores, and I've seen
> > > > >> it run on bare metal on servers containing 96+ cores. Most often, I see
> > > > >> people with a lot of CPU cores but insufficient disk, so if you're running
> > > > >> several cores ensure that you're using SSDs / NVMes or enough spinning
> > > > >> disks to accommodate the flow. NiFi does a good job of spanning the
> > > > >> content and FlowFile repositories across multiple disks to take full
> > > > >> advantage of the hardware, and scales the CPU vertically by way of
> > > > >> multiple Processors and multiple concurrent tasks (threads) on a given
> > > > >> Processor.
> > > > >>
> > > > >> It really comes down to what you're doing in your flow, though. If you've
> > > > >> got 96 cores and you're trying to perform 5 dozen transformations against
> > > > >> a large number of FlowFiles but have only a single spinning disk, then
> > > > >> those 96 cores will likely go to waste, because your disk will bottleneck
> > > > >> you.
> > > > >>
> > > > >> Likewise, if you have 10 SSDs and only 8 cores you're likely going to
> > > > >> waste a lot of disk because you won't have the CPU needed to reach the
> > > > >> disks' full potential. So you'll need to strike the correct balance for
> > > > >> your use case. Since you have the flow running right now, I would
> > > > >> recommend looking at things like `top` and `iostat` in order to
> > > > >> understand if you're reaching your limit on CPU, disk, etc.
> > > > >>
> > > > >> As far as RAM is concerned, NiFi typically only needs 4-8 GB of RAM for
> > > > >> the heap. However, more RAM means that your operating system can make
> > > > >> better use of disk caching, which can certainly speed things up,
> > > > >> especially if you're reading the content several times for each FlowFile.
> > > > >>
> > > > >> Does this help at all?
> > > > >>
> > > > >> Thanks
> > > > >> -Mark
> > > > >>
> > > > >>
> > > > >>> On Sep 10, 2018, at 6:05 AM, Phil H <gippyp...@gmail.com> wrote:
> > > > >>>
> > > > >>> Thanks for that. Sorry, I should have been more specific - we have a flow
> > > > >>> running already on non-dedicated hardware. Looking to identify any
> > > > >>> limitations in NiFi/JVM that would limit how much parallelism it can take
> > > > >>> advantage of.
> > > > >>>
> > > > >>> On Mon, 10 Sep 2018 at 14:32, Sivaprasanna <sivaprasanna...@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Phil,
> > > > >>>>
> > > > >>>> The hardware requirements are driven by the nature of the dataflow you
> > > > >>>> are developing. If you're looking to play around with NiFi and gain some
> > > > >>>> hands-on experience, go for a 4-core machine with 8 GB RAM, i.e. any
> > > > >>>> modern laptop/computer would do the job. In my case, where I have 100s
> > > > >>>> of dataflows, I have it clustered with 3 nodes, each having 16 GB RAM
> > > > >>>> and 4(8) cores. I went with SSDs of smaller size because my flows write
> > > > >>>> to object stores like Google Cloud Storage, Azure Blob and Amazon S3
> > > > >>>> and NoSQL DBs. Hope this helps.
> > > > >>>>
> > > > >>>> -
> > > > >>>> Sivaprasanna
> > > > >>>>
> > > > >>>> On Mon, Sep 10, 2018 at 4:09 AM Phil H <gippyp...@gmail.com> wrote:
> > > > >>>>
> > > > >>>>> Hi all,
> > > > >>>>>
> > > > >>>>> I've been asked to spec some hardware for a NiFi installation. Does
> > > > >>>>> anyone have any advice? My gut feel is lots of processor cores and RAM,
> > > > >>>>> with less emphasis on storage (small fast disks). Are there any
> > > > >>>>> limitations on how many cores the JRE/NiFi can actually make use of, or
> > > > >>>>> any other considerations like that I should be aware of?
> > > > >>>>>
> > > > >>>>> Most likely will be pairs of servers in a cluster, but again any advice
> > > > >>>>> to the contrary would be appreciated.
> > > > >>>>>
> > > > >>>>> Cheers,
> > > > >>>>> Phil
> > > > >>>>>
> > > > >>>>
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> >
>

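The multi-container setup and round-robin arithmetic Mark describes in the thread can be sketched in a few lines of Python. This is only an illustrative model, not NiFi code: the helper names (container_properties, round_robin_assign, best_case_write_rate) are made up for this example, the property names and paths simply follow the pattern Mark listed, and the 400 MB/sec figure assumes physically separate disks with writes spread evenly across them.

```python
from collections import Counter
from itertools import cycle, islice


def container_properties(repo, name, base_dir, count):
    """Build nifi.properties lines for `count` containers of one repository.

    `repo` is "content" or "provenance"; `name` is the container name prefix
    ("content" or "prov" in the thread). Hypothetical helper, not a NiFi API.
    """
    return [
        f"nifi.{repo}.repository.directory.{name}{i}={base_dir}/{name}-{i}"
        for i in range(1, count + 1)
    ]


def round_robin_assign(num_writes, containers):
    """Count how many writes land on each container under plain round robin.

    A simplified model of the behaviour described in the thread; NiFi's real
    selection logic is an internal detail and may differ.
    """
    return Counter(islice(cycle(containers), num_writes))


def best_case_write_rate(per_disk_mb_per_sec, disk_count):
    """Upper bound on aggregate write rate, assuming one container per
    physically separate disk and perfectly even load."""
    return per_disk_mb_per_sec * disk_count


if __name__ == "__main__":
    for line in container_properties("content", "content", "/nifi/repos", 4):
        print(line)
    for line in container_properties("provenance", "prov", "/nifi/repos", 4):
        print(line)
    print(round_robin_assign(8, ["content1", "content2", "content3", "content4"]))
    print(best_case_write_rate(100, 4), "MB/sec (best case)")
```

If several containers point at directories on the same physical disk, the round-robin spread stays the same but the best-case rate above no longer applies, which is the point made at the top of the thread.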