Hey guys,

Thanks for the thread, Andre.

+1 to James' answer. I understand the appeal of a single processor that connects to all the back ends, and we could document/improve PutHDFS to ease such use, but I really don't think it would benefit the user experience. It may be interesting in some cases for some users, but I don't think that would be a majority. I believe NiFi is great for one reason: you have a lot of specialized processors that are really easy to use and efficient for what they've been designed for.

Let's ask the question the other way around: with the NiFi Registry on its way, what is the problem with having multiple processors, one per back end? I don't really see the issue here. OK, we have a lot of processors (but I believe this is a good point for NiFi, for user experience, for advertising, etc. - maybe we should improve the processor listing, but again, this will be part of the NiFi Registry work), and it makes for a heavy NiFi binary (but that will be solved with the registry) - that's all, no?

Also agree on the positioning aspect: IMO NiFi should not be too tightly tied to the Hadoop ecosystem. There are a lot of users using NiFi with absolutely no relation to Hadoop. Not sure that would send the right "signal".

Pierre

2017-02-22 6:50 GMT+01:00 Andre <andre-li...@fucs.org>:

> Andrew,
>
> On Wed, Feb 22, 2017 at 11:21 AM, Andrew Grande <apere...@gmail.com> wrote:
>
> > I am observing one assumption in this thread. For some reason we are
> > implying all these will be Hadoop-compatible file systems. They don't
> > always have an HDFS plugin, nor should they as a mandatory requirement.
>
> You are partially correct.
>
> There is a direct assumption in the availability of an HCFS (thanks Matt!)
> implementation.
> This is the case with:
>
> * Windows Azure Blob Storage
> * Google Cloud Storage Connector
> * MapR FileSystem (currently done via NAR recompilation / mvn profile)
> * Alluxio
> * Isilon (via HDFS)
> * others
>
> But I wouldn't say this will apply to every other storage system, and in
> certain cases it may not even be necessary (e.g. Isilon scale-out storage
> may be reached using its native HDFS-compatible interfaces).
>
> > Untie completely from the Hadoop nar. This allows for effective minifi
> > interaction without the weight of hadoop libs for example. Massive size
> > savings where it matters.
>
> Are you suggesting a use case where MiNiFi agents interact directly with
> cloud storage, without relying on NiFi hubs to do that?
>
> > For the deployment, it's easy enough for an admin to either rely on a
> > standard tar or rpm if the NAR modules are already available in the distro
> > (well, I won't talk registry till it arrives). Mounting a common directory
> > on every node or distributing additional jars everywhere, plus configs, and
> > then keeping it consistent across is something which can be avoided by
> > simpler packaging.
>
> As long as the NAR or RPM supports your use case, which is not the case for
> people running NiFi with MapR-FS, for example. For those, a recompilation is
> required anyway. A flexible processor may remove the need to recompile (I
> am currently playing with the classpath implications for MapR users).
>
> Cheers
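[For readers following along: the HCFS route Andre describes boils down to handing PutHDFS a Hadoop configuration whose default filesystem points at the non-HDFS back end. A minimal sketch for the Azure Blob (WASB) case is below - the account and container names are placeholders, not from this thread, and the hadoop-azure jars still need to be on the processor's classpath (via the HDFS processors' Additional Classpath Resources property, or the NAR recompilation route mentioned for MapR).]

```xml
<!-- core-site.xml referenced from PutHDFS's "Hadoop Configuration Resources"
     property. Sketch only: "mycontainer"/"myaccount" are hypothetical names. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>wasb://mycontainer@myaccount.blob.core.windows.net</value>
  </property>
  <property>
    <!-- Storage account key for the WASB driver; in practice keep this
         out of plain-text config (e.g. use a credential provider). -->
    <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
    <value>BASE64_ACCOUNT_KEY</value>
  </property>
</configuration>
```

With that in place, PutHDFS resolves relative paths against the wasb:// filesystem instead of HDFS - which is exactly the "one processor, many back ends" behaviour being debated, and why it only works where an HCFS implementation exists.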