Hey guys,

Thanks for the thread, Andre.

+1 to James' answer. I understand the appeal of a single processor that connects to all the back ends, and we could document/improve PutHDFS to ease such use, but I really don't think it would benefit the user experience. It may be interesting in some cases for some users, but I don't think that would be a majority. I believe NiFi is great for one reason: you have a lot of specialized processors that are really easy to use and efficient for what they've been designed for.

Let's ask the question the other way around: with the NiFi Registry on its way, what is the problem with having multiple processors, one per back end? I don't really see the issue here. OK, we have a lot of processors (but I believe this is a good point for NiFi, for user experience, for advertising, etc. - maybe we should improve the processor listing, but again, this will be part of the NiFi Registry work), and it makes for a heavy NiFi binary (but that will be solved with the registry) - that's all, no?

Also agree on the positioning aspect: IMO NiFi should not be too tightly tied to the Hadoop ecosystem. There are a lot of users using NiFi with absolutely no relation to Hadoop. Not sure that would send the right "signal".

Pierre

2017-02-22 6:50 GMT+01:00 Andre <andre-li...@fucs.org>:

> Andrew,
>
> On Wed, Feb 22, 2017 at 11:21 AM, Andrew Grande <apere...@gmail.com> wrote:
>
> > I am observing one assumption in this thread. For some reason we are
> > implying all these will be Hadoop-compatible file systems. They don't
> > always have an HDFS plugin, nor should they as a mandatory requirement.
>
> You are partially correct.
>
> There is a direct assumption in the availability of an HCFS (thanks Matt!)
> implementation.
> This is the case with:
>
> * Windows Azure Blob Storage
> * Google Cloud Storage Connector
> * MapR FileSystem (currently done via NAR recompilation / mvn profile)
> * Alluxio
> * Isilon (via HDFS)
> * others
>
> But I wouldn't say this will apply to every other storage system, and in
> certain cases it may not even be necessary (e.g. Isilon scale-out storage
> may be reached using its native HDFS-compatible interfaces).
>
> > Untie completely from the Hadoop nar. This allows for effective minifi
> > interaction without the weight of hadoop libs for example. Massive size
> > savings where it matters.
>
> Are you suggesting a use case where MiNiFi agents interact directly with
> cloud storage, without relying on NiFi hubs to do that?
>
> > For the deployment, it's easy enough for an admin to either rely on a
> > standard tar or rpm if the NAR modules are already available in the distro
> > (well, I won't talk registry till it arrives). Mounting a common directory
> > on every node or distributing additional jars everywhere, plus configs, and
> > then keeping it consistent across is something which can be avoided by
> > simpler packaging.
>
> As long as the NAR or RPM supports your use case, which is not the case for
> people running NiFi with MapR-FS, for example. For those, a recompilation is
> required anyway. A flexible processor may remove the need to recompile (I
> am currently playing with the classpath implications for MapR users).
>
> Cheers
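[For readers following along: the HCFS route Andre describes boils down to handing PutHDFS a Hadoop configuration whose default filesystem points at the non-HDFS back end. A minimal sketch for the Azure Blob (WASB) case is below - the account and container names are placeholders, not from this thread, and the hadoop-azure jars still need to be on the processor's classpath (via the HDFS processors' Additional Classpath Resources property, or the NAR recompilation route mentioned for MapR).]

```xml
<!-- core-site.xml referenced from PutHDFS's "Hadoop Configuration Resources"
     property. Sketch only: "mycontainer"/"myaccount" are hypothetical names. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>wasb://mycontainer@myaccount.blob.core.windows.net</value>
  </property>
  <property>
    <!-- Storage account key for the WASB driver; in practice keep this
         out of plain-text config (e.g. use a credential provider). -->
    <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
    <value>BASE64_ACCOUNT_KEY</value>
  </property>
</configuration>
```

With that in place, PutHDFS resolves relative paths against the wasb:// filesystem instead of HDFS - which is exactly the "one processor, many back ends" behaviour being debated, and why it only works where an HCFS implementation exists.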