I tend to agree with a lot of the points made by James and Pierre... Given that the end user of NiFi is not always a developer, it seems more user-friendly to have the specific processors and not have users trying to come up with the right set of JARs and the right configuration properties (although many power users can do this).
Since the processors we are talking about already exist, and many came from great community contributions, I don't think we should get rid of any of them. If there are inconsistencies that can be improved, such as some processors using EL and others not, then we should definitely make those improvements.

On Wed, Feb 22, 2017 at 8:42 AM, Andre <[email protected]> wrote:

> Pierre,
>
>> I believe NiFi is great for one reason: you have a lot of specialized
>> processors that are really easy to use and efficient for what they've been
>> designed for.
>>
>> Let's ask ourselves the question the other way: with the NiFi registry on
>> its way, what is the problem having multiple processors for each back end?
>> I don't really see the issue here. OK we have a lot of processors (but I
>> believe this is a good point for NiFi, for user experience, for
>> advertising, etc. - maybe we should improve the processor listing though,
>> but again, this will be part of the NiFi Registry work), it generates a
>> heavy NiFi binary (but that will be solved with the registry), but that's
>> all, no?
>>
>
> The natural trade-off being fragmentation, code support and consistency?
>
> Simple example?
>
> ListS3 = Uses InputRequirement(Requirement.INPUT_FORBIDDEN)
> ListGCSBucket = INPUT_FORBIDDEN seems to be absent, however, expression
> language is disabled on most properties, suggesting design did not intend
> to have input. Simple bug (NIFI-3514), simple fix (PR#1526).
>
> Yes, no doubts, ListS3 presents S3's properties in clear fashion. Certainly
> ListGCSBucket represents GCS metadata as attributes in a more specific way
> and this is handy, but that wouldn't be an unmanageable challenge.
>
> This is not an isolated issue, there are plenty of examples, some as simple
> as naming... After all, one could be ultra pedantic for a second and note
> the ListGCSBucket does not follow the same convention as ListS3(*).
>
> Therefore, while the examples above are overly trivial, they still
> serve as a clear reminder of a very WET vs DRY dilemma. I strongly believe
> we should strive to stay in DRY land.
>
> Note however, that I am 100% OK with the idea that using HCFS may be overly
> complex and possibly undesirable;
>
> Nonetheless I think we should at least consider Matt's suggestion of using
> some refactoring magic, or anything that can help us achieve programmatic
> ways of promoting consistency across the common features of those
> processors (with the registry or not).
>
> I will take the community guidance on this.
>
> Cheers
>
> Andre
>
> (*) The closer conventional name would probably be ListGCS as no other
> ListProcessor seems to define the unit of collection, (i.e. it is ListSFTP
> not ListSFTPFolder). I have not raised a JIRA ticket but I suggest the
> name to be changed for better user experience.
