Pierre,

> I believe NiFi is great for one reason: you have a lot of specialized
> processors that are really easy to use and efficient for what they've been
> designed for.
>
> Let's ask ourselves the question the other way: with the NiFi registry on
> its way, what is the problem having multiple processors for each back end?
> I don't really see the issue here. OK we have a lot of processors (but I
> believe this is a good point for NiFi, for user experience, for
> advertising, etc. - maybe we should improve the processor listing though,
> but again, this will be part of the NiFi Registry work), it generates a
> heavy NiFi binary (but that will be solved with the registry), but that's
> all, no?
>

The natural trade-off being fragmentation, code support and consistency?

A simple example:

ListS3 = uses InputRequirement(Requirement.INPUT_FORBIDDEN)

ListGCSBucket = INPUT_FORBIDDEN seems to be absent; however, expression
language is disabled on most properties, suggesting the design did not
intend to accept input.

Simple bug (NIFI-3514), simple fix (PR#1526).

Yes, no doubt, ListS3 presents S3's properties in a clear fashion.
Certainly ListGCSBucket represents GCS metadata as attributes in a more
specific way, and this is handy, but handling that would not be an
unmanageable challenge.

This is not an isolated issue; there are plenty of examples, some as simple
as naming. After all, one could be ultra pedantic for a second and note
that ListGCSBucket does not follow the same convention as ListS3 (*).

Therefore, while the examples above are overly trivial, they still serve as
a clear reminder of a very WET vs DRY dilemma. I strongly believe we should
strive to stay in DRY land.
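
To make the DRY point a bit more concrete, here is a minimal sketch of the kind of thing I have in mind: the "no incoming connections" rule is declared once on a shared base class, so a new List* processor cannot forget it. Everything in it is hypothetical and for illustration only. InputForbidden is a local stand-in annotation, not NiFi's InputRequirement, and AbstractListSketch, ListS3Sketch and ListGcsSketch are made-up names rather than existing classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class DrySketch {

    // Hypothetical stand-in for an "input forbidden" marker; NOT the NiFi API.
    // @Inherited makes a class-level annotation visible on subclasses.
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface InputForbidden {}

    // Hypothetical shared base: behaviour common to every List* processor
    // is declared here once instead of being repeated per back end.
    @InputForbidden
    abstract static class AbstractListSketch {
        abstract String backend();
    }

    // Back-end specific classes only add what is actually specific to them.
    static class ListS3Sketch extends AbstractListSketch {
        @Override
        String backend() { return "S3"; }
    }

    static class ListGcsSketch extends AbstractListSketch {
        @Override
        String backend() { return "GCS"; }
    }

    // True when the class, or a superclass, carries the marker annotation.
    static boolean forbidsInput(Class<?> type) {
        return type.isAnnotationPresent(InputForbidden.class);
    }

    public static void main(String[] args) {
        System.out.println("ListS3Sketch forbids input: " + forbidsInput(ListS3Sketch.class));
        System.out.println("ListGcsSketch forbids input: " + forbidsInput(ListGcsSketch.class));
    }
}
```

Neither subclass repeats the annotation, yet both report it, which is the sort of consistency that is easy to lose when each back end is written from scratch.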

Note, however, that I am 100% OK with the idea that using HCFS may be
overly complex and possibly undesirable. Nonetheless, I think we should at
least consider Matt's suggestion of using some refactoring magic, or
anything else that can help us achieve programmatic ways of promoting
consistency across the common features of those processors (with the
registry or not).

I will take the community's guidance on this.

Cheers

Andre

(*) The closest conventional name would probably be ListGCS, as no other
List processor seems to define the unit of collection (i.e. it is ListSFTP,
not ListSFTPFolder). I have not raised a JIRA ticket, but I suggest the
name be changed for a better user experience.