Pierre,

> I believe NiFi is great for one reason: you have a lot of specialized
> processors that are really easy to use and efficient for what they've been
> designed for.
>

> Let's ask ourselves the question the other way: with the NiFi registry on
> its way, what is the problem having multiple processors for each back end?
> I don't really see the issue here. OK we have a lot of processors (but I
> believe this is a good point for NiFi, for user experience, for
> advertising, etc. - maybe we should improve the processor listing though,
> but again, this will be part of the NiFi Registry work), it generates a
> heavy NiFi binary (but that will be solved with the registry), but that's
> all, no?
>

The natural trade-off being fragmentation, code support and consistency?

A simple example:

ListS3 = uses @InputRequirement(Requirement.INPUT_FORBIDDEN)
ListGCSBucket = INPUT_FORBIDDEN seems to be absent; however, expression
language is disabled on most properties, suggesting the design did not
intend for it to take input. Simple bug (NIFI-3514), simple fix (PR#1526).
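To make the inconsistency concrete, here is a rough sketch of how such a gap
could be caught programmatically. It assumes nothing about the real code
base: the annotation, the enum and the two empty classes are local stand-ins
declared only so the example compiles without NiFi on the classpath, and
missingInputForbidden is a hypothetical helper name.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

public class InputRequirementCheck {

    // Local stand-in for NiFi's input requirement values (assumption:
    // declared here only so the sketch is self-contained).
    enum Requirement { INPUT_REQUIRED, INPUT_ALLOWED, INPUT_FORBIDDEN }

    // Local stand-in for the @InputRequirement annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface InputRequirement {
        Requirement value();
    }

    // Hypothetical stand-ins for the two processors discussed above.
    @InputRequirement(Requirement.INPUT_FORBIDDEN)
    static class ListS3 { }

    static class ListGCSBucket { }

    /** Returns the simple names of classes that do not declare INPUT_FORBIDDEN. */
    static List<String> missingInputForbidden(Class<?>... processors) {
        List<String> missing = new ArrayList<>();
        for (Class<?> p : processors) {
            InputRequirement req = p.getAnnotation(InputRequirement.class);
            if (req == null || req.value() != Requirement.INPUT_FORBIDDEN) {
                missing.add(p.getSimpleName());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Prints [ListGCSBucket]
        System.out.println(missingInputForbidden(ListS3.class, ListGCSBucket.class));
    }
}
```

A check of this kind, run over every List* processor in a test, is one way a
missing annotation could be flagged before a user trips over it.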

Yes, no doubt, ListS3 presents S3's properties in a clear fashion. Certainly
ListGCSBucket represents GCS metadata as attributes in a more specific way,
and this is handy, but that wouldn't be an unmanageable challenge.

This is not an isolated issue; there are plenty of examples, some as simple
as naming. After all, one could be ultra pedantic for a second and note
that ListGCSBucket does not follow the same naming convention as ListS3 (*).


Therefore, while the examples above are overly trivial, they still
serve as a clear reminder of a very WET vs DRY dilemma. I strongly believe
we should strive to stay in DRY land.


Note, however, that I am 100% OK with the idea that using HCFS may be overly
complex and possibly undesirable.

Nonetheless, I think we should at least consider Matt's suggestion of using
some refactoring magic, or anything that can help us achieve programmatic
ways of promoting consistency across the common features of those
processors (with the registry or not).
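As a loose illustration of what "programmatic consistency" could look like
(not a description of Matt's actual proposal, and not existing NiFi code),
the common behaviour could live once in a shared base class, with each back
end supplying only what differs. All names below (ListingBase, ListS3Sketch,
ListGcsSketch, the attribute keys) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SharedListingSketch {

    /** Hypothetical shared base: common behaviour is written once here. */
    static abstract class ListingBase {
        /** Back-end specific: names of the objects found. */
        protected abstract List<String> listNames();

        /** Back-end specific attribute prefix, e.g. "s3" or "gcs". */
        protected abstract String prefix();

        /** Shared: every back end emits attributes with the same layout. */
        public final List<Map<String, String>> list() {
            List<Map<String, String>> out = new ArrayList<>();
            for (String name : listNames()) {
                Map<String, String> attrs = new LinkedHashMap<>();
                attrs.put("filename", name);
                attrs.put(prefix() + ".object", name);
                out.add(attrs);
            }
            return out;
        }
    }

    // Each sketch returns canned data instead of calling a real service.
    static class ListS3Sketch extends ListingBase {
        protected List<String> listNames() { return List.of("a.txt"); }
        protected String prefix() { return "s3"; }
    }

    static class ListGcsSketch extends ListingBase {
        protected List<String> listNames() { return List.of("b.txt"); }
        protected String prefix() { return "gcs"; }
    }

    public static void main(String[] args) {
        System.out.println(new ListS3Sketch().list());
        System.out.println(new ListGcsSketch().list());
    }
}
```

The point is only that a fix or convention applied in the shared part reaches
every back end at once, while back-end specific attributes remain possible.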



I will take the community guidance on this.

Cheers

Andre


(*) The more conventional name would probably be ListGCS, as no other
List processor seems to define the unit of collection (i.e. it is ListSFTP,
not ListSFTPFolder). I have not raised a JIRA ticket, but I suggest
changing the name for a better user experience.
