Re: [DISCUSS] Scale-out/Object Storage - taming the diversity of processors

Andre Wed, 22 Feb 2017 13:03:06 -0800

Adam,

On 23 Feb 2017 4:43 AM, "Adam Lamar" <adamond...@gmail.com> wrote:


Hey all,

I can understand Andre's perspective - when I was building the ListS3
processor, I mostly just copied the bits that made sense from ListHDFS and
ListFile. That worked, but it's a poor way to ensure consistency across
List* processors.


Been there, done that, and continue to do that :-)

This is particularly tricky, however, because those processors drift apart
once the first iteration is made. Bugs get fixed in one processor but not
in the other.

As a once-in-a-while contributor, I love the idea that community
contributions are respected and we're not dropping them, because they solve
real needs right now, and it isn't clear another approach would be better.


I feel your pain, and the good news is that removing them is a breaking
change anyhow...

Plus, we all love the *S3 processors. :-)

And I disagree slightly with the notion that an artifact registry will
solve the problem - I think it could make it worse, at least from a
consistency point of view.


100% agreed.

Taming _is_ important, which is one reason
registry communities have official/sanctioned modules. Quality and
interoperability can vary vastly.


100% agreed. Just look at Maven, JCenter, PyPI...


I suspect you will agree with the idea that a user would think twice about
using a 3rd party processor for fear of it being made obsolete by later
version upgrades?



By convention, it seems like NiFi already has a handful of well-understood
patterns - List, Fetch, Get, Put, etc all mean something specific in
processor terms. Is there a reason not to formalize those patterns in the
code as well? That would help with processor consistency, and if done
right, it may even be easier to write new processors, fix bugs, etc.


100% agreed. My suggestion was HCFS, but our dislike for that approach
should not preclude us from achieving the final goal:

Currently, consistency isn't easily maintained; it would be great if it were.
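
To make the "formalize the patterns in code" idea a bit more concrete, here
is a minimal sketch of what a shared base for List* processors might look
like. It is plain Java and deliberately does not use the NiFi API; the names
Entry, AbstractLister and InMemoryLister are hypothetical and exist only for
illustration. The assumption is that the shared logic is "remember the newest
timestamp seen and emit only newer entries", while each storage backend only
supplies the raw listing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical listing entry: a name plus a last-modified timestamp. */
final class Entry {
    final String name;
    final long lastModified;

    Entry(String name, long lastModified) {
        this.name = name;
        this.lastModified = lastModified;
    }
}

/**
 * Hypothetical base class for the shared "List" pattern. The watermark
 * logic lives here once, so a fix applies to every subclass.
 */
abstract class AbstractLister {
    // Newest timestamp emitted so far; nothing has been emitted initially.
    private long lastSeen = Long.MIN_VALUE;

    /** Backend-specific part: return everything currently in the store. */
    protected abstract List<Entry> fetchAll();

    /** Shared part: keep only newer entries, sort them, advance the watermark. */
    public final List<Entry> listNew() {
        List<Entry> fresh = new ArrayList<>();
        for (Entry e : fetchAll()) {
            if (e.lastModified > lastSeen) {
                fresh.add(e);
            }
        }
        fresh.sort(Comparator.comparingLong(e -> e.lastModified));
        if (!fresh.isEmpty()) {
            lastSeen = fresh.get(fresh.size() - 1).lastModified;
        }
        return fresh;
    }
}

/** Toy in-memory backend standing in for S3, HDFS, or a local directory. */
final class InMemoryLister extends AbstractLister {
    private final List<Entry> store = new ArrayList<>();

    void add(String name, long lastModified) {
        store.add(new Entry(name, lastModified));
    }

    @Override
    protected List<Entry> fetchAll() {
        return new ArrayList<>(store);
    }
}
```

A backend-specific lister would then only implement fetchAll(). Note that
this sketch skips entries that arrive later with a timestamp equal to the
watermark; a real implementation would need to handle that case.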

Thanks a lot for your comments; truly appreciated.
