Adam,

On 23 Feb 2017 4:43 AM, "Adam Lamar" <adamond...@gmail.com> wrote:
> Hey all,
>
> I can understand Andre's perspective - when I was building the ListS3 processor, I mostly just copied the bits that made sense from ListHDFS and ListFile. That worked, but it's a poor way to ensure consistency across List* processors.

Been there, done that, and continue to do that :-)

This is particularly tricky, however, because those processors drift apart once the first iteration is made. Bugs get fixed in one processor but not in the other.

> As a once-in-a-while contributor, I love the idea that community contributions are respected and we're not dropping them, because they solve real needs right now, and it isn't clear another approach would be better.

I feel your pain, and the good news is that removing them is a breaking change anyhow... plus we all love the *S3 processors. :-)

> And I disagree slightly with the notion that an artifact registry will solve the problem - I think it could make it worse, at least from a consistency point of view.

100% agreed.

> Taming _is_ important, which is one reason registry communities have official/sanctioned modules. Quality and interoperability can vary vastly.

100% agreed. Just look at Maven, JCenter, PyPI... I suspect you will agree with the idea that a user would think twice about using a 3rd party processor for fear of it being obsoleted by later version upgrades?

> By convention, it seems like NiFi already has a handful of well-understood patterns - List, Fetch, Get, Put, etc. all mean something specific in processor terms. Is there a reason not to formalize those patterns in the code as well? That would help with processor consistency, and if done right, it may even be easier to write new processors, fix bugs, etc.

100% agreed. My suggestion was HCFS, but our dislike for this approach should not preclude us from achieving the final goal: currently, consistency isn't easily maintained, and it would be great if it were.

Thanks a lot for your comments, truly appreciated.