Hey Ian! Welcome to the club! I love some of those ideas.
I have worked with a lot of organizations, and having a default flow on the graph has never really worked. If you think about it, with large user communities it is very difficult to target the processors that everyone would care about. Each client or enterprise is just too different.

But I like the idea of processor template examples. This might get tricky when it comes to the versions of the processors your template would rely on, but the idea is solid.

I also love the idea of the logger on the connections.

You could use the GenerateFlowFile processor to generate fake data.

Have fun exploring!

Corey

Sent from my iPhone

> On Aug 30, 2015, at 2:15 PM, Ian Ragsdale <[email protected]> wrote:
>
> Hi all, I discovered NiFi yesterday and spent a good portion of today and
> yesterday testing it out. Having spent some time playing with some existing
> ESB products that require a compile and deploy step for each change, it's
> really awesome to be able to quickly tweak a flow and immediately see the
> results - that shorter cycle time is great. That said, I'd like to provide
> some observations from the point of view of a beginner using the product for
> the first time.
>
> The documentation is very thorough - as I was trying to get a sense of the
> existing processors and how they fit together, it was really nice to be able
> to pull up the usage for each one, and the inline help is also very nice.
> Great job on that stuff! However, it does seem to lean a bit more to the
> reference side than a real usage description - I had to experiment for a
> while to figure out how to generate a useful response to an incoming HTTP
> request, and I'm still not sure I'm handling that in the best way (I ended up
> using ExecuteStreamCommand to call a script). Is there another way to
> generate arbitrary content?
>
> I think you could maybe sidestep the need for that level of documentation if
> the default install came with some useful examples.
> The template functionality seems to be pretty powerful, so one thing to
> consider might be allowing NARs to include example templates that show how
> their processors should be used - having a bunch of samples to choose from
> would allow users to explore the template system immediately without having
> to figure out how to define their own first.
>
> Imagine how much more quickly a beginner could get started if the initial
> canvas had a nicely curated set of examples instead of being a blank state.
> If they were well organized into process groups by use case, it would be easy
> for a user to navigate around and get a sense of what's possible and
> understand the expected usage of the various components. I could see that
> either being the default behavior, or having a script that would go through
> and add sample templates from each NAR via the API, and you could point users
> at the script in the getting started docs. Or, if there were no defined
> components when the server started up, it could suggest running that script
> to the user on the command line. Finally, it would be neat if one of the
> sample flows allowed clearing the canvas via the API, so once the user has
> gotten a good understanding of how the system works, they can have a clean
> slate to start defining their own.
>
> Another minor suggestion for helping new users get up to speed would be to
> make the LogAttribute processor and bulletin board a little easier to figure
> out. Once I figured out that LogAttribute logs went to the log directory
> instead of somewhere viewable in the UI (which would probably be useful to
> add to the usage docs for it), it was very helpful for understanding what was
> going on. I think it might've been easier to figure out if it logged to the
> bulletin board instead, and maybe the bulletin board could use some sort of
> indicator that there are messages outstanding.
>
> One other thing I would consider would be to drop the LogAttribute processor
> altogether and make it an option on connections instead. Every time I've
> wanted to use one, I'm just dropping it in between two processors that
> already have a connection, and then I have to reroute the existing connection
> and add a new one. If I could instead just toggle some attributes on an
> existing connection, that would be way easier. It would be even cooler if
> there were a sampling mechanism. If I could say 100% or 0.001% of FlowFiles
> would be logged to the bulletin board, that would be a really easy way to
> debug a flow I'm just setting up, or to monitor or troubleshoot an existing
> flow.
>
> It looks like there are already some AWS integrations being added, so I'm
> happy to hear about that - think that's going to make NiFi very useful for a
> large set of people that are just getting started learning to use things like
> distributed queueing systems or have simple needs that are handled by S3,
> SQS, and SNS.
>
> I think I would spend some time thinking about building a central directory
> for NARs sooner rather than later - I think a nice list of available
> contributed processors would incent people writing their own to share them,
> and would help make it clear when there are already solutions to handling a
> given problem or integration.
>
> Anyway, that's probably enough of a brain dump for now. I think this is a
> really great product and a little more attention to the out of box experience
> for beginners would really help drive adoption.
>
> Thanks for listening,
> Ian
>
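
For what it's worth, the sampling idea in Ian's message (logging anywhere from 100% down to 0.001% of FlowFiles on a connection) comes down to a per-item random check. Here is a rough sketch in plain Python, not NiFi code; the names should_log, sample, and sample_percent are made up for illustration and do not correspond to any existing NiFi API.

```python
import random


def should_log(sample_percent: float, rng: random.Random) -> bool:
    """Return True for roughly sample_percent percent of calls (hypothetical helper)."""
    if not 0.0 <= sample_percent <= 100.0:
        raise ValueError("sample_percent must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so 100 always logs and 0 never does.
    return rng.random() < sample_percent / 100.0


def sample(items, sample_percent: float, seed=None):
    """Return the subset of items that would have been logged."""
    rng = random.Random(seed)
    return [item for item in items if should_log(sample_percent, rng)]


if __name__ == "__main__":
    # Stand-in for FlowFiles passing through a connection.
    flowfiles = [f"flowfile-{i}" for i in range(100_000)]
    logged = sample(flowfiles, 0.1)
    print(f"logged {len(logged)} of {len(flowfiles)} items")
```

Each item is decided independently, so the logged fraction only approximates the requested percentage; at something like 0.001% you would need a lot of traffic before anything showed up.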
