Hello, this is my first mailing here. I am a Java developer working with Apache Velocity, Drill, Tomcat, Ant, Pentaho ETL, MongoDB, MySQL and more, and I am very much a data guy.
I have been using NiFi for a while now and yesterday started coding my first processor, mainly to widen my knowledge and learn something new. My idea is to combine Apache Velocity, a template engine, with NiFi: a CSV file comes in, it gets merged with a template containing formatting information and some placeholders (and maybe some limited logic), and out comes a new set of data, formatted differently. This separates the processing logic from the formatting; one could create HTML, XML, JSON, or other text-based formats from it. Easy to use and very efficient.

Now my question is: should I implement the logic so that the processor handles a whole CSV file, which usually has multiple lines? That would be convenient for the user, who would only have to deal with one processor doing the work, but the logic would be more specialized. Alternatively, I could code the processor to handle a single row of the CSV file, and the user would have to build a flow that splits the CSV file into multiple flowfiles before my processor can be used. That is less specialized, but it requires more preparation work from the user.

I tend toward the second way, also because there is already a processor that splits a file into multiple flowfiles. But I wanted to hear your opinions on the best way to go. Do you have a recommendation for me? (Maybe the answer is to do both?!)

Thanks for sharing your thoughts,
Uwe
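P.S. To make the two options concrete for the list, here is a minimal stdlib-only Java sketch. The class and method names are mine, and the naive placeholder substitution is only a stand-in for a real Velocity merge (which would go through VelocityEngine/VelocityContext instead):

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class TemplateMergeSketch {

    // Stand-in for the Velocity merge: replaces $column placeholders in the
    // template with the values of one CSV row. In the real processor this
    // would be a VelocityEngine.evaluate(...) call with a VelocityContext.
    static String mergeRow(String template, String[] header, String[] row) {
        String out = template;
        for (int i = 0; i < header.length; i++) {
            out = out.replace("$" + header[i], row[i]);
        }
        return out;
    }

    // Option A: the processor receives the whole CSV file (header + n data
    // rows) in one flowfile and loops over the rows itself.
    static String mergeWholeFile(String template, String csv) {
        String[] lines = csv.split("\n");
        String[] header = lines[0].split(",");
        return Arrays.stream(lines, 1, lines.length)
                .map(line -> mergeRow(template, header, line.split(",")))
                .collect(Collectors.joining("\n"));
    }

    // Option B: the processor handles exactly one pre-split row per flowfile
    // (e.g. after an upstream split processor such as NiFi's SplitText); the
    // header would then have to arrive separately, e.g. as a flowfile attribute.
    static String mergeSingleRow(String template, String headerLine, String rowLine) {
        return mergeRow(template, headerLine.split(","), rowLine.split(","));
    }
}
```

For example, with the template `<p>$name lives in $city</p>` and the CSV `name,city` / `Uwe,Berlin`, both options produce `<p>Uwe lives in Berlin</p>`; the difference is only whether the loop over rows lives inside my processor (A) or in the surrounding flow (B).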