Mark,

It definitely does. Now Matt's suggestion of using ValidateRecord to convert files also makes sense. I was assuming (incorrectly) that the ValidateRecord processor passed a flowfile on untouched, apart from routing it as valid or invalid based on the schema defined in the Schema Access Strategy. I believe I follow what you are saying now.
Thanks Again!

Paul

On Sun, Nov 5, 2017 at 4:06 PM, Mark Payne wrote:

> Hey Paul,
>
> So a FlowFile consists only of Attributes and a Stream of bytes. In order
> for the ValidateRecord Processor to validate the data, it needs to convert
> that data from a stream of bytes into some object that it can work with.
> This is the responsibility of the Record Reader - to take a bunch of bytes
> and create one or more Record objects. The processor is then responsible
> for sending those Record objects on to the next processor in the flow. To
> do that, it has to take those Record objects and convert them back into a
> stream of bytes. And this is the job of Record Writer - to take one or
> more Record objects and convert them into a stream of bytes (i.e.,
> serialize them).
>
> So if there were no Record Writer, then the processor would not be able to
> convert the Record objects into streams of bytes.
>
> Does this help to clarify things, or only muddy the water worse? :)
>
> Thanks
> -Mark
>
> On Nov 5, 2017, at 3:53 PM, Paul Riddle wrote:
>
> Hi Mark!
>
> Thanks for the fast response. That does make sense. Since I am not making
> any modifications, just validating against a given schema, there is
> nothing for the Record Writer to do. I am still a little confused as to
> why it is a required Property in the ValidateRecord processor, however.
>
> Thanks,
> Paul
>
> On Sun, Nov 5, 2017 at 3:46 PM, Mark Payne wrote:
>
>> Hey Paul,
>>
>> That is accurate - the Record Writer chosen will not affect the
>> validation process. The way that the processor works is to read in
>> records, one at a time, from a FlowFile. Once a record has been read, it
>> is validated against the given schema. It is then written to either the
>> 'valid' relationship or the 'invalid' relationship. When this happens,
>> the chosen Record Writer is used to write it out.
>>
>> So it would be very common to have a CSV Reader with a CSV Writer or a
>> JSON Reader with a JSON Writer, for instance. However, you could also
>> configure a CSV Reader with a JSON Writer, and it will essentially
>> convert the record for you inline.
>>
>> This is a very common pattern for the record-oriented processors, because
>> the records are read in, parsed, and turned into a 'Record' object. Once
>> this has happened, we can treat that Record object the same, whether it
>> was parsed from a CSV file, a JSON file, or some custom format. This, of
>> course, provides us with some very powerful, reusable processors! Once
>> we've finished working with that Record object, though, we need to pass
>> it on in some way. So we make use of a Record Writer to serialize it back
>> out.
>>
>> Does that all make sense?
>>
>> Thanks
>> -Mark
>>
>> On Nov 5, 2017, at 3:24 PM, Paul Riddle wrote:
>>
>> Hello All,
>>
>> In regards to the NiFi 1.4 ValidateRecord processor, it doesn't appear to
>> matter what Record Writer I choose. As long as the Record Reader can read
>> the incoming flowfile and the Schema Access Strategy validates my
>> flowfile, it comes out the "valid" relationship.
>>
>> Am I missing some other purpose for the Record Writer property in the
>> ValidateRecord Processor? If so I would like to understand it better.
>>
>> Regards,
>> Paul
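For anyone reading this thread later, the read / validate / write cycle Mark describes can be sketched outside of NiFi. The snippet below is only an illustration in plain Python, not NiFi code and not the NiFi API: the schema format, the function names, and the sample data are all made up for the example. It pairs a CSV "reader" with a JSON "writer" to show why the writer is needed even when nothing is modified, and why mixing formats converts the data as a side effect.

```python
import csv
import io
import json

# Hypothetical schema for illustration: field name -> type the value must
# parse as. This is NOT NiFi's schema format.
SCHEMA = {"id": int, "name": str}


def read_csv_records(data: bytes):
    """Stand-in for a Record Reader: bytes in, record objects (dicts) out."""
    text = io.StringIO(data.decode("utf-8"))
    for row in csv.DictReader(text):
        yield dict(row)


def is_valid(record: dict) -> bool:
    """Check one record against SCHEMA without modifying the record."""
    if set(record) != set(SCHEMA):
        return False
    try:
        for field, field_type in SCHEMA.items():
            if record[field] is None:
                return False
            field_type(record[field])  # raises if the value does not parse
    except (TypeError, ValueError):
        return False
    return True


def write_json_records(records) -> bytes:
    """Stand-in for a Record Writer: record objects in, bytes out."""
    return json.dumps(list(records)).encode("utf-8")


def validate(data: bytes) -> dict:
    """Route each record to 'valid' or 'invalid', then serialize each group."""
    routed = {"valid": [], "invalid": []}
    for record in read_csv_records(data):
        routed["valid" if is_valid(record) else "invalid"].append(record)
    # Without a writer there would be no way to turn the records back
    # into bytes for the next step in the flow.
    return {name: write_json_records(recs) for name, recs in routed.items()}


flowfile_content = b"id,name\n1,alice\nnot-a-number,bob\n"
result = validate(flowfile_content)
```

Here `result["valid"]` and `result["invalid"]` are both JSON byte strings even though the input was CSV, which is the inline conversion mentioned above. Swapping `write_json_records` for a CSV writer would leave the validation outcome unchanged and only alter the output format.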
