Thanks. I will try to implement this suggestion as well.

Since I previously managed a data warehouse project, I am trying to pick
scenarios based on that experience. This is the specific scenario I have
in mind.

The source file will be a CSV file, and all the fields will be stored as
strings in it (similar to the source files in a DWH project). Once the
fields are validated for their length, the data will be stored in a DB
table with a unique ID, so that failures can be tracked at a later stage.
My plan is to store the data in the appropriate format at this point,
along with the unique ID, e.g. ID and Zipcode as numeric.
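
To make this first stage concrete, here is a rough sketch of the logic in
plain Python (not NiFi configuration). The column names (city, zipcode),
the length limits, and the function name are assumptions made up for
illustration; the sequential counter simply stands in for whatever
unique ID ends up being used.

```python
import csv
import io

# Hypothetical maximum lengths; real limits would come from the target table.
MAX_LENGTHS = {"city": 30, "zipcode": 5}


def validate_and_convert(csv_text):
    """Check field lengths, assign a numeric ID, and convert zipcode to int.

    Returns (valid, errors). Every record keeps its ID so that a failure
    can be traced back to the source row later.
    """
    valid, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for record_id, row in enumerate(reader, start=1):
        # All fields arrive as strings; a missing value is treated as empty.
        fields = {name: (row.get(name) or "") for name in MAX_LENGTHS}

        too_long = [n for n, limit in MAX_LENGTHS.items() if len(fields[n]) > limit]
        if too_long:
            errors.append({"id": record_id, "row": fields,
                           "reason": "too long: " + ", ".join(too_long)})
            continue

        zipcode = fields["zipcode"]
        if not (zipcode.isascii() and zipcode.isdigit()):
            errors.append({"id": record_id, "row": fields,
                           "reason": "zipcode not numeric"})
            continue

        # Note: int() drops leading zeros, which may matter for real zip codes.
        valid.append({"id": record_id, "city": fields["city"],
                      "zipcode": int(zipcode)})
    return valid, errors
```

The valid list would go to the DB table and the errors list to wherever
the rejected records are kept.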

Once I am able to achieve the above, I would like to add a few more
fields, such as a date and a description of the city. The description
will also contain invalid characters. After this I would like to
implement another set of validations, such as checking the date format
and removing invalid characters, and then store the result in a
different DB.
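
The second set of validations could look roughly like the sketch below,
again in plain Python just to show the idea. The date format (YYYY-MM-DD)
and the definition of an "invalid" character are assumptions; I have not
decided on either yet.

```python
import re
from datetime import datetime

# Assumed date format; the real source files may use a different one.
DATE_FORMAT = "%Y-%m-%d"

# Assumed rule: anything other than letters, digits, spaces and a little
# basic punctuation counts as an invalid character.
INVALID_CHARS = re.compile(r"[^A-Za-z0-9 .,'-]")


def parse_date(value):
    """Return a date if value matches DATE_FORMAT, otherwise None."""
    try:
        return datetime.strptime(value, DATE_FORMAT).date()
    except ValueError:
        return None


def clean_description(text):
    """Remove every character matched by INVALID_CHARS."""
    return INVALID_CHARS.sub("", text)
```

A record whose date comes back as None would be treated as an error
record; the cleaned description would be stored in place of the original.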

At every stage I also want to capture the error records and either
publish them or store them in a different DB.
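
As a sketch of what I mean by capturing errors at every stage, each stage
could split its input into passed and failed records and tag the failures
with the stage name. This is plain Python for illustration only; the
function name, the record layout, and the example check are hypothetical.

```python
def run_stage(stage_name, records, check):
    """Apply check to each record and split the results.

    check(record) returns None when the record is fine, or a string
    describing the problem. Failed records are wrapped with the stage
    name and the record ID so they can be published or stored elsewhere.
    """
    passed, failed = [], []
    for record in records:
        reason = check(record)
        if reason is None:
            passed.append(record)
        else:
            failed.append({
                "stage": stage_name,
                "id": record.get("id"),
                "reason": reason,
                "record": record,
            })
    return passed, failed


def zipcode_is_numeric(record):
    """Example check (hypothetical): zipcode must consist of ASCII digits only."""
    value = str(record.get("zipcode", ""))
    return None if value.isascii() and value.isdigit() else "zipcode not numeric"
```

The passed list feeds the next stage, while the failed lists from all
stages can be collected and sent to the error DB together.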

This is the goal for now.

Thanks
Dave


