Thanks Pierre, Simon and Bryan. Let me take a look and come back with a few more questions.

On Thu, Apr 28, 2016 at 11:32 AM, Simon Ball <[email protected]> wrote:

> GetMongo is an ingest-only processor, so it cannot accept an input flow
> file. It also only has a success relation.
>
> A solution to this would be to use NiFi’s own deduplication.
>
> One flow would seed values in the distributed cache by using GetMongo to
> pull the ids and PutDistributedMapCache to store them in NiFi’s cache.
>
> The main ingest flow would then use UpdateAttribute to create a
> hash.value that matches the values inserted into the cache ->
> DetectDuplicate -> flow to PutMongo (use the upsert property) -success->
> PutSolrContentStream
>
> Simon
>
> On Apr 28, 2016, at 5:19 PM, Pierre Villard <[email protected]>
> wrote:
>
> Hi Susheel,
>
> 1. HandleHttpRequest
> 2. RouteOnAttribute + HandleHttpResponse in case of errors detected in
> the headers
> 3. Depending on what you want, there are a lot of options to handle JSON
> data (EvaluateJsonPath will probably be useful)
> 4. GetMongo (I think it will route to success if there is an entry,
> and to failure if there is no record, but this has to be checked; otherwise
> an additional processor will do the job of checking the result of the request).
> 5. & 6. PutMongo + PutFile (if local folder) + PutSolr (if you want to do
> Solr by yourself).
>
> Depending on the details, this could be slightly different, but I think it
> gives a good idea of the minimal set of processors you would need.
>
> HTH,
> Pierre
>
>
> 2016-04-28 16:54 GMT+02:00 Susheel Kumar <[email protected]>:
>
>> Hi,
>>
>> After attending the meetup in NYC, I am realizing NiFi can be used for the
>> data flow use case I have. Can someone please share the steps/processors
>> necessary for the use case below?
>>
>>
>>    1. Receive JSON on an HTTP REST endpoint
>>    2. Parse the HTTP header and do validation. Return error code & messages
>>    as JSON in the response in case of validation failures
>>    3. Parse the request JSON, perform various validations (missing data in
>>    fields), massage some data, add some data
>>    4. Check if the request JSON unique ID is present in MongoDB and
>>    compare timestamps to determine whether this is an update request or a new
>>    request
>>    5. If new request, an entry is made in mongo and then JSON files are
>>    written to an output folder for another process to pick up and submit to
>>    Solr.
>>    6. If update request, the mongo record is updated and JSON files are
>>    written to the output folder
>>
>>
>> I understand that something like the HandleHttpRequest processor can be used
>> for receiving the http request, and PutSolrContentStream for writing to
>> Solr, but I am not clear on what processors would be used for validation etc.
>> in steps 2 thru 5 above.
>>
>> Appreciate your input.
>>
>> Thanks,
>> Susheel
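
For reference, the seed-then-detect idea from Simon's reply can be modelled in a few lines of plain Python. This is only a rough sketch of the logic, not NiFi or MongoDB API code: the FlowFile class, the function names, the "id" field and the sample ids are all made up for illustration, and the "duplicate" / "non-duplicate" labels are assumed to correspond to the routes DetectDuplicate would offer.

```python
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi flow file: content plus attributes."""
    content: dict
    attributes: dict = field(default_factory=dict)


def seed_cache(existing_ids):
    """Seeding flow: ids pulled from MongoDB are stored in the cache."""
    return set(existing_ids)


def set_hash_value(flow_file, id_field="id"):
    """UpdateAttribute step: copy the document id into the hash.value attribute."""
    flow_file.attributes["hash.value"] = str(flow_file.content[id_field])
    return flow_file


def detect_duplicate(flow_file, cache):
    """DetectDuplicate step: report whether hash.value was seen, then remember it."""
    key = flow_file.attributes["hash.value"]
    if key in cache:
        return "duplicate"
    cache.add(key)
    return "non-duplicate"


# Assumed sample data: two ids already in MongoDB, three incoming requests.
cache = seed_cache(["a1", "b2"])
incoming = [{"id": "a1"}, {"id": "c3"}, {"id": "c3"}]
routes = [detect_duplicate(set_hash_value(FlowFile(doc)), cache) for doc in incoming]
print(routes)  # ['duplicate', 'non-duplicate', 'duplicate']
```

Read against the original use case, a "duplicate" result would correspond to an update request and "non-duplicate" to a new one; with upsert on the Mongo write, both could presumably continue down the same path. The timestamp comparison from step 4 is not covered here.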
