Thanks Joe. The GetDistributedMapCache seems to be working fine. Is there a way to clear DistributedMapCache on demand?
Regards,
Sudeep

On Thu, Jan 14, 2016 at 12:42 PM, sudeep mishra <[email protected]> wrote:

> Upon building the repository we get separate .nar files, which can be
> updated in the lib directory for my requirement.
> Thanks for your help.
>
> On Thu, Jan 14, 2016 at 9:27 AM, sudeep mishra <[email protected]> wrote:
>
>> Is it possible to build the code for only a particular processor? Just
>> curious if we can build and deploy a particular processor in an existing
>> NiFi environment.
>>
>> On Wed, Jan 13, 2016 at 9:33 PM, sudeep mishra <[email protected]> wrote:
>>
>>> Thanks Joe. I will try out the patch.
>>>
>>> On Wed, Jan 13, 2016 at 9:31 PM, Joe Percivall <[email protected]> wrote:
>>>
>>>> You would need to clone the nifi source from github and then apply the
>>>> patch using git.
>>>>
>>>> Here is how to clone a repo:
>>>> https://help.github.com/articles/cloning-a-repository/
>>>> Along with the nifi repo itself: https://github.com/apache/nifi
>>>>
>>>> and how to apply a patch:
>>>> http://makandracards.com/makandra/2521-git-how-to-create-and-apply-patches
>>>>
>>>> Let me know if you have any other questions,
>>>> Joe
>>>> - - - - - -
>>>> Joseph Percivall
>>>> linkedin.com/in/Percivall
>>>> e: [email protected]
>>>>
>>>> On Wednesday, January 13, 2016 10:56 AM, sudeep mishra <[email protected]> wrote:
>>>>
>>>> Thank you very much Joe.
>>>>
>>>> Can you please let me know how I can use the .patch file? I am using
>>>> NiFi via the binaries. Do I need to set up the source code and build
>>>> it along with the patch?
>>>>
>>>> Thanks & Regards,
>>>> Sudeep
>>>>
>>>> On Wed, Jan 13, 2016 at 9:02 PM, Joe Percivall <[email protected]> wrote:
>>>>
>>>> >Hello Sudeep,
>>>> >
>>>> >I put up a patch on the GetDistributedMapCache ticket[1]. Let me know
>>>> >what you think.
>>>> >
>>>> >The PutDistributedMapCache and GetDistributedMapCache processors work
>>>> >with the data as a byte[], so they should be format agnostic. That
>>>> >being said, it will be up to you to know what is in there in order to
>>>> >use it later.
>>>> >
>>>> >[1] https://issues.apache.org/jira/browse/NIFI-1382
>>>> >
>>>> >Joe
>>>> >
>>>> >On Tuesday, January 12, 2016 11:34 PM, sudeep mishra <[email protected]> wrote:
>>>> >
>>>> >Thanks Joe.
>>>> >
>>>> >I do not have a specific configuration as of now, as I am still
>>>> >exploring NiFi. Though I think it would be helpful to let the user
>>>> >store and retrieve the cache values in different formats: JSON, Avro,
>>>> >etc.
>>>> >
>>>> >Thanks & Regards,
>>>> >Sudeep
>>>> >
>>>> >On Tue, Jan 12, 2016 at 9:15 PM, Joe Percivall <[email protected]> wrote:
>>>> >
>>>> >>Hello Sudeep,
>>>> >>
>>>> >>We are currently lacking a "GetDistributedMapCache" processor that
>>>> >>corresponds to the "PutDistributedMapCache". I created a ticket[1]
>>>> >>and will be working on it today. If you have any comments,
>>>> >>configuration suggestions, etc., please let me know or comment on the
>>>> >>ticket.
>>>> >>
>>>> >>[1] https://issues.apache.org/jira/browse/NIFI-1382
>>>> >>
>>>> >>Joe
>>>> >>
>>>> >>On Tuesday, January 12, 2016 9:46 AM, sudeep mishra <[email protected]> wrote:
>>>> >>
>>>> >>Thanks Matt.
>>>> >>
>>>> >>In my data flow I am expected to perform certain validations on data.
>>>> >>I am loading some SQL Server data into HDFS using Sqoop (not part of
>>>> >>the NiFi flow). For each record in the HDFS file I have to query
>>>> >>another database and then save the validated record again in HDFS,
>>>> >>where it will be processed by some Spark jobs.
>>>> >>
>>>> >>Since I have to query for each record, I was planning to cache the
>>>> >>database records against which I have to validate the HDFS data. That
>>>> >>is why I was evaluating the DistributedCacheServer, but it looks like
>>>> >>its purpose is different. Alternatively, can we integrate Redis or
>>>> >>another distributed cache with NiFi? I do not see any processor for
>>>> >>it.
>>>> >>
>>>> >>Appreciate your help.
>>>> >>
>>>> >>Thanks & Regards,
>>>> >>Sudeep
>>>> >>
>>>> >>On Tue, Jan 12, 2016 at 6:59 PM, Matthew Clarke <[email protected]> wrote:
>>>> >>
>>>> >>>Sudeep,
>>>> >>>     I was a little off on my second scenario. The DetectDuplicate
>>>> >>>processor uses the distributed cache service all on its own. Files
>>>> >>>that are routed through it are loaded into the cache if they do not
>>>> >>>already exist in the cache. If they do already exist, they are
>>>> >>>routed to duplicate. The PutDistributedMapCache processor was a
>>>> >>>community contribution, and there is no processor that makes use of
>>>> >>>the info that it caches.
>>>> >>>
>>>> >>>     We should probably build a processor that would make use of the
>>>> >>>data that can be loaded by the PutDistributedMapCache processor. Is
>>>> >>>there a particular use case you are trying to solve where this would
>>>> >>>be applicable?
>>>> >>>
>>>> >>>Thanks,
>>>> >>>Matt
>>>> >>>
>>>> >>>On Tue, Jan 12, 2016 at 8:11 AM, Matthew Clarke <[email protected]> wrote:
>>>> >>>
>>>> >>>>Sudeep,
>>>> >>>>     The DistributedMapCache is typically used to prevent the
>>>> >>>>consumption of duplicate data by some of the ingest-type processors
>>>> >>>>(GetHBase, ListHDFS, and ListSFTP). NiFi uses the service to keep a
>>>> >>>>listing of what has been consumed so the same files are not
>>>> >>>>consumed multiple times. The service can also be used to detect
>>>> >>>>whether duplicate data already exists within a NiFi instance or
>>>> >>>>cluster. This would be the scenario where some source is pushing
>>>> >>>>data to your NiFi and perhaps pushes the same data more than once.
>>>> >>>>You want to catch these duplicates so you can perhaps kick them out
>>>> >>>>of your flow. For this you would use the PutDistributedMapCache
>>>> >>>>processor to cache all incoming data and then use the
>>>> >>>>DetectDuplicate processor to find those duplicates.
>>>> >>>>
>>>> >>>>     Was there a different use case you were looking to solve using
>>>> >>>>the distributed cache service?
>>>> >>>>
>>>> >>>>Thanks,
>>>> >>>>Matt
>>>> >>>>
>>>> >>>>On Tue, Jan 12, 2016 at 4:36 AM, sudeep mishra <[email protected]> wrote:
>>>> >>>>
>>>> >>>>>Hi,
>>>> >>>>>
>>>> >>>>>I can cache some data to be used in a NiFi flow. I can see the
>>>> >>>>>processor PutDistributedMapCache in the documentation, which saves
>>>> >>>>>key-value pairs in DistributedMapCache for NiFi, but I do not see
>>>> >>>>>any processor to read this data. How can I read data from
>>>> >>>>>DistributedMapCache in my data flow?
>>>> >>>>>
>>>> >>>>>Thanks & Regards,
>>>> >>>>>
>>>> >>>>>Sudeep Shekhar Mishra

--
Thanks & Regards,

Sudeep Shekhar Mishra

+91-9167519029
[email protected]
