Inline.

On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura <[email protected]> wrote:

> Hi Ameer,
>
> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured?

[Ameer] It is configured to 1 sec - the lowest value allowed.

> By default, it's set to '60s'. So,
> 1. If ReplaceTextWithMapping ran with the old mapping file

[Ameer] The first processing run took place on Day 1. A new mapping file was dropped on Day 1, after the Day 1 processing was over.

> 2. and the mapping file was updated for the next processing

[Ameer] The second processing run took place on Day 2.
[Ameer] The assumption here was that the cache would be refreshed from the new mapping file dropped a day earlier. But it didn't happen. The cache got refreshed in the middle of the flow - not at the very beginning. Thus a few flowfiles got the old value and later flowfiles got the new value.

> 3. then the flow started processing another CSV file right away line by
> line
>
> In above scenario, some lines in the CSV might get processed with the
> old mapping file. After 60s passed from 1, some other lines may get
> processed with the new mappings. Is that what you're seeing?

[Ameer] This is what is happening. But it shouldn't have - because the new mapping file already existed before the next processing run began. It should have refreshed right at the start - as the code of the ReplaceTextWithMapping processor also suggests.

> BTW, please avoid posting the same question to users and dev at the
> same time. I've removed dev address.

[Ameer] Got it.

> Thanks,
> Koji
>
> On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia <[email protected]> wrote:
> >
> > Correcting Typo.
> >
> > On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia <[email protected]> wrote:
> >>
> >> Guys,
> >>
> >> It seems that the NiFi ReplaceTextWithMapping processor has a bug in
> >> refreshing its mapping file. We are using this functionality in PROD and
> >> getting odd behaviour.
> >>
> >> Our USAGE Scenario:
> >>
> >> We use NiFi primarily as a TRANSFORMATION tool.
> >> Our flow involves:
> >>
> >> 1. Getting a raw CSV file.
> >> 2. Splitting the file on a per-line basis:
> >>    So from one source flowfile we may have 10000 flowfiles
> >>    generated/split out.
> >> 3. For each of the split flowfiles (flowfiles for individual lines) we
> >>    perform transformations on the attributes.
> >> 4. We merge these flowfiles back and write the output file.
> >>
> >> As part of the transformation in Step #3, we do some mapping for one of
> >> the fields in the CSV. For this we use the ReplaceTextWithMapping processor.
> >> Also note that we update our mapping file just before starting our flow
> >> (i.e. Step #1).
> >>
> >> Our Issue:
> >>
> >> 1. We have noted that for the SAME key we get two DIFFERENT values in two
> >>    different flowfiles.
> >> 2. We noted that one of the mapped values existed in an older mapping file.
> >> 3. So in essence: the ReplaceTextWithMapping processor didn't refresh its
> >>    cache until a certain time. It thus returned the old value for a few
> >>    mappings and then - once it had refreshed its cache in the meanwhile -
> >>    returned the new, updated value.
> >> 4. And this caused the issue?
> >>
> >> Question:
> >>
> >> 1. Is this a known issue with the ReplaceTextWithMapping processor?
> >> 2. If not, how can I create an issue for this?
> >> 3. How can I confirm this behaviour?
> >>
> >> Thanks,
> >> Ameer Mawia
> >>
> >> --
> >> http://ca.linkedin.com/in/ameermawia
> >> Toronto, ON

--
http://ca.linkedin.com/in/ameermawia
Toronto, ON
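
For readers following the thread, the scenario Koji describes (mapping file updated, then lines processed right away, with the refresh only taking effect once the interval has elapsed) can be reproduced with a small simulation. This is only a sketch under stated assumptions: `IntervalRefreshedMapping` and `simulate` are hypothetical names, the clock is simulated, and nothing here is taken from the actual ReplaceTextWithMapping source. It models a cache that checks for a refresh lazily, at lookup time, and it does not by itself explain the Day-1/Day-2 case Ameer reports, where the file had already been replaced long before processing started.

```python
class IntervalRefreshedMapping:
    """Hypothetical stand-in for a mapping cache that re-reads its source
    only when at least `refresh_interval` seconds have passed since the
    last check. Not the NiFi implementation."""

    def __init__(self, load, refresh_interval, now):
        self._load = load                  # callable returning the current mapping
        self._interval = refresh_interval  # seconds between refresh checks
        self._last_check = now             # simulated time of the last check
        self._mapping = load()             # initial load

    def lookup(self, key, now):
        # The refresh check happens lazily, at lookup time.
        if now - self._last_check >= self._interval:
            self._mapping = self._load()
            self._last_check = now
        # Keys without a mapping are passed through unchanged.
        return self._mapping.get(key, key)


def simulate(refresh_interval, lines, seconds_per_line):
    """Process `lines` lookups of key "A", one every `seconds_per_line`
    simulated seconds, after the source was updated at time 0."""
    source = {"A": "old"}
    cache = IntervalRefreshedMapping(lambda: dict(source), refresh_interval, now=0.0)
    source["A"] = "new"  # the "mapping file" changes right after the first load
    return [cache.lookup("A", i * seconds_per_line) for i in range(lines)]


results = simulate(refresh_interval=60, lines=10, seconds_per_line=10)
print(results)  # six stale "old" values, then four "new" values
```

In this model the same key yields both values within one run, which matches the symptom in the merged output. Lowering the interval shrinks the stale window but does not remove it, since lookups made before the first check after the update still see the old mapping.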
