Hi Ameer,

Is ReplaceTextWithMapping's 'Concurrent Tasks' set to greater than 1? Since ReplaceTextWithMapping reloads the mapping on a single thread only, other threads may keep using the old mapping until the loading thread finishes refreshing the mapping definition.
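To illustrate the kind of race I mean, here is a minimal sketch in plain Java. It is not the processor's actual source: the class name MappingRefreshSketch, the lookup and demo methods, and the "k" / "old" / "new" values are all made up for illustration, and the refresh-interval check is left out for brevity. It only assumes that one thread at a time may reload (via tryLock) while the other threads carry on with whatever mapping is currently cached.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch, not NiFi code: a cached mapping that only one
// thread at a time is allowed to reload.
class MappingRefreshSketch {
    private final ReentrantLock refreshLock = new ReentrantLock();
    private final AtomicReference<Map<String, String>> current;
    private final Supplier<Map<String, String>> loader;

    MappingRefreshSketch(Map<String, String> initial,
                         Supplier<Map<String, String>> loader) {
        this.current = new AtomicReference<>(initial);
        this.loader = loader;
    }

    // The thread that wins tryLock() reloads the mapping. Every other
    // thread skips the reload and reads the mapping cached at that moment.
    String lookup(String key) {
        if (refreshLock.tryLock()) {
            try {
                current.set(loader.get());
            } finally {
                refreshLock.unlock();
            }
        }
        return current.get().get(key);
    }

    // Deterministic demonstration using latches instead of timing.
    // Returns { refresher's value, value seen during reload, value after reload }.
    public static String[] demo() {
        CountDownLatch loadStarted = new CountDownLatch(1);
        CountDownLatch letLoadFinish = new CountDownLatch(1);

        // A deliberately slow loader standing in for re-reading the mapping file.
        Supplier<Map<String, String>> slowLoader = () -> {
            loadStarted.countDown();
            try {
                letLoadFinish.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return Collections.singletonMap("k", "new");
        };

        MappingRefreshSketch sketch = new MappingRefreshSketch(
                Collections.singletonMap("k", "old"), slowLoader);
        String[] results = new String[3];

        // First "concurrent task": takes the lock and starts reloading.
        Thread refresher = new Thread(() -> results[0] = sketch.lookup("k"));
        refresher.start();
        try {
            loadStarted.await();
            // Second "concurrent task": the lock is held, so it uses the old mapping.
            results[1] = sketch.lookup("k");
            letLoadFinish.countDown();
            refresher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Once the reload has completed, lookups see the new mapping.
        results[2] = sketch.lookup("k");
        return results;
    }
}
```

In this sketch, MappingRefreshSketch.demo() returns {"new", "old", "new"}: the lookup that overlaps with the reload still gets the old value, which matches the mixed results you describe. If that is the cause, setting 'Concurrent Tasks' to 1 should make the mixed values disappear.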
Thanks,
Koji

On Wed, Jul 24, 2019 at 4:28 AM Ameer Mawia <[email protected]> wrote:
>
> Inline.
>
> On Mon, Jul 22, 2019 at 2:17 AM Koji Kawamura <[email protected]> wrote:
>>
>> Hi Ameer,
>>
>> How is ReplaceTextWithMapping 'Mapping File Refresh Interval' configured?
>
> [Ameer] It is configured to 1sec - the lowest value allowed.
>>
>> By default, it's set to '60s'. So,
>> 1. If ReplaceTextWithMapping ran with the old mapping file
>
> [Ameer] First Processing took place on Day-1. A new Mapping was dropped on
> Day-1, after Day-1 Processing was over.
>>
>> 2. and the mapping file was updated for the next processing
>
> [Ameer] Second Processing took place on Day-2.
> [Ameer] Here the assumption was that the CACHE would be refreshed from the new
> mapping file dropped a day earlier. But it didn't happen. The cache got
> refreshed in the middle of the flow - not at the very beginning. Thus a few
> flowfiles got the old value and later flowfiles got the new value.
>>
>> 3. then the flow started processing another CSV file right away line by line
>>
>> In above scenario, some lines in the CSV might get processed with the
>> old mapping file. After 60s passed from 1, some other lines may get
>> processed with the new mappings. Is that what you're seeing?
>>
> [Ameer] This is what is happening. But it shouldn't have - because the new
> mapping file already existed before the next processing began. It should
> have refreshed right at the start - as also suggested by the code of the
> ReplaceTextWithMapping processor.
>>
>> BTW, please avoid posting the same question to users and dev at the
>> same time. I've removed dev address.
>> [Ameer] Got it.
>> Thanks,
>> Koji
>>
>> On Sat, Jul 20, 2019 at 3:08 AM Ameer Mawia <[email protected]> wrote:
>> >
>> > Correcting Typo.
>> >
>> > On Fri, Jul 19, 2019 at 2:03 PM Ameer Mawia <[email protected]> wrote:
>> >>
>> >> Guys,
>> >>
>> >> It seems that the NIFI ReplaceTextWithMapping Processor has a BUG with
>> >> refreshing its mapping file. We are using its functionality in PROD and
>> >> getting odd behaviour.
>> >>
>> >> Our USAGE Scenario:
>> >>
>> >> We use NIFI primarily as a TRANSFORMATION Tool.
>> >> Our flow involves:
>> >>
>> >> 1. Getting a raw csv file.
>> >> 2. Split the file on a per-line basis:
>> >>    So from one source flowfile - we may have 10000 flowfiles
>> >>    generated/split out.
>> >> 3. For each of the split flowfiles (flowfiles for individual lines) we
>> >>    perform transformation on the attributes.
>> >> 4. We merge these flowfiles back and write the Output file.
>> >>
>> >> As part of the transformation in Step #3, we do some mapping for one of
>> >> the fields in the csv. For this we use the ReplaceTextWithMapping Processor.
>> >> Also to note, we update our mapping file just before starting our flow
>> >> (i.e. Step #1).
>> >>
>> >> Our Issue:
>> >>
>> >> We have noted that for the SAME key we get two DIFFERENT values in two
>> >> different flowfiles.
>> >> We noted that one of the values mapped existed in an older Mapping file.
>> >> So in essence: the ReplaceTextWithMapping Processor didn't refresh its
>> >> cache until a certain time. And thus it returned the old value for few
>> >> mapping file and then - once in the meanwhile it had refreshed its cache -
>> >> returned the new updated value.
>> >> And this causes the issue?
>> >>
>> >> Question:
>> >>
>> >> Is this a known issue with the ReplaceTextWithMapping Processor?
>> >> If not, how can I create an issue for this?
>> >> How can I confirm this behaviour?
>> >>
>> >> Thanks,
>> >> Ameer Mawia
>> >>
>> >> --
>> >> http://ca.linkedin.com/in/ameermawia
>> >> Toronto, ON
