Hi Prabhu,

As with ListHDFS, GetHTTP avoids emitting the same content twice: it uses the ETag HTTP header, and if the server returns 304 (Not Modified), it doesn't create an output FlowFile. The screenshot indicates that GetHTTP ran 61 times but created an output FlowFile only once, because the resource had not been modified.
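To illustrate the ETag behavior described above, here is a minimal sketch in Python. It is not NiFi's actual implementation: `serve` and `Poller` are hypothetical names, and the HTTP server is simulated in memory so the example runs without a network.

```python
import hashlib
from typing import List, Optional, Tuple


def serve(body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Toy server (assumption): derives an ETag from the body and honors If-None-Match."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if if_none_match == etag:
        return 304, etag, b""  # Not Modified: no body is sent
    return 200, etag, body


class Poller:
    """Hypothetical stand-in for a processor that polls one URL."""

    def __init__(self) -> None:
        self.last_etag: Optional[str] = None
        self.runs = 0
        self.outputs: List[bytes] = []

    def run_once(self, resource: bytes) -> None:
        self.runs += 1
        # Send the ETag remembered from the previous successful fetch.
        status, etag, body = serve(resource, self.last_etag)
        if status == 304:
            return  # the run is counted, but nothing is emitted
        self.last_etag = etag
        self.outputs.append(body)


poller = Poller()
for _ in range(61):
    poller.run_once(b"same content")
print(poller.runs, len(poller.outputs))  # prints "61 1"
```

Once the resource changes, the remembered ETag no longer matches, so the next run in this sketch receives a 200 and emits a new output.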
I believe that is what's happening.

Thanks,
Koji

On Wed, May 24, 2017 at 2:30 PM, prabhu Mahendran <[email protected]> wrote:
> Pierre,
>
> Thanks for your mail,
>
> I might try to list files over and over.So that may be problem i faced.I
> just modified existing files in hdfs and then list those files using
> ListHDFS.
>
> I could be list files in which same as well as last execution of a processor
> that's may be problem.
>
> Many thanks
>
>
> On Wed, May 24, 2017 at 10:45 AM, Pierre Villard
> <[email protected]> wrote:
>>
>> Just a quick remark, the ListHDFS processor won't list files over and
>> over, it'll only list new files since the last execution of the processor.
>> Do you know if new files are generated in the directory your are listing?
>>
>> Screenshots of your configurations would definitely help.
>>
>> 2017-05-24 6:55 GMT+02:00 Joe Witt <[email protected]>:
>>>
>>> prabhu - can you please share screenshots and or logs showing that it
>>> is running only once?
>>>
>>> Thanks
>>>
>>> On Wed, May 24, 2017 at 12:42 AM, prabhu Mahendran
>>> <[email protected]> wrote:
>>> > Aldrin,
>>> >
>>> > Thanks for your response.
>>> >
>>> > For GetHTTP ,I have checked to download different files even it could
>>> > not
>>> > run more than once.
>>> >
>>> > ListHDFS:I have used NiFi-1.2.0 in which configured these attributes
>>> > "Hadoop
>>> > Configuration Resources","Directory","RecurseSubDirectories" correctly
>>> > for
>>> > Hadoop-2.5.2.This runs only once not run again.
>>> >
>>> > Note: I have checked those processors in windows.
>>> >
>>> > Can you give any suggestion to solve this?
>>> >
>>> > Many Thanks,
>>> >
>>> >
>>> > On Tue, May 23, 2017 at 7:03 PM, Aldrin Piri <[email protected]>
>>> > wrote:
>>> >>
>>> >> For GetHTTP, this processor makes use of ETags[1] to prevent
>>> >> downloading
>>> >> the same resource repeatedly. I would speculate that this is the case
>>> >> for
>>> >> the resource you are specifying.
>>> >>
>>> >> As for ListHDFS, could you specify what version you are using? There
>>> >> have
>>> >> been some bugs concerning how this was handled. If the version is the
>>> >> latest, could you please provide some more details in terms of
>>> >> structure and
>>> >> timestamps of the associated files causing the issue you are
>>> >> describing?
>>> >>
>>> >>
>>> >> [1] https://en.wikipedia.org/wiki/HTTP_ETag
>>> >>
>>> >> On Tue, May 23, 2017 at 3:22 AM, prabhu Mahendran
>>> >> <[email protected]> wrote:
>>> >>>
>>> >>> Since i have faced some unexpected behaviour's in NiFi.
>>> >>>
>>> >>> I don't know why those processors which doesn't run more than once.
>>> >>>
>>> >>>
>>> >>> For example:
>>> >>>
>>> >>> 1.GetHTTP:
>>> >>>
>>> >>> I have used GetHTTP processor for download files from "HTTP" Url.
>>> >>> Initially i have scheduled 0 sec
>>> >>>
>>> >>> If i runs the processor it runs only once and not again run.Once copy
>>> >>> the
>>> >>> same processor and paste in the UI then click run that processor it
>>> >>> again
>>> >>> runs only once.
>>> >>>
>>> >>> If i scheduling it then also not runs more than once.
>>> >>>
>>> >>>
>>> >>>
>>> >>> 2.ListHDFS:
>>> >>>
>>> >>> I have configured local cluster properties in ListHDFS.
>>> >>>
>>> >>> i have 12 files in hdfs directory.If i runs without scheduling then
>>> >>> it
>>> >>> lists 12 files correctly and after scheduling it only returns 11
>>> >>> files
>>> >>> without 1 file and not run after first time run
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> can anyone explain the behaviour of those processsors when 1 day
>>> >>> scheduling in TimerDriven?
>>> >>
>>> >>
>>> >
>>
>>
>
