[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2021-06-01 Thread Kartik Mishra (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355238#comment-17355238
 ] 

Kartik Mishra commented on NIFI-3213:
-

Hi Team, I am on NiFi version 1.12.1. If I understand correctly, this issue 
should already be fixed, but we are still facing it in 1.12.1: ListFile, 
ListFTP and ListSFTP always skip the files with the latest timestamp.

> ListFile always skips files with the latest timestamp in an iteration even if 
> the files have existed a while ago
> 
>
>                 Key: NIFI-3213
>                 URL: https://issues.apache.org/jira/browse/NIFI-3213
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1
>            Reporter: Koji Kawamura
>            Assignee: Koji Kawamura
>            Priority: Major
>             Fix For: 1.2.0
>
>
> NIFI-1484 added a few lines of code to avoid emitting files that have 
> the latest timestamp within a listing iteration, because they may still be 
> being written at that time.
> This has little effect if the ListFile processor is scheduled with a short 
> run schedule, such as a few ms, but it has a negative effect if a user 
> schedules it with a longer run schedule such as "1 day" or with the cron 
> scheduler. For example, a user would expect to process the list of files on 
> a daily basis. Even if a file was saved a few hours ago, the processor will 
> skip it, because the file has the latest timestamp within the iteration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-06-14 Thread Koji Kawamura (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048867#comment-16048867
 ] 

Koji Kawamura commented on NIFI-3213:
-

 [~bende] Sorry for my late response. As you were concerned, with the change 
made in this JIRA files can be listed without waiting for an additional cycle, 
because execution doesn't go into that else statement. We shouldn't compare 
System.nanoTime with a file timestamp, as System.nanoTime uses an arbitrary 
origin that differs from one JVM to another. 

Even before this JIRA was merged, filesystems that do not provide timestamps 
with millisecond precision have had a problem: ListFile can miss some of the 
files that are written with the same timestamp at second precision. I created 
NIFI-4069 to address your concern and also to work with filesystems that have 
less accurate timestamps.

I'd appreciate it if you could take a look at NIFI-4069 and its PR. Let's 
discuss further at NIFI-4069.
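
The precision problem described above can be illustrated with a minimal Java 
sketch. It is not NiFi code: the class and method names are hypothetical, and 
truncation to whole seconds is only an assumption about how a low-precision 
filesystem reports timestamps.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not NiFi code.
class TimestampPrecisionSketch {

    // Assumption: a low-precision filesystem drops the millisecond part.
    static long truncateToSeconds(long epochMillis) {
        return TimeUnit.SECONDS.toMillis(TimeUnit.MILLISECONDS.toSeconds(epochMillis));
    }

    // A lister that only remembers the newest timestamp it has emitted
    // picks up a file only when its timestamp is strictly newer.
    static boolean wouldBeListed(long reportedTimestamp, long lastListedTimestamp) {
        return reportedTimestamp > lastListedTimestamp;
    }

    public static void main(String[] args) {
        long file1 = 1_497_400_000_100L; // already listed
        long file2 = 1_497_400_000_900L; // written 800 ms later, within the same second

        // Millisecond precision: file2 is newer than file1, so it is picked up.
        System.out.println(wouldBeListed(file2, file1)); // prints true

        // Second precision: both files report the same value, so file2 is missed.
        System.out.println(
                wouldBeListed(truncateToSeconds(file2), truncateToSeconds(file1))); // prints false
    }
}
```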



[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-05-25 Thread Bryan Bende (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025275#comment-16025275
 ] 

Bryan Bende commented on NIFI-3213:
---

I came across this JIRA while looking at a similar issue with ListHDFS, and 
I'm wondering about a couple of things...

I believe the reason for the original logic was for the following scenario:
- file1 written with time1
- processor performs listing
- file2 written with time1

Since we are only tracking timestamps and not which files were listed, if we 
include file1 in the listing then we will miss file2 on the next execution 
because we are looking for things newer than time1. If we include time1 on 
both sides then we get file1 listed twice because we don't know we listed it 
the first time. So instead we were leaving it out and getting them both next 
time, which has the drawback of a delay, but won't miss anything or have 
duplicates.
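
The scenario above can be replayed with a small Java sketch. This is a 
simplified, hypothetical model rather than the NiFi implementation: the class 
`TimestampOnlyLister`, its fields and the `scenario` helper are invented for 
illustration, and the rule "defer the newest entries unless they were already 
the newest in the previous run" is an assumption about how the one-cycle delay 
ends.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not NiFi code: the lister remembers timestamps only,
// never the names of the files it has already emitted.
class TimestampOnlyLister {
    private final boolean holdBackLatest;
    private long lastEmitted = -1L;    // newest timestamp emitted so far
    private long previousLatest = -1L; // newest timestamp seen in the previous run

    TimestampOnlyLister(boolean holdBackLatest) {
        this.holdBackLatest = holdBackLatest;
    }

    List<String> list(Map<String, Long> files) {
        long latest = -1L;
        for (long ts : files.values()) {
            latest = Math.max(latest, ts);
        }
        // Assumption: defer the newest entries for one cycle, unless they
        // were already the newest entries in the previous run.
        boolean defer = holdBackLatest && latest != previousLatest;
        previousLatest = latest;

        List<String> emitted = new ArrayList<>();
        long newest = lastEmitted;
        for (Map.Entry<String, Long> e : files.entrySet()) {
            long ts = e.getValue();
            if (ts <= lastEmitted) continue;     // not newer than the last listing
            if (defer && ts == latest) continue; // held back one cycle
            emitted.add(e.getKey());
            newest = Math.max(newest, ts);
        }
        lastEmitted = newest;
        return emitted;
    }

    // Replays: file1 written at time1, listing, file2 written at time1, listing.
    static List<List<String>> scenario(boolean holdBackLatest) {
        TimestampOnlyLister lister = new TimestampOnlyLister(holdBackLatest);
        Map<String, Long> dir = new LinkedHashMap<>();
        List<List<String>> runs = new ArrayList<>();
        dir.put("file1", 100L);
        runs.add(lister.list(dir));
        dir.put("file2", 100L);
        runs.add(lister.list(dir));
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(scenario(false)); // prints [[file1], []]
        System.out.println(scenario(true));  // prints [[], [file1, file2]]
    }
}
```

Without the hold-back, file2 never appears in a listing; with it, both files 
arrive one run later and neither is duplicated.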

With this change we are doing the following:

final long currentListingTimestamp = System.nanoTime();

Then later using that value:

else if (latestListingTimestamp >= currentListingTimestamp - LISTING_LAG_NANOS) {
    orderedEntries.remove(latestListingTimestamp);
}

What if the directory we are listing is a remote directory where the timestamps 
don't really correspond with NiFi's timestamps?

Is latestListingTimestamp in milliseconds while we are comparing it against 
currentListingTimestamp in nanoseconds?

I'm concerned that we may never go into that else statement for cases where we 
were supposed to.
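
To make the unit question concrete, here is a minimal Java sketch. It is not 
the NiFi implementation: the class and method names are hypothetical, the 
100 ms value given to LISTING_LAG_NANOS is an assumption, and the two "now" 
values are passed in as parameters so the comparison can be reasoned about 
without depending on a particular JVM.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the NiFi implementation.
class ListingLagSketch {
    // Assumed value for illustration; only the constant name comes from the snippet above.
    static final long LISTING_LAG_NANOS = TimeUnit.MILLISECONDS.toNanos(100L);

    // Mixed units: a file timestamp in epoch milliseconds is compared
    // against a System.nanoTime() style reading in nanoseconds.
    static boolean isRecentMixedUnits(long fileTimestampMillis, long nowNanos) {
        return fileTimestampMillis >= nowNanos - LISTING_LAG_NANOS;
    }

    // Same clock and same unit on both sides of the comparison.
    static boolean isRecentSameUnits(long fileTimestampMillis, long nowMillis) {
        return fileTimestampMillis
                >= nowMillis - TimeUnit.NANOSECONDS.toMillis(LISTING_LAG_NANOS);
    }

    public static void main(String[] args) {
        long fileTimestampMillis = System.currentTimeMillis() - 10L; // "written" 10 ms ago

        // The result depends on the JVM's arbitrary nanoTime origin,
        // so it says nothing about how recent the file is.
        System.out.println(isRecentMixedUnits(fileTimestampMillis, System.nanoTime()));

        // Both values are epoch milliseconds, so the lag check is meaningful.
        System.out.println(isRecentSameUnits(fileTimestampMillis, System.currentTimeMillis()));
    }
}
```

Even with matching units, a remote directory whose clock differs from NiFi's 
would still skew the comparison; the sketch does not address that case.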





[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872144#comment-15872144
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@trixpan Thank you for merging! closed.




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872145#comment-15872145
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user ijokarumawak closed the pull request at:

https://github.com/apache/nifi/pull/1335




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871895#comment-15871895
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user trixpan commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@ijokarumawak merged. However, I forgot to add a reference to the PR in the 
commit message. Would you mind closing it manually?

I thank you in advance




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871887#comment-15871887
 ] 

ASF subversion and git services commented on NIFI-3213:
---

Commit 095c04eda0c604a02c51df085ba67847448224c0 in nifi's branch 
refs/heads/master from [~ijokarumawak]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=095c04e ]

NIFI-3213: ListFile do not skip obviously old files

Before this fix, files with the latest timestamp within a listing
iteration were always held back one cycle, no matter how old they were.

Signed-off-by: Andre F de Miranda 




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-17 Thread Andre F de Miranda (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871872#comment-15871872
 ] 

Andre F de Miranda commented on NIFI-3213:
--

NIFI-3213: ListFile do not skip obviously old files
Fixed TestListFile.testFilterAge to make it consistent.
It used to use the same last modified timestamp, set at the beginning
of the test, throughout the test.
This caused varying test results because the meaning of `age1` to `age5`
can drift in the later part of the test, as the test also makes use of
Thread.sleep.
This fix resets the age variables and the last modified timestamps of the
test input files before executing the next run, so that the meaning of the
age variables stays consistent.
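
The idea behind the test fix can be sketched in Java as follows. This is a 
hypothetical helper, not the actual TestListFile code: the names 
`AgeResetSketch`, `resetAge` and `demoDriftMillis` are assumptions made for 
illustration. The point is that the timestamp is re-derived from the current 
time right before each run instead of being fixed once at the start.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not the actual TestListFile code.
class AgeResetSketch {

    // Re-derive the last modified timestamp from the current time so that
    // "age" means the same thing however long the test has already slept.
    static long resetAge(File file, long ageMillis) {
        long target = System.currentTimeMillis() - ageMillis;
        if (!file.setLastModified(target)) {
            throw new IllegalStateException("could not set last modified time on " + file);
        }
        return target;
    }

    // Creates a temp file, resets its age, and returns how far the stored
    // timestamp is from the requested one.
    static long demoDriftMillis(long ageMillis) {
        try {
            File file = File.createTempFile("age-reset", ".txt");
            try {
                long target = resetAge(file, ageMillis);
                return Math.abs(file.lastModified() - target);
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Filesystems may round timestamps, so a small drift is possible.
        System.out.println("drift in ms: " + demoDriftMillis(5_000L));
    }
}
```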



[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15871053#comment-15871053
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@trixpan Since the failing test has more Thread.sleep calls than before, 
the meaning of variables such as `age1` or `age2` became fragile in the later 
part of the test. I added another commit to fix the test case. I also rebased 
it onto the latest master just in case.

Thanks again for catching this test issue. Please review the test again, and 
let me know if you want me to squash the commits.




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869791#comment-15869791
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@trixpan Thanks for reviewing and catching the unit test failure. Yes, I'd 
like to look at it closer. I will update the PR accordingly.




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869630#comment-15869630
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user trixpan commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@ijokarumawak my bad. I was just running an extra set of compilations and I 
noticed that under certain conditions there seems to be a race condition 
affecting testFilterAge:

```
Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.66 sec <<< FAILURE! - in org.apache.nifi.processors.standard.TestListFile
testFilterAge(org.apache.nifi.processors.standard.TestListFile)  Time elapsed: 1.212 sec  <<< FAILURE!
org.junit.ComparisonFailure: expected: but was:
    at org.junit.Assert.assertEquals(Assert.java:115)
    at org.junit.Assert.assertEquals(Assert.java:144)
    at org.apache.nifi.processors.standard.TestListFile.testFilterAge(TestListFile.java:223)

```

Do you want to have a look at it?




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869463#comment-15869463
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user trixpan commented on the issue:

https://github.com/apache/nifi/pull/1335
  
LGTM merging




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2017-02-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869374#comment-15869374
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user trixpan commented on the issue:

https://github.com/apache/nifi/pull/1335
  
@ijokarumawak reviewing it




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2016-12-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753933#comment-15753933
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

Github user ijokarumawak commented on the issue:

https://github.com/apache/nifi/pull/1335
  
I believe the old behavior, which always postpones emitting files with the 
latest timestamp, is counterintuitive and can be problematic for some use 
cases where a user would like a longer run schedule. Please 
correct me if I'm missing any important purpose for this behavior.

I had to fix many unit test cases because they were written with the 
assumption that the latest file should be skipped. Again, I believe those test 
cases became more natural and understandable with this PR, but I may be missing 
something.

Thanks for reviewing in advance!




[jira] [Commented] (NIFI-3213) ListFile always skips files with the latest timestamp in an iteration even if the files have existed a while ago

2016-12-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753922#comment-15753922
 ] 

ASF GitHub Bot commented on NIFI-3213:
--

GitHub user ijokarumawak opened a pull request:

https://github.com/apache/nifi/pull/1335

NIFI-3213: Do not skip obviously old files.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI- where  is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder?
- [x] Have you written or updated unit tests to verify your changes?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.

Before this fix, files with the latest timestamp within a listing
iteration were always held back one cycle, no matter how old they were.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijokarumawak/nifi nifi-3213

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/1335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1335


commit 9bfae4dda5f6e1fa37242207ee3b284da79dffaf
Author: Koji Kawamura 
Date:   2016-12-16T08:48:06Z

NIFI-3213: Do not skip obviously old files.

Before this fix, files with the latest timestamp within a listing
iteration were always held back one cycle, no matter how old they were.



