Seth Kelly created CAMEL-6936:
---------------------------------

             Summary: FTP route with idempotent repo does not detect modified files
                 Key: CAMEL-6936
                 URL: https://issues.apache.org/jira/browse/CAMEL-6936
             Project: Camel
          Issue Type: Bug
          Components: camel-ftp
    Affects Versions: 2.12.1
            Reporter: Seth Kelly


Per my forum post:
http://camel.465427.n5.nabble.com/inProgressRepository-Not-clearing-for-items-in-idempotentRepository-td5742613.html

I'm attempting to consume messages from an FTP server using an idempotent 
repository to ensure that I do not re-download a file unless it has been 
modified. 

Here is my (quite simple) camel configuration: 
{code}
        <beans:bean id="downloadRepo" 
class="org.apache.camel.processor.idempotent.FileIdempotentRepository" >
                <beans:property name="fileStore" value="/tmp/.repo.txt"/>
                <beans:property name="cacheSize" value="25000"/>
                <beans:property name="maxFileStoreSize" value="1000000"/>
        </beans:bean>

        <camelContext trace="true" 
xmlns="http://camel.apache.org/schema/spring">
                <endpoint id="myFtpEndpoint" 
uri="ftp://me@localhost?password=****&amp;binary=true&amp;recursive=true&amp;consumer.delay=15000&amp;readLock=changed&amp;passiveMode=true&amp;noop=true&amp;idempotentRepository=#downloadRepo&amp;idempotentKey=$simple{file:name}-$simple{file:modified}"
 />
                <endpoint id="myFileEndpoint" uri="file:///tmp/files"/>

        <route>
            <from uri="ref:myFtpEndpoint" />
            <to uri="ref:myFileEndpoint" />
        </route>
        </camelContext>
{code}
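
To illustrate what the idempotentKey above is meant to achieve, here is a minimal sketch in plain Java. It does not use Camel at all: the class IdempotentKeySketch, its methods, and the use of a long timestamp for the modification time are assumptions made purely for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for an idempotent repository; not Camel's implementation.
class IdempotentKeySketch {
    // Keys of files that have already been consumed.
    private final Set<String> seen = new HashSet<>();

    // Builds a key in the spirit of "${file:name}-${file:modified}".
    // Representing the modification time as a long is an assumption.
    static String key(String name, long modified) {
        return name + "-" + modified;
    }

    // Returns true when the key has not been seen, i.e. the file should be downloaded.
    boolean shouldDownload(String name, long modified) {
        return seen.add(key(name, modified));
    }

    public static void main(String[] args) {
        IdempotentKeySketch repo = new IdempotentKeySketch();
        System.out.println(repo.shouldDownload("test1.txt", 1000L)); // true: first time seen
        System.out.println(repo.shouldDownload("test1.txt", 1000L)); // false: unchanged file
        System.out.println(repo.shouldDownload("test1.txt", 2000L)); // true: file was modified
    }
}
```

Because the modification time is part of the key, a modified file produces a key the repository has never seen, which is the behavior I expect from the route.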

When I start my application for the first time, all files are correctly 
downloaded from the FTP server and stored in the target directory, as well as 
recorded in the idempotent repo. 

When I restart my application, all files are correctly detected as being in the 
idempotent repo already on the first poll of the FTP server, and are not 
re-downloaded: 

2013-11-04 16:52:10,811 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] 
org.apache.camel.component.file.remote.FtpConsumer: FtpFile[name=test1.txt, 
dir=false, file=true] 
2013-11-04 16:52:10,811 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] 
org.apache.camel.component.file.remote.FtpConsumer: This consumer is idempotent 
and the file has been consumed before. Will skip this file: 
RemoteFile[test1.txt] 

However, on all subsequent polls to the FTP server the idempotent check is 
short-circuited because the file is "in progress": 

2013-11-04 16:53:10,886 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] 
org.apache.camel.component.file.remote.FtpConsumer: FtpFile[name=test1.txt, 
dir=false, file=true]
2013-11-04 16:53:10,886 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] 
org.apache.camel.component.file.remote.FtpConsumer: Skipping as file is already 
in progress: test1.txt 

I am using camel-ftp:2.11.1 (and observe the same behavior with 2.12.1). When I 
inspect the source code I notice two interesting things. 
First, the GenericFileConsumer check that determines whether a file is already 
in progress, isInProgress(), is called from isValidFile() and always adds the 
file to the inProgressRepository as a side effect: 
{code}
    protected boolean isInProgress(GenericFile<T> file) { 
        String key = file.getAbsoluteFilePath(); 
        return !endpoint.getInProgressRepository().add(key); 
    } 
{code}

Second, if a file is determined to match an entry already present in the 
idempotent repository it is discarded (GenericFileConsumer.isValidFile() 
returns false).  This means it is never published to an exchange, and thus 
never reaches the code which would remove it from the inProgressRepository. 

Since the in-progress check happens before the idempotent check, every poll 
after the first short-circuits on the in-progress state, and the file is never 
checked against the idempotent repository again, even if it has been modified. 
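
The sequence described above can be reproduced with a small simulation in plain Java. This is a simplified sketch, not Camel's actual code: the class InProgressOrderingSketch, its poll() method, and the releaseOnSkip flag are hypothetical names, and releasing the in-progress key on an idempotent skip is only one possible remedy, shown for comparison.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the consumer's file validation; not Camel's actual code.
class InProgressOrderingSketch {
    private final Set<String> inProgress = new HashSet<>();
    private final Set<String> idempotent = new HashSet<>();
    // Hypothetical switch: release the in-progress key when a file is skipped.
    private final boolean releaseOnSkip;

    InProgressOrderingSketch(boolean releaseOnSkip) {
        this.releaseOnSkip = releaseOnSkip;
    }

    // Records a key as already consumed, as if loaded from the file store on restart.
    void markConsumed(String idempotentKey) {
        idempotent.add(idempotentKey);
    }

    // Returns a short description of what one poll decides for the file.
    String poll(String path, String idempotentKey) {
        // Mirrors isInProgress(): add() returns false when the key is already present.
        if (!inProgress.add(path)) {
            return "in progress";
        }
        if (idempotent.contains(idempotentKey)) {
            if (releaseOnSkip) {
                inProgress.remove(path); // hypothetical remedy
            }
            return "idempotent skip";
        }
        idempotent.add(idempotentKey);
        inProgress.remove(path); // stands in for cleanup after the exchange completes
        return "consumed";
    }

    public static void main(String[] args) {
        InProgressOrderingSketch observed = new InProgressOrderingSketch(false);
        observed.markConsumed("test1.txt-1000");
        System.out.println(observed.poll("test1.txt", "test1.txt-1000")); // idempotent skip
        System.out.println(observed.poll("test1.txt", "test1.txt-1000")); // in progress
        System.out.println(observed.poll("test1.txt", "test1.txt-2000")); // in progress

        InProgressOrderingSketch released = new InProgressOrderingSketch(true);
        released.markConsumed("test1.txt-1000");
        System.out.println(released.poll("test1.txt", "test1.txt-1000")); // idempotent skip
        System.out.println(released.poll("test1.txt", "test1.txt-2000")); // consumed
    }
}
```

In the first run the modified file (key test1.txt-2000) is never compared against the idempotent repository, which matches the log output I see. In the second run the key is released on skip, so the modification is picked up.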


