[ 
https://issues.apache.org/jira/browse/FLUME-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699198#comment-13699198
 ] 

Phil Scala commented on FLUME-2119:
-----------------------------------

I too see this issue at times, mostly from human error: someone wants to 
spool a file and does not realize it is already there (in a completed state). 
I have a local patch in progress that allows for this scenario by loosening 
the spooled file source policies. My current implementation is a setting on 
the spooling directory source called "useStrictSpooledFilePolicies". In places 
where a new IllegalStateException(message) was thrown, I now log the error and 
check the setting, throwing the exception only when it is set to "true" (i.e. 
be strict).
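
A minimal sketch of that idea, for illustration only. It is not the actual 
patch and not Flume's ReliableSpoolingFileEventReader: the class and method 
names are hypothetical, and deleting the re-spooled file in the lenient case 
is an assumption about what a non-strict mode could do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for the source's "mark file as completed" step. */
public class SpoolPolicySketch {
    // Mirrors the proposed "useStrictSpooledFilePolicies" setting.
    private final boolean useStrictSpooledFilePolicies;

    public SpoolPolicySketch(boolean useStrictSpooledFilePolicies) {
        this.useStrictSpooledFilePolicies = useStrictSpooledFilePolicies;
    }

    /**
     * Renames a fully read file to <name>.COMPLETED.
     * Returns true if the rename happened, false if a collision was tolerated.
     */
    public boolean markCompleted(Path file) throws IOException {
        Path dest = file.resolveSibling(file.getFileName() + ".COMPLETED");
        if (Files.exists(dest)) {
            String message = "Completed file already exists: " + dest;
            // Always log the problem, whatever the policy.
            System.err.println("ERROR " + message);
            if (useStrictSpooledFilePolicies) {
                // Strict: fail exactly where the exception used to be thrown.
                throw new IllegalStateException(message);
            }
            // Lenient (assumption): drop the duplicate so it is not read again.
            Files.delete(file);
            return false;
        }
        Files.move(file, dest);
        return true;
    }
}
```

With the flag set to false, a second FileA is logged and discarded instead of 
stopping the source; with it set to true, the exception is thrown as before.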



Note that this will lead to duplicate events in the sink's store, since the 
same file contents are spooled again, so it needs to be used with caution.

I can submit my patch for this if others see value.

For an immediate workaround, use the delete policy in 1.4, set to 
"IMMEDIATE", which deletes each file once it has been spooled instead of 
saving a .COMPLETED file. This of course means you will not have any 
.COMPLETED files to use as proof of spooling.
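
As a rough configuration fragment, the workaround might look like the 
following. The agent name "a1", the source name "src1" and the directory path 
are placeholders, and the channel wiring is omitted. The property is 
"deletePolicy" on the spooling directory source; the Flume user guide lists 
its values in lowercase ("never" or "immediate").

```properties
# Placeholder agent/source names and path; adapt to your own agent.
a1.sources = src1
a1.sources.src1.type = spooldir
a1.sources.src1.spoolDir = /var/log/flume-spool
# Delete each file once it is fully ingested instead of renaming it.
a1.sources.src1.deletePolicy = immediate
```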


                
> duplicate files cause flume to enter irrecoverable state
> --------------------------------------------------------
>
>                 Key: FLUME-2119
>                 URL: https://issues.apache.org/jira/browse/FLUME-2119
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>            Reporter: Jonathan Cooper-Ellis
>
> If a spoolingdir receives FileA, then after it is picked up by Flume and 
> renamed to FileA.COMPLETED, placing another file of the same original name 
> (FileA) will cause Flume to log an IllegalStateException indefinitely. This 
> is likely due to Flume attempting to rename the second FileA to 
> FileA.COMPLETED, but finding that the file already exists.
> Once Flume has entered this state, it can only be recovered by removing the 
> .COMPLETED file from the directory and restarting the agent.
> Log message looks like this:
> 02 Jul 2013 21:32:09,371 ERROR [pool-4-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:164)  - Uncaught exception in Runnable
> java.lang.IllegalStateException: Serializer has been closed
>         at org.apache.flume.serialization.LineDeserializer.ensureOpen(LineDeserializer.java:124)
>         at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:88)
>         at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
>         at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
