[ 
https://issues.apache.org/jira/browse/FLUME-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xingxing Di updated FLUME-3308:
-------------------------------
    Description: 
We found that TailDirSource has a critical issue when writing the position JSON file:

1. In the "process" method, calling existingInodes.clear() and then 
existingInodes.addAll() is not thread-safe. The positionWriter thread reads 
existingInodes without a lock, so it can see an empty list (just after clear() 
has executed and before addAll() is called), which produces an empty JSON string.

2. FileWriter does not write atomically. The position JSON file is over 5 MB in 
our case (which is big), and even after fixing the issue above we still 
occasionally read an empty position JSON file.

If Flume is restarted and reads an empty position JSON file, it will tail 
all files from the beginning, which is critical!

So we made a small change:
 # Add a lock for the existingInodes list, and have the positionWriter thread 
make a copy of existingInodes every time it writes.
 # Replace FileWriter with an AtomicFileWriter.
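
The first change could look roughly like the sketch below. It is only an 
illustration under assumptions: InodeTracker, replaceAll() and snapshot() are 
hypothetical names, not the actual TailDirSource code. The point is that 
clear() and addAll() run under one lock, and the writer thread copies the list 
under the same lock before serializing it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical holder for the inode list; not Flume's actual class.
final class InodeTracker {
    private final Object lock = new Object();
    private final List<Long> existingInodes = new ArrayList<>();

    // Called from the source's process loop. clear() and addAll() run under
    // one lock, so no reader can observe the list between the two calls.
    void replaceAll(Collection<Long> currentInodes) {
        synchronized (lock) {
            existingInodes.clear();
            existingInodes.addAll(currentInodes);
        }
    }

    // Called from the position-writer thread. Returns a private copy taken
    // under the same lock, so JSON serialization can run without holding it.
    List<Long> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(existingInodes);
        }
    }
}
```

Copying under the lock keeps the critical section short: the potentially slow 
serialization of a large position file then works on the copy only.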

I will submit a PR later to share this.
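
Until the PR is available, here is a minimal sketch of one common way to 
replace a file atomically: write the full content to a temporary file in the 
same directory, then rename it over the target. AtomicPositionFile and its 
write() method are hypothetical names chosen for illustration, and this is not 
necessarily how the AtomicFileWriter mentioned above works.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; the AtomicFileWriter from this report is not shown here.
final class AtomicPositionFile {
    static void write(Path target, String json) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Write the complete content to a temp file in the same directory,
        // so the rename below stays within one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, json.getBytes(StandardCharsets.UTF_8));
            // Readers see either the old file or the new one, never a
            // truncated one. Whether an existing target is replaced is
            // implementation specific; on POSIX systems it is.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if the write failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

This sketch does not force the data to disk before the rename, so it protects 
against a reader or a process restart seeing a half-written file, not 
necessarily against a power failure.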

 


> TailDirSource will write an empty position json file
> ----------------------------------------------------
>
>                 Key: FLUME-3308
>                 URL: https://issues.apache.org/jira/browse/FLUME-3308
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.8.0
>         Environment: Flume 1.8, TailDirSource
>            Reporter: Xingxing Di
>            Priority: Critical


