> On Sept. 25, 2012, 10:15 p.m., Hari Shreedharan wrote:
> > Why not do this in the LogFile class, in the put/take/commit methods? You 
> > could simply put in a preconditions check in that method right? That way we 
> > can avoid checking for this every second, rather do it when an actual 
> > operation is about to happen.
> 
> Brock Noland wrote:
>     My hope was to have a common pattern that protects both checkpoint and 
> log writes.  The other advantage is that we can easily be sure we don't roll 
> each time someone wants to write and it fails. The patch today doesn't 
> actually check the checkpointDir like it should.
>     
>     I am not +1 on this patch yet. I think we should look at the 
> getUsableSpace API to see how expensive it is; if it can be called in the 
> LogWriter, it might make sense to do so. If we do this, we'd want to be 
> sure we don't roll each time the write fails.

How does this affect the checkpoint? The checkpoint file has a fixed size, and 
the other files are small and unlikely to vary in size, so I don't think that 
case is much of a concern. The main concern is the log files: if the disk is 
full, the channel will roll indefinitely and eventually leave several 
zero-byte log files behind.
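
A minimal sketch of the per-operation check suggested above, for illustration only. The class and method names (FreeSpaceCheck, hasEnoughSpace, checkBeforeWrite) are hypothetical and not taken from the patch; the 50MB minimum comes from the review description, and File.getUsableSpace() is the standard JDK call under discussion. Whether the call is cheap enough to run on every put/take/commit is exactly the open question, so treat this as a starting point for measurement rather than a recommendation.

```java
import java.io.File;

/** Hypothetical helper: checks free space at the moment an operation runs. */
class FreeSpaceCheck {
    // 50MB, the minimum mentioned in the review description.
    static final long MIN_REQUIRED_BYTES = 50L * 1024L * 1024L;

    /** Returns true when the directory reports at least minRequired usable bytes. */
    static boolean hasEnoughSpace(File dir, long minRequired) {
        return dir.getUsableSpace() >= minRequired;
    }

    /**
     * Fails fast with IllegalStateException when space is low, in the same
     * spirit as a preconditions check at the top of put/take/commit.
     */
    static void checkBeforeWrite(File dir, long minRequired) {
        long usable = dir.getUsableSpace();
        if (usable < minRequired) {
            throw new IllegalStateException("Directory " + dir + " has " + usable
                + " bytes available which is less than the required minimum of "
                + minRequired + " bytes");
        }
    }
}
```

A caller would invoke FreeSpaceCheck.checkBeforeWrite(logDir, FreeSpaceCheck.MIN_REQUIRED_BYTES) before touching the file, and would have to skip the roll when the check fails, otherwise the zero-byte log file problem remains.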


- Hari


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7267/#review11908
-----------------------------------------------------------


On Sept. 25, 2012, 6:48 p.m., Brock Noland wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7267/
> -----------------------------------------------------------
> 
> (Updated Sept. 25, 2012, 6:48 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> Disables writing to the log when there is less than 50MB free in any of the 
> log directories. 50MB is not configurable but it seems like a trivial minimum 
> amount to support considering most disks are now several terabytes.
> 
> It might make sense to make this configurable and increase the default to 
> 100MB or something?
> 
> Also changes the background worker to ensure that the available space is 
> checked once per second.
> 
> 
> This addresses bug FLUME-1609.
>     https://issues.apache.org/jira/browse/FLUME-1609
> 
> 
> Diffs
> -----
> 
>   flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java 64725dd 
> 
> Diff: https://reviews.apache.org/r/7267/diff/
> 
> 
> Testing
> -------
> 
> Manual testing is required as we do not have a file system abstraction.
> 
> To test this, I first created a 100MB in-memory file system.
> 
> I then filled the file system to ensure that file channel would refuse to 
> replay with the file system full.
> 
> Creating the big file:
> {noformat}
> # dd if=/dev/zero of=/mnt/tmpfs/bigfile
> dd: writing to `/mnt/tmpfs/bigfile': No space left on device
> 204393+0 records in
> 204392+0 records out
> 104648704 bytes (105 MB) copied, 13.6317 seconds, 7.7 MB/s
> {noformat}
> 
> The channel refused to start:
> {noformat}
> 2012-09-25 13:13:51,111 (lifecycleSupervisor-1-0) [ERROR - org.apache.flume.channel.file.FileChannel.start(FileChannel.java:269)] Failed to start the file channel [channel=channel1]
> java.lang.IllegalStateException: No available free space
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>         at org.apache.flume.channel.file.Log.replay(Log.java:285)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:258)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> 
> 
> I then deleted the file, stopped the channel, and created a 40MB file. In 
> this case, the agent would start, do work, and after some period of time 
> the file system would become full. I expected to see an error message when 
> this occurred:
> 
> Creating the file:
> {noformat}
> # dd if=/dev/zero of=/mnt/tmpfs/bigfile count=80000
> {noformat}
> 
> The channel stopped allowing writes:
> {noformat}
> 2012-09-25 13:17:40,572 (Log-BackgroundWorker-channel1) [ERROR - org.apache.flume.channel.file.Log.checkFreeSpace(Log.java:898)] Log is being disabled because [/tmp/flume/data1-1, /tmp/flume/data1-2, /tmp/flume/data1-3] have exceeded their available space. [channel=channel1]
> 2012-09-25 13:17:40,573 (Log-BackgroundWorker-channel1) [ERROR - org.apache.flume.channel.file.Log.checkFreeSpace(Log.java:905)] Directory /tmp/flume/data1-1 has 51179520 bytes available which is less than the required minimum of 52428800 bytes
> 2012-09-25 13:17:40,575 (Log-BackgroundWorker-channel1) [ERROR - org.apache.flume.channel.file.Log.checkFreeSpace(Log.java:905)] Directory /tmp/flume/data1-2 has 51179520 bytes available which is less than the required minimum of 52428800 bytes
> 2012-09-25 13:17:40,576 (Log-BackgroundWorker-channel1) [ERROR - org.apache.flume.channel.file.Log.checkFreeSpace(Log.java:905)] Directory /tmp/flume/data1-3 has 51179520 bytes available which is less than the required minimum of 52428800 bytes
> 2012-09-25 13:17:40,581 (PollableSourceRunner-SequenceGeneratorSource-source1-2) [ERROR - org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:156)] Unhandled exception, logging and sleeping for 5000ms
> java.lang.IllegalStateException: No available free space
>       at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>       at org.apache.flume.channel.file.Log.rollback(Log.java:503)
>       at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doRollback(FileChannel.java:530)
>       at org.apache.flume.channel.BasicTransactionSemantics.rollback(BasicTransactionSemantics.java:168)
>       at org.apache.flume.channel.ChannelProcessor.processEvent(ChannelProcessor.java:269)
>       at org.apache.flume.source.SequenceGeneratorSource.process(SequenceGeneratorSource.java:68)
>       at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:139)
>       at java.lang.Thread.run(Thread.java:662)
> {noformat}
> 
> I then deleted the file and expected the channel to continue working after a 
> period of time. It did.
> 
> {noformat}
> 2012-09-25 13:18:26,028 (Log-BackgroundWorker-channel1) [INFO - org.apache.flume.channel.file.Log.checkFreeSpace(Log.java:893)] Log was disabled because a log directory has too little free space. However, the space has cleared and the log will be enabled. [channel=channel1]
> ...
> 2012-09-25 13:18:37,179 (PollableSourceRunner-SequenceGeneratorSource-source1-2) [DEBUG - org.apache.flume.channel.file.LogFile$Writer.preallocate(LogFile.java:253)] Preallocating at position 5242788
> 2012-09-25 13:18:37,189 (PollableSourceRunner-SequenceGeneratorSource-source1-1) [INFO - org.apache.flume.channel.file.Log.roll(Log.java:736)] Roll start /tmp/flume/data1-2
> 2012-09-25 13:18:37,190 (PollableSourceRunner-SequenceGeneratorSource-source1-1) [INFO - org.apache.flume.channel.file.LogFile$Writer.<init>(LogFile.java:138)] Opened /tmp/flume/data1-2/log-5
> 2012-09-25 13:18:37,192 (PollableSourceRunner-SequenceGeneratorSource-source1-1) [INFO - org.apache.flume.channel.file.LogFile$Writer.close(LogFile.java:236)] Closing /tmp/flume/data1-2/log-2
> 2012-09-25 13:18:37,193 (PollableSourceRunner-SequenceGeneratorSource-source1-2) [DEBUG - org.apache.flume.channel.file.LogFile$Writer.preallocate(LogFile.java:253)] Preallocating at position 0
> 2012-09-25 13:18:37,193 (PollableSourceRunner-SequenceGeneratorSource-source1-1) [INFO - org.apache.flume.channel.file.Log.roll(Log.java:751)] Roll end
> {noformat} 
> 
> 
> Thanks,
> 
> Brock Noland
> 
>
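
For comparison, the background-worker approach described in the quoted review request (check once per second, disable writes while space is low, re-enable once it clears) could look roughly like the sketch below. This is not the code from the patch: SpaceMonitor and its methods are hypothetical names, the usable-space reading is injected through a LongSupplier so the logic can be exercised without filling a disk, and it uses Java 8 lambdas even though the stack traces above come from an older JVM.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a periodic check flips a flag that guards writes. */
class SpaceMonitor {
    private final LongSupplier usableBytes; // e.g. () -> dir.getUsableSpace()
    private final long minRequired;
    // Written by the background thread, read by writer threads.
    private volatile boolean enabled = true;

    SpaceMonitor(LongSupplier usableBytes, long minRequired) {
        this.usableBytes = usableBytes;
        this.minRequired = minRequired;
    }

    /** One pass of the check; logs only when the state changes. */
    void check() {
        boolean ok = usableBytes.getAsLong() >= minRequired;
        if (enabled && !ok) {
            System.err.println("Log is being disabled: too little free space");
        } else if (!enabled && ok) {
            System.out.println("Space has cleared, log will be enabled");
        }
        enabled = ok;
    }

    boolean isEnabled() {
        return enabled;
    }

    /** Writers consult the flag, so a failed write never triggers a roll. */
    void write(Runnable operation) {
        if (!enabled) {
            throw new IllegalStateException("No available free space");
        }
        operation.run();
    }

    /** Runs check() once per second; the caller must shut the executor down. */
    ScheduledExecutorService start() {
        ScheduledExecutorService worker = Executors.newSingleThreadScheduledExecutor();
        worker.scheduleWithFixedDelay(this::check, 0, 1, TimeUnit.SECONDS);
        return worker;
    }
}
```

The trade-off against a per-operation check is the window of up to one second in which the flag is stale: writes can still be attempted after the disk has filled, or still be refused after space has been freed.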
