Re: Review Request: Low throughput of FileChannel

Brock Noland Fri, 03 Aug 2012 05:38:46 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6329/#review9825
-----------------------------------------------------------



1) This would completely eliminate the guarantee FileChannel makes. It sounds 
to me like you want a disk spooling channel which does not make strict 
durability guarantees. This is not what FileChannel does today. If you want 
faster throughput, increase the batch size.
2) This is to eliminate corruption when we reach a full disk.  Also, writing 
over existing data is faster than writing the first time. I think we could 
improve this. I think we should allocate the length but not the data itself. 
This eliminates the metadata update when we write. Then we could just stop 
writing to disk when it's nearly full.
3) Have you done tests that show it's faster? Either way it should lead to 
a write system call.
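
To make points 1) and 2) concrete, here is a rough sketch of what I mean. It 
is not the LogFile code; the class and method names are made up for 
illustration. It assumes setLength() only extends the file length without 
writing data blocks (this depends on the file system), and that a larger 
batch means one force(false) is paid per batch rather than per event.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.List;

// Hypothetical helper, not part of Flume: names are for illustration only.
public class BatchedForceSketch {

    // Point 2: reserve the file length up front without writing the data.
    // Later writes inside this range do not need to grow the file length.
    static void reserveLength(RandomAccessFile file, long bytes) throws IOException {
        if (file.length() < bytes) {
            file.setLength(bytes);
        }
    }

    // Point 1: write every event of a batch, then force once per batch.
    // The durability guarantee is kept; its cost is spread over the batch.
    static long commitBatch(FileChannel channel, long position, List<byte[]> events)
            throws IOException {
        for (byte[] event : events) {
            ByteBuffer buffer = ByteBuffer.wrap(event);
            while (buffer.hasRemaining()) {
                // Positional write; returns the number of bytes written.
                position += channel.write(buffer, position);
            }
        }
        // false = flush file content only, not file metadata.
        channel.force(false);
        return position;
    }
}
```

A caller would open the file with new RandomAccessFile(file, "rw"), call 
reserveLength() once, and then call commitBatch() with the channel from 
getChannel() for each transaction, carrying the returned position forward.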

- Brock Noland


On Aug. 3, 2012, 9:39 a.m., Denny Ye wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6329/
> -----------------------------------------------------------
> 
> (Updated Aug. 3, 2012, 9:39 a.m.)
> 
> 
> Review request for Flume, Hari Shreedharan and Patrick Wendell.
> 
> 
> Description
> -------
> 
> Here is a description of the code changes:
> 1. Remove the 'FileChannel.force(false)' call. Each commit from a Source 
> invokes this 'force' method, which is too heavy for large amounts of 
> incoming data. Each 'force' call takes 50-500ms to confirm that the data 
> has been stored on disk. Normally, the OS flushes data from the kernel 
> buffer to disk asynchronously with ms-level latency, so forcing on every 
> commit may be unnecessary. Admittedly, data loss may occur on a server 
> crash, though not on a process crash. Server crashes are infrequent.
> 2. Do not pre-allocate disk space. The disk does not need pre-allocation.
> 3. Use 'RandomAccessFile.write()' instead of 'FileChannel.write()'. Both in 
> my test results and at the low-level instruction level, the former performs 
> better than the latter.
> 
> Those are the three changes posted here. I would also like to use a 
> thread-level cached DirectByteBuffer to replace the on-heap 
> ByteBuffer.allocate() (reusing off-heap memory to reduce the time spent 
> copying from the heap to the kernel). I will test this change in the next 
> phase.
> 
> After tuning, throughput increased from 5MB to 30MB.
> 
> 
> This addresses bug FLUME-1423.
>     https://issues.apache.org/jira/browse/FLUME-1423
> 
> 
> Diffs
> -----
> 
>   trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java 1363210 
> 
> Diff: https://reviews.apache.org/r/6329/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Denny Ye
> 
>
