Re: How to efficiently upload file chunks

Chet L Fri, 12 Jan 2018 12:47:55 -0800

Martin,

Minor note: if you later decide to checksum the chunks before writing them 
out (chunk [data] + checksum [metadata]), you will end up reading the chunk 
contents from NFS anyway, whichever of the two approaches you pick.
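
To illustrate, here is a minimal sketch, assuming the chunk is mapped 
read-only and checksummed with CRC32 from java.util.zip. The class and 
method names (ChunkChecksum, crc32OfChunk) are made up for this example 
and are not part of AHC or netty.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only.
final class ChunkChecksum {

    /** Maps [offset, offset + length) read-only and returns its CRC32. */
    static long crc32OfChunk(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer chunk = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            CRC32 crc = new CRC32();
            // update(ByteBuffer) consumes every remaining byte of the buffer,
            // so every page of the mapping is touched (and has to be fetched
            // from the file, i.e. over NFS) before any upload starts.
            crc.update(chunk);
            return crc.getValue();
        }
    }
}
```

Once the checksum has walked the whole chunk, the lazy-loading property of 
the mapping no longer buys much for that chunk.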


Chetan


On Sunday, September 10, 2017 at 6:08:48 AM UTC-7, Martin Grotzke wrote:
>
> Hi, 
>
> TL;DR: my question is about the difference between MappedByteBuffer and 
> direct ByteBuffers when uploading chunks of a file from NFS. 
>
> Details: I want to upload file chunks to some cloud storage. Input files 
> are several GB large (say something between 1 and 100), accessed via NFS on 
> 64bit Linux/CentOS. An input file has to be split into chunks of ~ 1 to 
> 10 MB (a file has to be split by some index, i.e. I have a list of 
> byte-ranges for a file). 
>
> I'm planning to use async-http-client (AHC) to upload file chunks via 
> `setBody(ByteBuffer)` [1]. 
>
> My two favourites for splitting the file into chunks (ByteBuffers) are 
> 1) FileChannel.map -> MappedByteBuffer 
> 2) FileChannel.read(ByteBuffer) -> a (pooled) direct ByteBuffer 
>
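
For reference, the two options could be sketched roughly as below. This is 
only an illustration under assumptions: ChunkSources, mapChunk and readChunk 
are hypothetical names, and the target buffer is assumed to come from 
whatever pool you use and to be at least `length` bytes large.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, for illustration only.
final class ChunkSources {

    /** Option 1: map the byte range. File data is typically loaded lazily, on first access. */
    static MappedByteBuffer mapChunk(FileChannel ch, long offset, int length) throws IOException {
        return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
    }

    /** Option 2: copy the byte range into a caller-supplied (e.g. pooled) direct buffer. */
    static ByteBuffer readChunk(FileChannel ch, ByteBuffer target, long offset, int length)
            throws IOException {
        target.clear();
        target.limit(length); // throws IllegalArgumentException if length > capacity
        long pos = offset;
        // A single read may return fewer bytes than requested, so loop until full.
        while (target.hasRemaining()) {
            int n = ch.read(target, pos); // positional read, channel position untouched
            if (n < 0) {
                throw new EOFException("chunk extends past end of file");
            }
            pos += n;
        }
        target.flip(); // make the chunk readable from index 0 to length
        return target;
    }
}
```

Either result is a ByteBuffer holding exactly the chunk, so both can be 
handed to `setBody(ByteBuffer)` in the same way.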
> My understanding of 1) is that the MappedByteBuffer would represent a 
> segment of virtual memory, so that the OS would not even have to load 
> the data from NFS (during mmap'ing, as long as the MappedByteBuffer is 
> not read). When AHC/netty writes the buffer to the output (socket) 
> channel, the OS/kernel loads data from NFS into the page cache and then 
> writes these pages to the network socket (and to be honest, I have no 
> clue how the NFS API works and how the kernel loads the file chunks). 
>
> Is this understanding correct? 
>
> My understanding of 2) is that on FileChannel.read(ByteBuffer) the OS 
> would read data from NFS and copy it into the memory region backing the 
> direct ByteBuffer. When AHC/netty writes the ByteBuffer to the output 
> channel, the OS would copy data from the memory region to the network 
> socket. 
>
> Is this understanding correct? 
>
> Based on these assumptions, 1) should be _a bit_ more efficient than 2), 
> but not significantly. With 1) my concern is that it's not possible to 
> unmap the memory mapped file [2] and I have less control over native 
> memory usage. Therefore my preference currently is 2), using pooled 
> direct ByteBuffers. 
>
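
On the native memory point: with pooled direct buffers the upper bound is 
simply the number of buffers times their capacity. A very small sketch of 
such a pool follows; DirectBufferPool is a hypothetical name, and a real 
pool (netty ships its own allocator) would likely look different.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical fixed-size pool, for illustration only.
final class DirectBufferPool {
    private final BlockingQueue<ByteBuffer> free;

    /** Allocates all buffers up front: native memory use is buffers * capacity bytes. */
    DirectBufferPool(int buffers, int capacity) {
        free = new ArrayBlockingQueue<>(buffers);
        for (int i = 0; i < buffers; i++) {
            free.add(ByteBuffer.allocateDirect(capacity));
        }
    }

    /** Blocks until a buffer is free, which also throttles concurrent uploads. */
    ByteBuffer acquire() throws InterruptedException {
        ByteBuffer b = free.take();
        b.clear();
        return b;
    }

    /** Returns a buffer to the pool; throws IllegalStateException on a surplus release. */
    void release(ByteBuffer b) {
        free.add(b);
    }

    /** Number of buffers currently available. */
    int available() {
        return free.size();
    }
}
```

A buffer must only be released after the upload of its chunk has completed, 
otherwise the next chunk would overwrite data that is still being sent.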
> What do you think about this concern? 
>
> Is there an even better way than 1) and 2) to achieve what I want? 
>
> Thanks && cheers, 
> Martin 
>
>
> [1] 
>
> https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/RequestBuilderBase.java#L390
>  
> [2] http://bugs.java.com/view_bug.do?bug_id=4724038 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
