Thanks for your hints, Avi and Chetan! From your suggestions (Avi) I found some more reading material regarding mmap costs and followed approach 2 (avoiding mmap costs). This has been working nicely for some time now.
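For reference, the core of approach 2 boils down to something like the sketch below: a positional FileChannel.read into a direct ByteBuffer for one byte range. (This is my own minimal illustration, not the production code; in real code the buffer would come from a pool rather than being allocated per call.)

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Reads one byte-range chunk of a file into a direct ByteBuffer. */
public class ChunkReader {

    // For illustration only: a real implementation would take the buffer
    // from a pool (e.g. netty's pooled allocator) instead of allocating here.
    static ByteBuffer readChunk(Path file, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(length);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, pos); // positional read, channel position untouched
                if (n < 0) break;          // EOF before the requested range ended
                pos += n;
            }
        }
        buf.flip(); // ready to be passed to e.g. AHC's setBody(ByteBuffer)
        return buf;
    }
}
```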
Thanks again,
Martin

Chet L <[email protected]> wrote on Fri., 12 Jan. 2018, 21:47:

> Martin,
>
> minor note: in future, if you decide to checksum the chunks before
> writing them out (chunks[data] + cksum[meta]), then you will end up
> loading the NFS:Rd contents.
>
> Chetan
>
> On Sunday, September 10, 2017 at 6:08:48 AM UTC-7, Martin Grotzke wrote:
>>
>> Hi,
>>
>> TL;DR: my question covers the difference between MappedByteBuffer vs.
>> direct ByteBuffers when uploading chunks of a file from NFS.
>>
>> Details: I want to upload file chunks to some cloud storage. Input files
>> are several GB large (say something between 1 and 100), accessed via NFS
>> on 64-bit Linux/CentOS. An input file has to be split into chunks of
>> ~1 to 10 MB (a file has to be split by some index, i.e. I have a list of
>> byte ranges for a file).
>>
>> I'm planning to use async-http-client (AHC) to upload file chunks via
>> `setBody(ByteBuffer)` [1].
>>
>> My two favourites for splitting the file into chunks (ByteBuffers) are
>> 1) FileChannel.map -> MappedByteBuffer
>> 2) FileChannel.read(ByteBuffer) -> a (pooled) direct ByteBuffer
>>
>> My understanding of 1) is that the MappedByteBuffer represents a
>> segment of virtual memory, so the OS does not even have to load the
>> data from NFS during mmap'ing, as long as the MappedByteBuffer is not
>> read. When AHC/netty writes the buffer to the output (socket) channel,
>> the OS/kernel loads data from NFS into the page cache and then writes
>> these pages to the network socket (to be honest, I have no clue how the
>> NFS API works or how the kernel loads the file chunks).
>>
>> Is this understanding correct?
>>
>> My understanding of 2) is that on FileChannel.read(ByteBuffer) the OS
>> reads data from NFS and copies it into the memory region backing the
>> direct ByteBuffer. When AHC/netty writes the ByteBuffer to the output
>> channel, the OS copies data from that memory region to the network
>> socket.
>>
>> Is this understanding correct?
>>
>> Based on these assumptions, 1) should be _a bit_ more efficient than 2),
>> but not significantly. With 1) my concern is that it's not possible to
>> unmap the memory-mapped file [2] and I have less control over native
>> memory usage. Therefore my preference currently is 2), using pooled
>> direct ByteBuffers.
>>
>> What do you think about this concern?
>>
>> Is there an even better way than 1) and 2) to achieve what I want?
>>
>> Thanks && cheers,
>> Martin
>>
>> [1]
>> https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/RequestBuilderBase.java#L390
>> [2] http://bugs.java.com/view_bug.do?bug_id=4724038
>
> --
> You received this message because you are subscribed to the Google Groups
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
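For comparison, approach 1) from the question above can be sketched as follows: FileChannel.map produces a MappedByteBuffer for the byte range, and no file data is read until the buffer is actually touched. (Again a minimal illustration of my own, not the code under discussion.)

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Maps one byte-range chunk of a file; pages are faulted in lazily on access. */
public class ChunkMapper {

    static MappedByteBuffer mapChunk(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed. Note there
            // is no supported way to unmap it explicitly (JDK bug 4724038, [2]
            // above); the native memory is only released when the buffer is
            // garbage-collected.
            return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }
}
```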
