Martin, a minor note: if in future you decide to checksum the chunks before writing them out (chunk [data] + checksum [meta]), you will end up loading the contents via NFS reads anyway.
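A rough sketch of that effect with a mapped chunk, in Java. CRC32 is chosen only for illustration (the thread does not say which checksum would be used), and the names `ChunkChecksum` and `crcOfMappedChunk` are made up for this example. The point is that the checksum has to read every byte, so the pages of the mapping get faulted in at that step, before any upload happens.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical helper, not part of AHC or any library mentioned in the thread.
final class ChunkChecksum {

    /** Maps [offset, offset + length) read-only and returns the CRC32 of that range. */
    public static long crcOfMappedChunk(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Creating the mapping does not by itself read the chunk's bytes.
            ByteBuffer chunk = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            CRC32 crc = new CRC32();
            // update() reads every byte, so the whole chunk is loaded here.
            // duplicate() keeps chunk's own position untouched for a later upload.
            crc.update(chunk.duplicate());
            return crc.getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        // Small local temp file standing in for the NFS-backed input (assumption).
        Path tmp = Files.createTempFile("chunk", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.printf("crc32 of 5 bytes at offset 2: %08x%n", crcOfMappedChunk(tmp, 2, 5));
    }
}
```

With a direct ByteBuffer filled via FileChannel.read the data has already been copied in before the checksum runs, so in both cases the bytes are read from NFS before the upload starts.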
Chetan

On Sunday, September 10, 2017 at 6:08:48 AM UTC-7, Martin Grotzke wrote:
>
> Hi,
>
> TL;DR: my question covers the difference between MappedByteBuffer vs.
> direct ByteBuffers when uploading chunks of a file from NFS.
>
> Details: I want to upload file chunks to some cloud storage. Input files
> are several GB large (say s.th. between 1 and 100), accessed via NFS on
> 64bit Linux/CentOS. An input file has to be split into chunks of ~ 1 to
> 10 MB (a file has to be split by some index, i.e. I have a list of
> byte-ranges for a file).
>
> I'm planning to use async-http-client (AHC) to upload file chunks via
> `setBody(ByteBuffer)` [1].
>
> My two favourites for splitting the file into chunks (ByteBuffers) are
> 1) FileChannel.map -> MappedByteBuffer
> 2) FileChannel.read(ByteBuffer) -> a (pooled) direct ByteBuffer
>
> My understanding of 1) is, that the MappedByteBuffer would represent a
> segment of virtual memory, so that the OS would even not have to load
> the data from NFS (during mmap'ing, as long as the MappedByteBuffer is
> not read). When AHC/netty writes the buffer to the output (socket)
> channel, the OS/kernel loads data from NFS into the page cache and then
> writes these pages to the network socket (and to be honest, I have no
> clue how the NFS API works and how the kernel loads the file chunks).
>
> Is this understanding correct?
>
> My understanding of 2) is, that on FileChannel.read(ByteBuffer) the OS
> would read data from NFS and copy it into the memory region backing the
> direct ByteBuffer. When AHC/netty writes the ByteBuffer to the output
> channel, the OS would copy data from the memory region to the network
> socket.
>
> Is this understanding correct?
>
> Based on these assumptions, 1) should be _a bit_ more efficient than 2),
> but not significantly. With 1) my concern is that it's not possible to
> unmap the memory mapped file [2] and I have less control over native
> memory usage.
> Therefore my preference currently is 2), using pooled
> direct ByteBuffers.
>
> What do you think about this concern?
>
> Is there an even better way than 1) and 2) to achieve what I want?
>
> Thanks && cheers,
> Martin
>
>
> [1]
> https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/RequestBuilderBase.java#L390
> [2] http://bugs.java.com/view_bug.do?bug_id=4724038
>
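For reference, options 1) and 2) from the quoted question can be sketched roughly as below. This is only a sketch under assumptions: the names `ChunkReaders`, `mapChunk` and `readChunk` are invented for illustration, the "pool" is a single caller-supplied buffer, and the hand-off to AHC's `setBody(ByteBuffer)` is left out.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating the two chunking options from the question.
final class ChunkReaders {

    /** Option 1: map the byte range; pages are typically loaded only when the buffer is read. */
    public static ByteBuffer mapChunk(FileChannel ch, long offset, int length) throws IOException {
        return ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
    }

    /**
     * Option 2: copy the byte range into a caller-supplied (e.g. pooled) direct buffer.
     * The buffer's capacity must be at least `length`.
     */
    public static ByteBuffer readChunk(FileChannel ch, long offset, int length, ByteBuffer pooled)
            throws IOException {
        pooled.clear();
        pooled.limit(length);
        long pos = offset;
        // A single read may return fewer bytes than requested, so loop until full.
        while (pooled.hasRemaining()) {
            int n = ch.read(pooled, pos);
            if (n < 0) {
                throw new EOFException("byte range extends past end of file");
            }
            pos += n;
        }
        pooled.flip(); // make the chunk readable from position 0 up to length
        return pooled;
    }

    public static void main(String[] args) throws IOException {
        // Small local temp file standing in for the NFS-backed input (assumption).
        Path tmp = Files.createTempFile("chunks", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            ByteBuffer mapped = mapChunk(ch, 3, 4);
            ByteBuffer copied = readChunk(ch, 3, 4, ByteBuffer.allocateDirect(8));
            System.out.println("same chunk contents: " + mapped.equals(copied));
        }
    }
}
```

The pooled variant keeps native memory bounded by the pool size and lets a buffer be reused as soon as its upload completes, which is the control the question is asking about; the mapped variant leaves the release of the mapping to the garbage collector.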
