I'll make a table instead...

    Page Size    Write Throughput (MB/s)
    64KB          1.61
    128KB         3.14
    256KB         5.69
    512KB         9.52
    1024KB       17.41
    2048KB       28.29
    4096KB       41.63
    8192KB       56.26
    16384KB      72.11
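[Editor's note: a self-contained version of the microbenchmark that produced numbers like these could look as follows. This is a sketch, not the author's exact program; the class name, temp-file location, bytes written per run, and the range of page sizes are assumptions, and absolute throughput will vary by machine and disk.]

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PageSizeBench {
    // Writes totalBytes to the file in pages of pageSize bytes, calling
    // force(false) after every page (mirroring the quoted snippet), and
    // returns the observed throughput in MB/s.
    static double measure(Path file, int pageSize, long totalBytes) throws IOException {
        ByteBuffer page = ByteBuffer.allocateDirect(pageSize);
        long numPages = totalBytes / pageSize;
        long start = System.nanoTime();
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (long i = 0; i < numPages; i++) {
                page.rewind();
                while (page.hasRemaining()) {
                    channel.write(page);
                }
                channel.force(false); // sync data only, not metadata
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (totalBytes / (1024.0 * 1024.0)) / seconds;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("page-bench", ".dat");
        // Sweep 64KB .. 16384KB, doubling each time, 64MB written per size.
        for (int kb = 64; kb <= 16384; kb *= 2) {
            double mbps = measure(file, kb * 1024, 64L * 1024 * 1024);
            System.out.printf("%dKB\t%.2f MB/s%n", kb, mbps);
        }
        Files.deleteIfExists(file);
    }
}
```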
On Fri, Mar 30, 2018 at 3:25 PM, abdullah alamoudi <[email protected]> wrote:
> Am I the only one who didn't get the image in the email?
>
> On Mar 30, 2018, at 3:22 PM, Chen Luo <[email protected]> wrote:
> >
> > An update on this issue. It seems this speed-up comes from simply
> > increasing the log page size (and I've submitted a patch:
> > https://asterix-gerrit.ics.uci.edu/#/c/2553/).
> >
> > I also wrote a simple program to test the write throughput w.r.t.
> > different page sizes:
> >
> >     for (int i = 0; i < numPages; i++) {
> >         byteBuffer.rewind();
> >         while (byteBuffer.hasRemaining()) {
> >             totalBytesWritten += channel.write(byteBuffer);
> >         }
> >         channel.force(false);
> >     }
> >
> > It also confirms that varying the page size can have a big impact on
> > disk throughput (even for sequential I/Os). The experiment result on
> > one of our sensorium nodes is as follows:
> >
> > On Tue, Mar 27, 2018 at 5:19 PM, Chen Luo <[email protected]> wrote:
> > Hi Devs,
> >
> > Recently I was doing ingestion experiments and found that our default
> > log buffer size (1MB = 8 pages * 128KB page size) is too small and
> > negatively impacts ingestion performance. The short conclusion is that
> > by simply increasing the log buffer size (e.g., to 32MB), I can improve
> > ingestion performance by 50% ~ 100% on a single-node sensorium machine,
> > as shown below.
> >
> > The detailed explanation of the log buffer size is as follows. Right
> > now we have a background LogFlusher thread which continuously forces
> > log records to disk. When the log buffer is full, writers are blocked
> > waiting for log buffer space. However, when setting the log buffer
> > size, we have to consider the LSM operations as well. The memory
> > component is first filled up with incoming records at a very high
> > speed, and is then flushed to disk at a relatively low speed.
> > If the log buffer size is small, ingestion is very likely to be
> > blocked by the LogFlusher while the memory component is filling up.
> > This blocking is wasted time, since quite often flush/merge is idle.
> > However, when the log buffer is relatively large, the LogFlusher can
> > catch up while ingestion is blocked by flush/merge, which is not
> > harmful since there are ongoing LSM I/O operations.
> >
> > I don't know how large the log buffer size should be right now (it
> > depends on various factors), but our default value of 1MB is very
> > likely too small and causes blocking during normal ingestion. Just
> > letting you know; be aware of this parameter when you measure
> > ingestion performance...
> >
> > Best regards,
> > Chen Luo
