Nice work! Regarding the performance chart and testing so far: it's good to know that the CPU overhead is well-bounded, and these small tests were a reasonable way to verify that everything works. However, I wouldn't spend much (or any) time on this type of testing going forward, since these microbenchmarks only show cached performance -- the compressed numbers will basically always look like a net loss here (albeit a small one, which is good). The real numbers of interest are going to come from uncached benchmarks, i.e. benchmarks that cause a lot of real disk I/O. As the code stabilizes I would move on to things like fsstress, blogbench, bonnie, etc.
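For concreteness, the shape of that dd comparison might look something like the sketch below. In a real run the output file must live on the hammer2 mount under test and be much bigger than RAM; the temp file and the way compression is toggled are placeholders here, not real hammer2 knobs.

```shell
# Sketch of the suggested dd comparison. Run it once with
# zero-compression enabled and once with it disabled, then compare the
# throughput dd reports on its final status line.
OUT=$(mktemp)            # placeholder -- use a file on the hammer2 mount

dd if=/dev/zero of="$OUT" bs=64k count=5000 2>&1 | tail -1
sync                     # make sure the data actually hits the disk

rm -f "$OUT"
```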
If the code is stable enough, I would be interested to hear what the performance delta is between a pair of dd if=/dev/zero bs=64k count=5000 runs (or similar, as long as it's much bigger than RAM) with zero-compression on vs. off. In theory it should look similar to the delta between cached I/O and uncached I/O.

Sam

On Sun, Aug 4, 2013 at 1:55 PM, Daniel Flores <[email protected]> wrote:
> Hello everyone,
> here is my report for week 7.
>
> This week I had to create a new VM for DragonFly. The new VM has
> different settings and works faster than the previous one. Since all my
> work and tests will be done on the new VM, it won't be possible to
> directly compare new results with the results obtained in earlier tests
> on the old VM.
>
> As for the work done this week: the code was cleaned up significantly
> and optimized a bit too. This mostly affected the write path, since most
> of the new code was there. More specifically, the write path now looks
> like this:
>
> hammer2_write_file() contains all the code that is shared among the 3
> possible write-path options -- no compression, zero-checking, and LZ4
> compression. At the point where the paths start to differ depending on
> the selected option, it determines the option and calls one of 3
> functions: hammer2_compress_and_write() (LZ4 compression),
> hammer2_zero_check_and_write() (zero-checking), or hammer2_just_write()
> (no compression or zero-checking). Those functions do everything
> necessary to finish the write path.
>
> hammer2_just_write() mostly contains the code that was previously at the
> end of hammer2_write_file().
>
> hammer2_zero_check_and_write() is a very simple function that checks
> whether the block to be written contains only zeros, using a function
> called not_zero_filled_block(), and, if necessary, calls another
> function, zero_check(), that deals with the zero-filled block.
> If the block is not zero-filled, the function calls
> hammer2_just_write().
>
> hammer2_compress_and_write() is the most complex of the three: it
> performs the compression and then writes the block -- the compressed
> version if the compression was successful, the original version if it
> wasn't. It also uses not_zero_filled_block() and zero_check() for the
> zero-filled block case.
>
> There are also small improvements, such as using objcache_create()
> instead of objcache_create_simple() now.
>
> What I'll do now is test the code exhaustively to ensure that it is
> stable. Right now it is not, because we still have a bug that causes
> file corruption on read and a system crash under certain circumstances.
> I'll be working on fixing that next week. There are also a couple of
> write-path enhancements I'll be working on, such as detecting
> incompressible files and not trying to compress them. I also expect
> that some other bugs will probably be found in the process of testing.
>
> Now a bit on tests... Earlier this week I was asked to test the
> performance on small files. The testing methodology was exactly the
> same as the one I employed in the tests from last week's report. I used
> 5 files in total:
>
> 1 .jpg (incompressible) -- roughly 62KB in size.
> 1 small log file (perfectly compressible) -- 62KB in size.
> 1 .png (incompressible) -- roughly 2KB in size.
> 1 very small log file (perfectly compressible) -- 2KB in size.
> 1 even smaller log file -- 512B in size. I didn't use an incompressible
> file here, because all files of that size or smaller are embedded
> directly into the inode, so their path is the same.
>
> For the group test, the same files were copied 20 times per test.
>
> The results are summarized in this table [1].
>
> Basically, it looks like for such small files there is no detectable
> difference in performance.
> It should be noted that the average seek time on modern hard drives is
> about 0.009 s, so at this scale other factors matter more for
> performance than the path used. It should also be noted that the write
> path with compression currently tries to compress the whole logical
> block even if the file is smaller than the block, but this doesn't seem
> to affect performance at the scale of a single file.
>
> On the other hand, when the total size is large enough, like 2.5MB in
> this case, the difference starts to become perceptible.
>
> My code is available, as usual, in my leaf, branch “hammer2_LZ4” [2].
> I'll appreciate any comments, feedback and criticism.
>
>
> Daniel
>
> [1]
> http://leaf.dragonflybsd.org/~iostream/performance_table_small_files.html
> [2] git://leaf.dragonflybsd.org/~iostream/dragonfly.git
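The dispatch and zero-check described in the report can be illustrated with a small userland C sketch. The function names mirror the email, but the bodies and the option enum are my assumptions, not the actual hammer2 code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the zero-check: any nonzero byte means the block must be
 * written normally.  (The real hammer2 helper may scan word-at-a-time
 * or differ in other ways.) */
static bool
not_zero_filled_block(const char *buf, size_t bytes)
{
        for (size_t i = 0; i < bytes; ++i)
                if (buf[i] != 0)
                        return true;
        return false;
}

/* Dispatch shaped like the one described for hammer2_write_file():
 * pick one of the three paths based on the selected option.  Returns
 * the name of the path taken, just for illustration. */
enum comp_option { COMP_NONE, COMP_ZERO_CHECK, COMP_LZ4 };

static const char *
choose_write_path(enum comp_option opt, const char *buf, size_t bytes)
{
        switch (opt) {
        case COMP_LZ4:
                /* would try LZ4 first, falling back to the original
                 * block if compression doesn't shrink it */
                return "hammer2_compress_and_write";
        case COMP_ZERO_CHECK:
                return not_zero_filled_block(buf, bytes)
                    ? "hammer2_just_write"   /* real data: write it */
                    : "zero_check";          /* all zeros: special-case */
        default:
                return "hammer2_just_write";
        }
}
```

The point of the zero-check path is that a block of all zeros never needs to reach hammer2_just_write() at all, which is why the report treats it as a separate, cheaper option than full LZ4 compression.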
