If you want 4k stripe_size, you have to configure the cauchy plugin with w=8 packetsize=128 for a k=4 configuration.
For w=(multiple of 8) we could probably skip the (*sizeof(int)) and get the chunksize factor 4 down ... Loic we should check if this is ok with the Jerasure implementation ....

I wonder if we should have 'packetsize' as a plugin parameter or we should just adjust the packetsize based on the desired chunk_size to get it close.

Cheers Andreas.

________________________________________
From: Samuel Just [[email protected]]
Sent: 02 February 2014 23:45
To: Andreas Joachim Peters
Cc: Loic Dachary; Ceph Development
Subject: Re: controlling erasure code chunk size

I assume we will use get_chunksize(desired_chunksize) * get_data_chunk_count() on the mon to define the stripe width (the size of the buffer which will be presented to the plugin for encoding) for the pool. At the moment, get_chunksize(4*(2<<10)) * get_data_chunk_count() = 393216 using the jerasure plugin where get_data_chunk_count() = 4. This seems a bit big?
-Sam

On Sun, Feb 2, 2014 at 8:18 AM, Andreas Joachim Peters <[email protected]> wrote:
> Hi Loic et.al.
>
> I think there is now some confusion about chunk_size, alignment, packetsize
> and the stripe_size to be used upstream.
>
> Algorithms with a bit-matrix require that the size per device is a multiple
> of (packetsize*w). Moreover the size per device and packetsize itself must be
> a multiple of sizeof(long/int). For other algorithms you can assume the same
> with packetsize=1.
>
> packetsize and w influence the performance and too small stripe_size on top
> will have negative performance effects due to the preparation of bufferlist,
> internal buffer checks and more loops to execute for the same amount of data.
> We can also do some measurement for this but the current benchmark would
> probably not reflect this, since it measures the algorithmic part not the
> bufferlist preparation part.
>
> If you want to define a stripe_size it has to be a multiple of the value
> returned by get_chunksize and possibly it is a large multiple but in total
> not larger than processor caches. The plugin can not define the stripe_size,
> it defines only the alignment to be used for stripe_size and stripe_size is
> defined outside the plugin which maybe complicates the understanding. We
> should carefully check once more the Jerasure alignment requirements and our
> current implementation.
>
> To get rid of the platform dependency we could put a generic alignment
> requirement that chunksize has to be also 64-byte aligned.
>
> Cheers Andreas.
>
> ________________________________________
> From: Loic Dachary [[email protected]]
> Sent: 02 February 2014 16:15
> To: Samuel Just
> Cc: Ceph Development; Andreas Joachim Peters
> Subject: controlling erasure code chunk size
>
> [cc' ceph-devel]
>
> Hi Sam,
>
> Here is how chunks are expected to be aligned:
>
> https://github.com/ceph/ceph/blob/4c4e1d0d470beba7690d1c0e39bfd1146a25f465/src/osd/ErasureCodePluginJerasure/ErasureCodeJerasure.cc#L365
>
> unsigned alignment = k*w*packetsize*sizeof(int);
> if ( ((w*packetsize*sizeof(int))%LARGEST_VECTOR_WORDSIZE) )
>   alignment = k*w*packetsize*LARGEST_VECTOR_WORDSIZE;
> return alignment;
>
> If you are going to encode small objects, it may very well lead to oversized
> chunks if packetsize is large. At the moment the default is 3072
>
> https://github.com/ceph/ceph/blob/4c4e1d0d470beba7690d1c0e39bfd1146a25f465/src/common/config_opts.h#L406
>
> A value I picked when experimenting with 1MB objects encoding (
> http://dachary.org/?p=2594 ).
>
> I'm not entirely sure why the alignment is calculated the way it is. Andreas
> certainly has a better understanding on this topic.
>
> Cheers
>
> --
> Loïc Dachary, Artisan Logiciel Libre
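
The alignment arithmetic discussed in this thread can be sketched as below. This is only a sketch under stated assumptions, not the Ceph implementation: it assumes sizeof(int) is 4 bytes and LARGEST_VECTOR_WORDSIZE is 16 bytes (neither value is given in the thread), and the names stripe_alignment, padded_chunk_size and the word_size parameter are hypothetical. padded_chunk_size further assumes that get_chunksize pads the requested size up to the alignment and divides by k, which is consistent with the 393216 figure Sam reports.

```cpp
#include <cstddef>

// ASSUMPTIONS (not stated in the thread): 4-byte int, 16-byte vector word.
constexpr std::size_t kSizeofInt = 4;
constexpr std::size_t kLargestVectorWordsize = 16;

// Mirrors the alignment formula quoted from ErasureCodeJerasure.cc.
// word_size is a hypothetical knob standing in for sizeof(int); passing 1
// models the suggestion to skip the sizeof(int) factor.
constexpr std::size_t stripe_alignment(std::size_t k, std::size_t w,
                                       std::size_t packetsize,
                                       std::size_t word_size = kSizeofInt) {
    return ((w * packetsize * word_size) % kLargestVectorWordsize) != 0
               ? k * w * packetsize * kLargestVectorWordsize
               : k * w * packetsize * word_size;
}

// Round size up to the next multiple of alignment.
constexpr std::size_t round_up(std::size_t size, std::size_t alignment) {
    return ((size + alignment - 1) / alignment) * alignment;
}

// Hypothetical stand-in for get_chunksize(): pad the requested size to the
// alignment, then split it evenly across the k data chunks.
constexpr std::size_t padded_chunk_size(std::size_t desired, std::size_t k,
                                        std::size_t w,
                                        std::size_t packetsize) {
    return round_up(desired, stripe_alignment(k, w, packetsize)) / k;
}
```

Under those assumptions, k=4, w=8, packetsize=3072 gives an alignment of 393216 bytes, and padded_chunk_size(4*(2<<10), 4, 8, 3072) * 4 gives the same value, matching Sam's number. With packetsize=128 the quoted formula gives 16384 bytes across the four data chunks, or 4096 bytes if the sizeof(int) factor is dropped (word_size = 1).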
