Yes, I figured we might as well match stripe_width to chunk_size *
get_data_chunk_count().
-Sam

On Mon, Feb 3, 2014 at 3:35 AM, Loic Dachary <[email protected]> wrote:
> Hi Sam,
>
> The argument to get_chunk_size is the stripe width. It is named object_size
> because the API knows nothing about stripes; striping is a concept for the
> caller to implement. If you have a desired chunk size in mind, you would:
>
>    object_size = desired_chunk_size * get_data_chunk_count()
>    actual_chunk_size = get_chunk_size(object_size)
>
> If you have a desired stripe width / object size in mind, you would:
>
>    object_size = desired_stripe_width
>    chunk_size = get_chunk_size(object_size)
>
> Following Andreas' suggestions, controlling the size of the actual chunk is a
> matter of tweaking the alignment constraints via the erasure code plugin
> parameters.
>
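A minimal sketch of the two calling patterns above. ToyCodec is a hypothetical stand-in, not the Ceph class: it assumes get_chunk_size pads object_size up to the plugin alignment and then divides by the number of data chunks, and that the alignment is a multiple of k.

```cpp
#include <cassert>

// Hypothetical stand-in for an erasure code plugin (not the Ceph class).
struct ToyCodec {
  unsigned k;          // number of data chunks
  unsigned alignment;  // assumed to be a non-zero multiple of k

  unsigned get_data_chunk_count() const { return k; }

  // Assumed behaviour: pad object_size up to the alignment, then split
  // the padded size evenly across the k data chunks.
  unsigned get_chunk_size(unsigned object_size) const {
    unsigned tail = object_size % alignment;
    unsigned padded = object_size + (tail ? alignment - tail : 0);
    assert(padded % k == 0);
    return padded / k;
  }
};

// The caller starts from a desired chunk size.
unsigned chunk_size_from_desired_chunk(const ToyCodec& c,
                                       unsigned desired_chunk_size) {
  unsigned object_size = desired_chunk_size * c.get_data_chunk_count();
  return c.get_chunk_size(object_size);
}

// The caller starts from a desired stripe width / object size.
unsigned chunk_size_from_stripe_width(const ToyCodec& c,
                                      unsigned desired_stripe_width) {
  unsigned object_size = desired_stripe_width;
  return c.get_chunk_size(object_size);
}
```

With k = 4 and an alignment of 64, a desired chunk size of 100 bytes gives an object_size of 400, which pads to 448 and yields 112-byte chunks, so the actual chunk size can be larger than the one asked for.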
> Cheers
>
> On 02/02/2014 23:45, Samuel Just wrote:
>> I assume we will use get_chunksize(desired_chunksize) *
>> get_data_chunk_count() on the mon to define the stripe width (the size
>> of the buffer which will be presented to the plugin for encoding) for
>> the pool.  At the moment, get_chunksize(4*(2<<10)) *
>> get_data_chunk_count() = 393216 using the jerasure plugin where
>> get_data_chunk_count() = 4.  This seems a bit big?
>> -Sam
>>
>> On Sun, Feb 2, 2014 at 8:18 AM, Andreas Joachim Peters
>> <[email protected]> wrote:
>>> Hi Loic et al.,
>>>
>>> I think there is now some confusion about chunk_size, alignment, packetsize 
>>> and the stripe_size to be used upstream.
>>>
>>> Algorithms with a bit-matrix require that the size per device is a multiple
>>> of (packetsize*w). Moreover, the size per device and packetsize itself must
>>> each be a multiple of sizeof(long/int). For other algorithms you can assume
>>> the same with packetsize=1.
>>>
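The constraints above can be sketched as two small predicates. The function names and the explicit word parameter (standing in for sizeof(long) or sizeof(int)) are hypothetical. The second function is one reading of "the same with packetsize=1": it keeps the multiple-of-w and multiple-of-word checks and drops the check on packetsize itself.

```cpp
#include <cstddef>

// Bit-matrix algorithms: the size per device must be a multiple of
// packetsize*w, and both the size per device and packetsize must be
// multiples of the machine word size.
bool bitmatrix_size_ok(std::size_t size_per_device, std::size_t packetsize,
                       std::size_t w, std::size_t word) {
  if (packetsize == 0 || w == 0 || word == 0) return false;
  return size_per_device % (packetsize * w) == 0 &&
         size_per_device % word == 0 &&
         packetsize % word == 0;
}

// Other algorithms, read as the same rule with packetsize = 1
// (an interpretation, not something the plugin states).
bool other_size_ok(std::size_t size_per_device, std::size_t w,
                   std::size_t word) {
  if (w == 0 || word == 0) return false;
  return size_per_device % w == 0 && size_per_device % word == 0;
}
```

For example, with packetsize = 3072, w = 8 and an 8-byte word, a per-device size of 98304 passes the bit-matrix check, while 98312 does not.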
>>> packetsize and w influence performance. On top of that, a stripe_size that
>>> is too small will hurt performance because of the bufferlist preparation,
>>> the internal buffer checks and the extra loop iterations needed for the
>>> same amount of data. We could also measure this, but the current benchmark
>>> would probably not reflect it, since it measures the algorithmic part and
>>> not the bufferlist preparation.
>>>
>>> If you want to define a stripe_size, it has to be a multiple of the value
>>> returned by get_chunksize, possibly a large multiple, but in total no
>>> larger than the processor caches. The plugin cannot define the stripe_size;
>>> it only defines the alignment to be used for it. stripe_size is defined
>>> outside the plugin, which may complicate the understanding. We should
>>> carefully check the Jerasure alignment requirements and our current
>>> implementation once more.
>>>
>>> To get rid of the platform dependency, we could add a generic alignment
>>> requirement that chunksize also has to be 64-byte aligned.
>>>
>>> Cheers Andreas.
>>>
>>>
>>>
>>>
>>> ________________________________________
>>> From: Loic Dachary [[email protected]]
>>> Sent: 02 February 2014 16:15
>>> To: Samuel Just
>>> Cc: Ceph Development; Andreas Joachim Peters
>>> Subject: controlling erasure code chunk size
>>>
>>> [cc'ing ceph-devel]
>>>
>>> Hi Sam,
>>>
>>> Here is how chunks are expected to be aligned:
>>>
>>> https://github.com/ceph/ceph/blob/4c4e1d0d470beba7690d1c0e39bfd1146a25f465/src/osd/ErasureCodePluginJerasure/ErasureCodeJerasure.cc#L365
>>>
>>>   unsigned alignment = k*w*packetsize*sizeof(int);
>>>   if ( ((w*packetsize*sizeof(int))%LARGEST_VECTOR_WORDSIZE) )
>>>     alignment = k*w*packetsize*LARGEST_VECTOR_WORDSIZE;
>>>   return alignment;
>>>
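A standalone sketch of the alignment computation quoted above, to make the numbers concrete. Two values are assumptions rather than something stated in this thread: LARGEST_VECTOR_WORDSIZE is taken to be 16, and w = 8 is used in the example. The example also assumes a 4-byte int.

```cpp
#include <cassert>

// Assumed value; the real constant is defined in the Ceph source.
const unsigned LARGEST_VECTOR_WORDSIZE = 16;

// Same arithmetic as the quoted snippet, with k, w and packetsize passed
// in as parameters instead of being read from the plugin object.
unsigned get_alignment(unsigned k, unsigned w, unsigned packetsize) {
  unsigned alignment = k * w * packetsize * sizeof(int);
  // If the per-chunk unit is not a multiple of the vector word size,
  // scale by the vector word size instead of sizeof(int).
  if ((w * packetsize * sizeof(int)) % LARGEST_VECTOR_WORDSIZE)
    alignment = k * w * packetsize * LARGEST_VECTOR_WORDSIZE;
  return alignment;
}
```

Under those assumptions, k = 4, w = 8 and packetsize = 3072 give an alignment of 4 * 8 * 3072 * 4 = 393216 bytes, which is the figure reported for the jerasure plugin elsewhere in this thread: any object smaller than that is padded up to it. Dropping packetsize to 1 with the same k and w gives an alignment of 128 bytes.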
>>> If you are going to encode small objects, it may very well lead to
>>> oversized chunks if packetsize is large. At the moment the default is 3072:
>>>
>>> https://github.com/ceph/ceph/blob/4c4e1d0d470beba7690d1c0e39bfd1146a25f465/src/common/config_opts.h#L406
>>>
>>> It is a value I picked when experimenting with encoding 1MB objects
>>> ( http://dachary.org/?p=2594 ).
>>>
>>> I'm not entirely sure why the alignment is calculated the way it is.
>>> Andreas certainly has a better understanding of this topic.
>>>
>>> Cheers
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to [email protected]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>