On 13/12/2013 16:47, Andreas Joachim Peters wrote:
> Hi Loic,
> 
> I (re-)pushed/fixed wip-bpc-01 in my GIT repository.
> 

There still seems to be an issue: https://github.com/ceph/ceph/pull/740 shows
as unmergeable (if such a word exists ;-).

> There is one commit of general interest to 'galois.c' which gives me a factor 
> 1.5 speed improvement in the Jerasure code base (I exchanged the region XOR 
> loop with vector operations where available).
> 

Great.
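
Just so I'm sure I understand the change: something along these lines, I
suppose (a minimal sketch with SSE2 intrinsics, not the actual commit; the
function name and signature are mine)?

  #include <emmintrin.h>  /* SSE2 intrinsics */

  /* Region XOR: dst ^= src over 'bytes' bytes, 16 bytes per step.
     Assumes 16-byte aligned buffers and bytes % 16 == 0. */
  static void region_xor_sse2(char *dst, const char *src, int bytes)
  {
      int i;
      for (i = 0; i < bytes; i += 16) {
          __m128i s = _mm_load_si128((const __m128i *)(src + i));
          __m128i d = _mm_load_si128((const __m128i *)(dst + i));
          _mm_store_si128((__m128i *)(dst + i), _mm_xor_si128(s, d));
      }
  }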

> I have also replaced the parity implementation to use SSE registers via 
> assembler (as seen in snapraid), which gives a factor 2.5 speed-up for the 
> BPC part ... 

Great.
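
If I read the snapraid trick correctly, it amounts to keeping the running
parity in an XMM register instead of going through memory on every XOR.
Roughly this (again a sketch with intrinsics rather than raw assembler; the
names are mine):

  #include <emmintrin.h>  /* SSE2 intrinsics */

  /* parity = buf[0] ^ buf[1] ^ ... ^ buf[k-1] over 'bytes' bytes.
     The accumulator 'p' never leaves an XMM register in the inner loop.
     Assumes 16-byte aligned buffers and bytes % 16 == 0. */
  static void parity_sse2(char *parity, char **buf, int k, int bytes)
  {
      int i, j;
      for (i = 0; i < bytes; i += 16) {
          __m128i p = _mm_load_si128((const __m128i *)(buf[0] + i));
          for (j = 1; j < k; j++)
              p = _mm_xor_si128(p,
                  _mm_load_si128((const __m128i *)(buf[j] + i)));
          _mm_store_si128((__m128i *)(parity + i), p);
      }
  }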

> I needed to add a test for this in arch/intel.c, like it was done for 
> crc32c.

Ok. 
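
For the archive, I assume the test is the usual cpuid probe, something like
this sketch (not the actual arch/intel.c code):

  #include <cpuid.h>  /* GCC/clang cpuid helper */

  /* Runtime SSE2 detection, in the spirit of the existing crc32c test. */
  static int cpu_has_sse2(void)
  {
      unsigned int eax, ebx, ecx, edx;
      if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
          return 0;
      return (edx >> 26) & 1;  /* CPUID leaf 1, EDX bit 26 = SSE2 */
  }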

It also occurs to me that the benchmark should show the erasure code overhead, 
i.e. the ratio of coding chunks to data chunks: 10 + 4 by default means +40%, 
and with BPC using 5 chunks at a time it would be +60%.
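
Spelling out the arithmetic (reading "5 chunks at a time" as BPC adding one
parity chunk per group of 5 data chunks):

  overhead  = coding chunks / data chunks
  default:         4 / 10     = +40%
  with BPC:   (4 + 10/5) / 10 = +60%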

Cheers

> Cheers Andreas.
> 
> ________________________________________
> From: Mark Nelson [[email protected]]
> Sent: 11 December 2013 14:00
> To: Loic Dachary
> Cc: Andreas Joachim Peters; [email protected]
> Subject: Re: CEPH Erasure Encoding + OSD Scalability
> 
> On 12/11/2013 06:28 AM, Loic Dachary wrote:
>>
>>
>> On 11/12/2013 10:49, Andreas Joachim Peters wrote:
>>> Hi Loic,
>>> I am a little bit confused about which kind of tool you actually want. Do 
>>> you want a simple benchmark to check for degradation, or a full profiler 
>>> tool?
>>>
>>
>> I was not sure, hence the confusion.
>>
>>> Most of the external tools have the problem that they measure the whole 
>>> thing, including buffer allocation and initialization. We probably don't 
>>> want to measure how long it takes to allocate memory and write random 
>>> numbers into it.
>>>
>>> I would just do:
>>>
>>> <prepare memory>
>>> <take CPU/realtime>
>>> <run algorithm>
>>> <take CPU/realtime>
>>> <print result>
>>>
>>
>> Ok, I'll do that.
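>>
>> Something like this, I suppose (a minimal sketch; encode() stands in
>> for whatever is being measured):
>>
>>   #include <stdio.h>
>>   #include <time.h>
>>
>>   int main(void)
>>   {
>>       struct timespec r0, r1, c0, c1;
>>       /* <prepare memory>: allocate and fill buffers here, untimed */
>>       clock_gettime(CLOCK_MONOTONIC, &r0);
>>       clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);
>>       /* <run algorithm>: e.g. encode(buffers); */
>>       clock_gettime(CLOCK_MONOTONIC, &r1);
>>       clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);
>>       printf("real %.6fs cpu %.6fs\n",
>>              (r1.tv_sec - r0.tv_sec) + (r1.tv_nsec - r0.tv_nsec) / 1e9,
>>              (c1.tv_sec - c0.tv_sec) + (c1.tv_nsec - c0.tv_nsec) / 1e9);
>>       return 0;
>>   }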
>>
>> I'm glad I learnt about the other tools in the process, even if only to 
>> conclude that they are not needed.
> 
> Certainly things like perf are useful for profiling but may be overkill
> in the general case depending on what you are trying to do.  Collectl is
> pretty low overhead though if you are just looking for per-process CPU
> utilization stats.
> 
>>
>> Cheers
>>
>>> Now one could also run the perf-stat tool after <prepare memory>, starting 
>>> it from within the test program and pointing it at the PID running <run 
>>> algorithm>, so the benchmark would be:
>>>
>>> <prepare memory>
>>> <take CPU/realtime>
>>> <fork "perf stat -p <mypid>">
>>> <run algorithm n times>
>>> <take CPU/realtime>
>>> <send SIGINT to the forked perf>
>>> <print results>
>>>
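>>> In C the fork/SIGINT dance would be roughly (a sketch, error handling
>>> omitted):
>>>
>>>   #include <signal.h>
>>>   #include <stdio.h>
>>>   #include <unistd.h>
>>>   #include <sys/types.h>
>>>   #include <sys/wait.h>
>>>
>>>   int main(void)
>>>   {
>>>       char pid[16];
>>>       snprintf(pid, sizeof(pid), "%d", (int)getpid());
>>>       pid_t perf = fork();
>>>       if (perf == 0) {
>>>           /* child: attach perf stat to the parent */
>>>           execlp("perf", "perf", "stat", "-p", pid, (char *)NULL);
>>>           _exit(1);
>>>       }
>>>       sleep(1);            /* give perf time to attach */
>>>       /* <run algorithm n times> */
>>>       kill(perf, SIGINT);  /* perf prints its counters on SIGINT */
>>>       waitpid(perf, NULL, 0);
>>>       return 0;
>>>   }
>>>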
>>> As an extension, one could also run <run algorithm> with <n> threads in a 
>>> thread pool.
>>>
>>> Cheers Andreas.
>>>
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
