On 13/12/2013 16:47, Andreas Joachim Peters wrote:
> Hi Loic,
>
> I (re-)pushed/fixed wip-bpc-01 in my GIT repository.
>
> There still seems to be an issue, as https://github.com/ceph/ceph/pull/740
> shows as unmergeable (if such a word exists ;-).
>
> There is one commit of general interest to 'galois.c' which gives me a
> factor 1.5 speed improvement in the Jerasure code base (I exchanged the
> region XOR loop with vector operations where available).

Great.
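For illustration, the region XOR rewrite could look roughly like the sketch
below. This is not Andreas's actual commit, just the general technique with
SSE2 intrinsics; the function name and signature are made up, and a caller
would first check at runtime that SSE2 is available, the same way
arch/intel.c probes for the crc32c instruction.

    #include <emmintrin.h>  /* SSE2 intrinsics */

    /* XOR region r2 into r1, 16 bytes per instruction. Assumes both
     * pointers are 16-byte aligned and nbytes is a multiple of 16; a
     * real implementation would handle the unaligned head/tail with a
     * scalar loop. */
    static void region_xor_sse2(char *r1, const char *r2, int nbytes)
    {
        __m128i *d = (__m128i *) r1;
        const __m128i *s = (const __m128i *) r2;
        int i;

        for (i = 0; i < nbytes / 16; i++)
            d[i] = _mm_xor_si128(d[i], _mm_load_si128(&s[i]));
    }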
> I have also replaced the parity implementation to use SSE registers (via
> assembler), as seen in snapraid, which gives a factor 2.5 for the BPC
> part ...

Great.

> I needed to add a test for this in arch/intel.c, like the one for the
> crc32c register.

Ok.

It also occurs to me that the benchmark should show the erasure code space
overhead, i.e. the default 10 + 4 means +40%, and with BPC using 5 chunks at
a time it would be +60%.

Cheers

> Cheers Andreas.
>
> ________________________________________
> From: Mark Nelson [[email protected]]
> Sent: 11 December 2013 14:00
> To: Loic Dachary
> Cc: Andreas Joachim Peters; [email protected]
> Subject: Re: CEPH Erasure Encoding + OSD Scalability
>
> On 12/11/2013 06:28 AM, Loic Dachary wrote:
>>
>> On 11/12/2013 10:49, Andreas Joachim Peters wrote:
>>> Hi Loic,
>>>
>>> I am a little confused about which kind of tool you actually want: a
>>> simple benchmark to check for degradation, or a full profiling tool?
>>
>> I was not sure, hence the confusion.
>>
>>> Most of the external tools have the problem that you measure the whole
>>> thing, including buffer allocation and initialization. We probably
>>> don't want to measure how long it takes to allocate memory and write
>>> random numbers into it.
>>>
>>> I would just do:
>>>
>>> <prepare memory>
>>> <take CPU/realtime>
>>> <run algorithm>
>>> <take CPU/realtime>
>>> <print result>
>>
>> Ok, I'll do that.
>>
>> I'm glad I learnt about the other tools in the process, even if only to
>> conclude that they are not needed.
>
> Certainly things like perf are useful for profiling but may be overkill
> in the general case, depending on what you are trying to do. Collectl is
> pretty low overhead, though, if you are just looking for per-process CPU
> utilization stats.
>
>> Cheers
>>
>>> Now one can also run the perf-stat tool after <prepare memory>,
>>> starting it from within the test program and pointing it at the PID
>>> running <run algorithm>, so the benchmark would be:
>>>
>>> <prepare memory>
>>> <take CPU/realtime>
>>> <fork => "perf stat -p <mypid>">
>>> <run algorithm n times>
>>> <take CPU/realtime>
>>> <SIGINT to fork>
>>> <print results>
>>>
>>> As an extension, one could also run <run algorithm> with <n> threads in
>>> a thread pool.
>>>
>>> Cheers Andreas.
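For what it's worth, the measurement skeleton quoted above maps almost
directly onto clock_gettime(2): CLOCK_MONOTONIC for realtime and
CLOCK_PROCESS_CPUTIME_ID for CPU time. A minimal sketch, where encode() is
only a placeholder for the erasure code call, not an actual plugin entry
point (older glibc needs -lrt):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec r0, r1, c0, c1;

        /* <prepare memory>: allocate and fill buffers, unmeasured */

        clock_gettime(CLOCK_MONOTONIC, &r0);
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);

        /* <run algorithm>: e.g. encode() on the prepared buffers */

        clock_gettime(CLOCK_MONOTONIC, &r1);
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);

        printf("real %.6f s, cpu %.6f s\n",
               (r1.tv_sec - r0.tv_sec) + (r1.tv_nsec - r0.tv_nsec) / 1e9,
               (c1.tv_sec - c0.tv_sec) + (c1.tv_nsec - c0.tv_nsec) / 1e9);
        return 0;
    }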
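And a sketch of the perf-stat variant Andreas describes: fork a child that
execs "perf stat -p <mypid>", run the algorithm, then SIGINT the child so
perf prints its counters. Error handling is pared down, and the sleep is a
crude way to let perf attach before the measured loop starts.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        char pid[16];
        pid_t perf;

        /* <prepare memory> */

        snprintf(pid, sizeof(pid), "%d", (int) getpid());
        perf = fork();
        if (perf == 0) {
            /* child: attach perf stat to the parent */
            execlp("perf", "perf", "stat", "-p", pid, (char *) NULL);
            _exit(1);  /* exec failed */
        }
        sleep(1);  /* give perf time to attach */

        /* <run algorithm n times> */

        kill(perf, SIGINT);  /* perf prints its counters on SIGINT */
        waitpid(perf, NULL, 0);
        return 0;
    }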
--
Loïc Dachary, Artisan Logiciel Libre
