Re: enhancing perfcheck - Proof of concept & proposals

2014-10-30 Thread Matúš Kukan
Hi,

On Tue, 2014-10-28 at 07:51 +0100, Laurent Godard wrote:
> Hi Matus,
> 
> Thanks a lot for your detailed response
> 
> I'm currently on holiday with my family and will be back next week; I hope 
> then to be able to continue this work (but feel free to start it!)

enjoy :-)

> Regarding output results, the approach i took was to first gather all 
> the results in the csv file
> Then post-processing it in a spreadsheet (say, randomly, calc) and use 
> standard filters to isolate the tests

Ah, I see. Nice, I would not be able to come up with something like that :-)

> >> - set IS_PERFCHECK
> >
> > this would be set all the time (not needed)
> > make perfcheck would just run tests under callgrind, where it makes sense
> >
> 
> ok, but I still do not see how the beast would know whether callgrind is 
> running or not. How would startImplementation and endImplementation 
> behave if callgrind is not running (the case of normal tests)?

callgrind runs if you use 'make perfcheck' or set gb_CppunitTest_VALGRINDTOOL
explicitly.
If callgrind does not run, startImplementation should do nothing.

So, I've done
http://cgit.freedesktop.org/libreoffice/core/commit/?id=e4e7f9d88e05fa610a72245c40f4e47f85db61ff

Hopefully it will help us.
Example test: https://gerrit.libreoffice.org/#/c/11296/

Would be nice to rebase your patches and merge them.

Thanks,
Matus


___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: enhancing perfcheck - Proof of concept & proposals

2014-10-27 Thread Laurent Godard

Hi Matus,

Thanks a lot for your detailed response

I'm currently on holiday with my family and will be back next week; I hope 
then to be able to continue this work (but feel free to start it!)



> I've added it to loperf; it's now in
> http://dev-builds.libreoffice.org/callgrind_report/
>
> So, there is no need for the load test now, I think.

Great, thanks a lot.

> but I think it's a bit different with callgrind - you see everything there.
> And with 100 items, you should be already able to spot quadratic complexity.

OK, then I will reduce the size of the 'big file' and not instrument its loading.
I'll do it when I'm back and build a new one.


Regarding the output, the approach I took was to first gather all 
the results in the csv file, 
then post-process it in a spreadsheet (say, randomly, Calc) and use 
standard filters to isolate the tests.



> Instead something like
> commit date - ... -
>
>    ... numbers for different tests
>    ...

This will be restricted to 1024 columns (500 tests), but that may be enough.

> would be better I think - you can just compare
> number_1 and number_2 on the next row in the same column.
> Hopefully this makes sense.

Sure it makes sense :)

> The script would need to be clever enough to add new columns when needed 
> somehow.
> Or, another possibility is to have another script which would generate
> file with the second format mentioned here from the existing csv file.



Yes, I'll have a look at this. Then we could have both formats (in case 
it is needed) and this second script could also dump alerts.
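
As an illustration of what "dumping alerts" could mean, here is a minimal
Python sketch. It is only an assumption about how such a script might work:
the function name, the 10% threshold and the data are hypothetical and are
not taken from any existing script. It compares each commit with the previous
one and reports tests whose count grew by more than the threshold.

```python
def find_regressions(history, threshold=0.10):
    """Return (test, old_commit, new_commit, ratio) tuples for every test
    whose count rose by more than 'threshold' between consecutive commits.

    'history' is a list of (commit, {test: count}) pairs, oldest first.
    """
    alerts = []
    for (old_commit, old), (new_commit, new) in zip(history, history[1:]):
        for test, new_count in new.items():
            old_count = old.get(test)
            # Skip tests that did not exist (or counted zero) before.
            if old_count and new_count > old_count * (1 + threshold):
                alerts.append((test, old_commit, new_commit, new_count / old_count))
    return alerts


# Made-up history: commit ids and counts are placeholders.
history = [
    ("c1", {"Search value": 1000, "Search style": 50}),
    ("c2", {"Search value": 900, "Search style": 100}),
    ("c3", {"Search value": 905, "Search style": 100, "Load file": 5}),
]
for alert in find_regressions(history):
    print(alert)
```

A test that first appears in a later commit has nothing to compare against,
so it is skipped rather than reported.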



> > - set IS_PERFCHECK
>
> this would be set all the time (not needed)
> make perfcheck would just run tests under callgrind, where it makes sense

ok, but I still do not see how the beast would know whether callgrind is 
running or not. How would startImplementation and endImplementation 
behave if callgrind is not running (the case of normal tests)?

> I am happy to hack this in, or help you with implementation - as you choose.



As said before, I won't be able to work on this until the middle of next 
week, sorry.

So feel free to enhance this; when I am back, I'll pick up from whatever 
changes you may have implemented.


thanks a lot Matus

Laurent


Re: enhancing perfcheck - Proof of concept & proposals

2014-10-27 Thread Matúš Kukan
Hi,

On Wed, 2014-10-22 at 12:18 +0200, Laurent Godard wrote:
> I instrumented the big file load for testing purposes, but yes, in 
> general I'm also interested in perf checks of loading (and 
> even saving) such files

I've added it to loperf; it's now in
http://dev-builds.libreoffice.org/callgrind_report/

So, there is no need for the load test now, I think.
 
> I agree to keep the instrumentation as low as possible, but in my 
> experience, some perf problems start to appear exponentially with file 
> complexity (a lot of sheets/formulas/named ranges/cell notes)

heh, hopefully not exponential but 'just' quadratic or something like that :-)
Yes, to see problems, the data often have to be reasonably big,
but I think it's a bit different with callgrind - you see everything there.
And with 100 items, you should be already able to spot quadratic complexity.
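
To make the "spot quadratic complexity" point concrete, here is a small
Python sketch. It assumes the same operation is measured under callgrind for
two input sizes, and estimates the exponent k in cost ~ n^k as
log(c2/c1) / log(n2/n1). The function name and the counts are made up for
illustration.

```python
import math


def growth_exponent(n1, cost1, n2, cost2):
    """Estimate k in cost ~ n**k from two (size, cost) measurements."""
    return math.log(cost2 / cost1) / math.log(n2 / n1)


# Made-up instruction counts for 100 and 200 items.
k = growth_exponent(100, 1_000_000, 200, 4_000_000)
print(round(k, 2))
```

An estimate close to 1 suggests linear behaviour and one close to 2 suggests
quadratic behaviour; since callgrind counts are largely deterministic, small
inputs should be enough to see the difference.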

> > Well, this is good but it's hard to parse the results quickly.
> > Do you think we could have date/commit in one line with all numbers?
> > And descriptions somewhere at the top.
> > So that we could compare results in one column easily (and draw graphs..)
> > Something like
> > http://dev-builds.libreoffice.org/callgrind_report/history.fods
> >
> 
> this is what is intended to be done
> 
> the output is a tabulated separated csv file, with all the information 
> on a single line (and description at top)

yes, it is, but I meant something else - the current description is not that useful.
We have

lastCommit test_name filedatetime dump_comment count


... other tests
...


and it's hard to see how situation has improved between
commit and commit+1 for - test

Instead something like
commit date - ... - 

   ... numbers for different tests
   ...

would be better I think - you can just compare
number_1 and number_2 on the next row in the same column.
Hopefully this makes sense.
 
The problem is adding new tests - they would need new columns.
The script would need to be clever enough to add new columns when needed 
somehow.
Or, another possibility is to have another script which would generate
file with the second format mentioned here from the existing csv file.
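
The second option (a separate script that derives the wide format from the
existing csv) could look roughly like the Python sketch below. It is a sketch
under assumptions: the input is reduced to (commit, test, count) triples, the
names are hypothetical, and the data are made up. New tests get a new column
in order of first appearance, and older commits leave that column empty.

```python
import csv
import io


def pivot(rows):
    """Turn (commit, test, count) rows into one wide row per commit."""
    tests = []      # column order, by first appearance
    by_commit = {}  # commit -> {test: count}; dicts keep insertion order
    for commit, test, count in rows:
        if test not in tests:
            tests.append(test)
        by_commit.setdefault(commit, {})[test] = count
    header = ["commit"] + tests
    table = [
        [commit] + [str(counts.get(test, "")) for test in tests]
        for commit, counts in by_commit.items()
    ]
    return header, table


# Hypothetical long-format data; commit ids and counts are placeholders.
long_rows = [
    ("c1", "Search value", 1000),
    ("c1", "Search style", 50),
    ("c2", "Search value", 900),
    ("c2", "Search style", 55),
    ("c2", "Load file", 7000),
]
header, table = pivot(long_rows)
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(header)
writer.writerows(table)
print(out.getvalue(), end="")
```

With this layout, the numbers for one test sit in one column, so two
consecutive commits can be compared by looking at adjacent rows.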

> > Or - even better - we could just compile in the callgrind code all the time 
> > and decide when
> > running make, whether we want to run under valgrind --tool=callgrind or not 
> > (or both).
> > If that works. :-)
> > So, something like IS_PERFCHECK is always true, no duplication
> > and only decide whether to run under valgrind.
> >
> > Does that make sense?
> > What do you think?
> >
> 
> i like the approach as it will simplify the trickiest part (the nasty 
> include to avoid double linking problem)

indeed

> imho, it would be clearer to keep some 'make perfcheck' command but this 
> would only
> - set IS_PERFCHECK

this would be set all the time (not needed)
make perfcheck would just run tests under callgrind, where it makes sense

> feel free to give me some code pointers (remember, i'm only a poor 
> scripter, always-beginner in core stuff ;) )

Do you want to work on this?
It's just a matter of adding some include/test/callgrind.hxx
and using that instead of macros with content similar to yours

and some makefiles hacking I guess.
I am happy to hack this in, or help you with implementation - as you choose.

Best,
Matus




Re: enhancing perfcheck - Proof of concept & proposals

2014-10-22 Thread Laurent Godard

Hi Matus

First, thanks a lot for your answer


> Most of that is just loading the file - maybe we could use 'loperf' for
> testing import/export and do only the rest as perfchecks? What do you think?

I instrumented the big file load for testing purposes, but yes, in 
general I'm also interested in perf checks of loading (and even saving) 
such files.

I agree to keep the instrumentation as low as possible, but in my 
experience, some perf problems start to appear exponentially with file 
complexity (a lot of sheets/formulas/named ranges/cell notes).

So the first idea of this big file was to gather all the potential 
problems and instrument each case.




> > 2- exploiting results
> > -
>
> > $ cat perfcheckResult.csv
> > lastCommit                                test name          filedatetime         dump comment                     count
> > 741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:54:52  testSheetFindAll - Search value  11403647297
>
> Fun, for me Search value is more than 10x faster - Was there some fix recently?
> 10/22/2014 08:27:58 testSheetFindAll - Search value 766042247

I am working on a two-week-old branch.
It would be great if things have improved here ;-)



> Well, this is good but it's hard to parse the results quickly.
> Do you think we could have date/commit in one line with all numbers?
> And descriptions somewhere at the top.
> So that we could compare results in one column easily (and draw graphs..)
> Something like
> http://dev-builds.libreoffice.org/callgrind_report/history.fods

This is what is intended.

The output is a tab-separated csv file, with all the information 
on a single line (and the description at the top).

Maybe the email layout got mangled?
Anyway, I'll double check.




> > 3- reuse of existing tests for perfcheck
>
> So - now that I think about it.
> Maybe it would be better to stop duplicating makefiles too.

Yes, that would simplify the beast, provided we can start the 
instrumentation (and disable it when running normal tests).

> Or - even better - we could just compile in the callgrind code all the time
> and decide when running make, whether we want to run under
> valgrind --tool=callgrind or not (or both).
> If that works. :-)
> So, something like IS_PERFCHECK is always true, no duplication
> and only decide whether to run under valgrind.
>
> Does that make sense?
> What do you think?

I like the approach as it will simplify the trickiest part (the nasty 
include to avoid the double linking problem).

IMHO, it would be clearer to keep some 'make perfcheck' command, but it 
would only

- set IS_PERFCHECK
- start running callgrind
- launch all the tests with clear parts identified with the 
  start/endInstrumentation

Maybe I am missing something, as this is not far from the current state.

Feel free to give me some code pointers (remember, I'm only a poor 
scripter, always a beginner in core stuff ;) )


Thanks a lot again matus

Laurent


Re: enhancing perfcheck - Proof of concept & proposals

2014-10-22 Thread Matúš Kukan
Hi Laurent,

On Wed, 2014-10-15 at 14:26 +0200, Laurent Godard wrote:
> 
> I'll present in the coming lines the approach I've tested regarding make 
> perfcheck

That's great, thanks for working on this.

> 1- perfcheck tests not restricted at class level
> ---
> 
> The problem with working at class level is that some tests may be very long 
> and hide problems in the other tests performed. Time can be doubled without 
> anyone noticing it
> 
> The goal here is to be able to DUMP callgrind each time we want it
> 
> Instead of using
> CALLGRIND_DUMP_STATS
> 
> one can use
> CALLGRIND_DUMP_STATS_AT(message);

The problem was with processing the data but your python script looks good,
so, yes, it's much better to use CALLGRIND_DUMP_STATS_AT.

> a first approach of a "big" test, with 3 dumps is here
> https://gerrit.libreoffice.org/#/c/11949/

I will also put some comments there,
but in general, I think it's good to keep the tests as small as possible.
This one runs for 5 minutes for me, doing some 18bn cycles.
Most of that is just loading the file - maybe we could use 'loperf' for
testing import/export and do only the rest as perfchecks? What do you think?

If you have specific documents you would like to add for loperf, let me know.

> 2- exploiting results
> -
> 
> The dumps are stored in workdir/CppunitTest/
> 
> Each dump file starting with callgrind.out contains
> - the message of step 1
> - the total of cycles
> 
> here is a python script that walks through workdir/CppunitTest/ and 
> retrieves these 2 pieces of information for each dump file
> It also retrieves the current last commit

I like this approach :-).. a nice idea
We don't need to care about CALLGRIND_DUMP_STATS anymore,
which is really cool, thanks.

> it appends these results to a csv file
> TODO: use this csv file to monitor or detect problems
> 
> https://gerrit.libreoffice.org/#/c/11962/
> 
> output example
> 
> $ cat perfcheckResult.csv
> lastCommit                                test name          filedatetime         dump comment                     count
> 741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:54:52  testSheetFindAll - Search value  11403647297

Fun, for me Search value is more than 10x faster - Was there some fix recently?
10/22/2014 08:27:58 testSheetFindAll - Search value 766042247

> 741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:54:53  testSheetFindAll - Search style  767867
> 741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:52:06  Load Big File                    17078422628

Well, this is good but it's hard to parse the results quickly.
Do you think we could have date/commit in one line with all numbers?
And descriptions somewhere at the top.
So that we could compare results in one column easily (and draw graphs..)
Something like
http://dev-builds.libreoffice.org/callgrind_report/history.fods


> 3- reuse of existing tests for perfcheck
> 
> 
> i tried this approach because monitoring perfs would lead to write 
> duplicate tests

Yes, we really need to reuse them.

> the basic idea, in an existing test, would be to write something like
> 
>   startPerfInstrumentation();
>   uno::Reference< container::XIndexAccess > xIndex =
>       xSearchable->findAll(xSearchDescr);
>   endPerfInstrumentation("testSheetFindAll - Search value");
> 
> where startPerfInstrumentation and endPerfInstrumentation do nothing if 
> not in a perfcheck context

Yep
 
> see whole code example at
> https://gerrit.libreoffice.org/#/c/11982/
> 
> this context is set using
>
> $(eval $(call gb_CppunitTest_add_defs,sc_perf_searchobj,\
>     -DIS_PERFCHECK \
> ))
> 
> see then
> https://gerrit.libreoffice.org/#/c/11982/2/sc/qa/perf/perf_instrumentation.cxx,cm
> 
> the idea is to have 2 make files
> - one for subsequent test
> - one for perfcheck (that sets IS_PERFCHECK)
> 
> that point to the same test source but let perf_instrumentation.cxx be 
> rebuilt each time

So - now that I think about it.
Maybe it would be better to stop duplicating makefiles too.
We would use something like ENABLE_PERFTESTS (--enable-performance-testing)
- would be cool to create reasonable name for it -
and compile everything just once based on that.

Or - even better - we could just compile in the callgrind code all the time and 
decide when
running make, whether we want to run under valgrind --tool=callgrind or not (or 
both).
If that works. :-)
So, something like IS_PERFCHECK is always true, no duplication
and only decide whether to run under valgrind.

Does that make sense?
What do you think?

> ok, i stop here, ask me if something is not clear

:-) thanks a lot.

All the best,

Matus




enhancing perfcheck - Proof of concept & proposals

2014-10-16 Thread Laurent Godard

Hi all

First, I must warn that I have little experience in C++ development, so 
my wording may be strange and inaccurate. Feel free to ask for 
rewording if something is not clear.


Many thanks to all the devs who helped me on this (matus, mst)

I'll present in the coming lines the approach I've tested regarding make 
perfcheck


The initial work was pointed out to me by Matus.
The idea is to use valgrind while running tests.
My starting point was https://gerrit.libreoffice.org/#/c/11296

My goal was to enhance this, keeping in mind that make perfcheck is only 
launched on demand as it is a costly process.


Here are the points addressed; I present them in the order in which I 
developed them, which may not be the most appropriate one.


1- perfcheck tests not restricted at class level
2- exploiting results
3- reuse of existing tests for perfcheck

I'll point to the gerrit commits in each of these points.
They are first shots and may need some (a lot of) polish;
at least it builds and does what is expected.

Any comments welcome

My code is far from being perfect;
let me know if something needs to be done before it is pushed to master.

Thanks again to all of the devs who helped me

Laurent

==

1- perfcheck tests not restricted at class level
---

The problem with working at class level is that some tests may be very long 
and hide problems in the other tests performed. Time can be doubled without 
anyone noticing it.


The goal here is to be able to DUMP callgrind each time we want it

Instead of using
CALLGRIND_DUMP_STATS

one can use
CALLGRIND_DUMP_STATS_AT(message);

--> this will create a callgrind.out file at each call
--> then we can separate the different parts of a test
--> note the (message) that will be used in part 2

a first approach of a "big" test, with 3 dumps is here
https://gerrit.libreoffice.org/#/c/11949/

2- exploiting results
-

The dumps are stored in workdir/CppunitTest/

Each dump file starting with callgrind.out contains
- the message of step 1
- the total of cycles

Here is a python script that walks through workdir/CppunitTest/ and 
retrieves these 2 pieces of information for each dump file.

It also retrieves the current last commit.

It appends these results to a csv file.
TODO: use this csv file to monitor or detect problems

https://gerrit.libreoffice.org/#/c/11962/
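
For readers who do not open the Gerrit change, the collection step could be
sketched in Python roughly as follows. This is not the script from the change
above. It assumes that each dump carries the CALLGRIND_DUMP_STATS_AT message
on a "desc: Trigger: Client Request: ..." header line and the total on a
"totals:" line; the function names are hypothetical, the sample dump is
hand-written, and the filedatetime column is left out for brevity.

```python
import csv
import io
import re

# Assumed header lines in a callgrind dump (an assumption, see above):
#   desc: Trigger: Client Request: <message given to CALLGRIND_DUMP_STATS_AT>
#   totals: <event count>
TRIGGER_RE = re.compile(r"^desc: Trigger: Client Request: (.*)$", re.MULTILINE)
TOTALS_RE = re.compile(r"^totals: (\d+)", re.MULTILINE)


def parse_dump(text):
    """Return (message, total) for one dump, or None if either is missing."""
    trigger = TRIGGER_RE.search(text)
    totals = TOTALS_RE.search(text)
    if trigger is None or totals is None:
        return None
    return trigger.group(1).strip(), int(totals.group(1))


def append_rows(out, commit, test_name, dumps):
    """Write one tab-separated row per parsable dump to the file-like 'out'."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for text in dumps:
        parsed = parse_dump(text)
        if parsed is not None:
            message, total = parsed
            writer.writerow([commit, test_name, message, total])


# Hand-written sample dump; "741c661" is an abbreviated placeholder commit id.
sample = (
    "version: 1\n"
    "desc: Trigger: Client Request: testSheetFindAll - Search value\n"
    "events: Ir\n"
    "totals: 766042247\n"
)
buffer = io.StringIO()
append_rows(buffer, "741c661", "sc_perf_searchobj", [sample])
print(buffer.getvalue(), end="")
```

A complete script would read every file whose name starts with callgrind.out
under workdir/CppunitTest/ and take the commit id from git, instead of using
the in-memory sample shown here.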

output example

$ cat perfcheckResult.csv
lastCommit                                test name          filedatetime         dump comment                     count
741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:54:52  testSheetFindAll - Search value  11403647297
741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:54:53  testSheetFindAll - Search style  767867
741c661ece19ccb4e94bb20ceb75d89a29b1b2a8  sc_perf_searchobj  10/14/2014 09:52:06  Load Big File                    17078422628


3- reuse of existing tests for perfcheck


I tried this approach because monitoring performance would otherwise lead to 
writing duplicate tests.

The naive question was: why not reuse existing tests, which would start 
callgrind instrumentation only if we are in a perfcheck context?

It would then be transparent for subsequentcheck and would enhance perftest 
coverage.

As we saw in step 1, one can start callgrind instrumentation for only 
a few chosen lines.

the basic idea, in an existing test, would be to write something like

  startPerfInstrumentation();
  uno::Reference< container::XIndexAccess > xIndex =
      xSearchable->findAll(xSearchDescr);
  endPerfInstrumentation("testSheetFindAll - Search value");

where startPerfInstrumentation and endPerfInstrumentation do nothing if 
not in a perfcheck context


see whole code example at
https://gerrit.libreoffice.org/#/c/11982/

this context is set using

$(eval $(call gb_CppunitTest_add_defs,sc_perf_searchobj,\
    -DIS_PERFCHECK \
))

see then
https://gerrit.libreoffice.org/#/c/11982/2/sc/qa/perf/perf_instrumentation.cxx,cm

the idea is to have 2 make files
- one for subsequent test
- one for perfcheck (that sets IS_PERFCHECK)

that point to the same test source but let perf_instrumentation.cxx be 
rebuilt each time


Overall it seems to work, with some tricky parts
(thanks mst for your help)
https://gerrit.libreoffice.org/#/c/11982/2/sc/qa/perf/scperfsearch.cxx,cm
I don't know whether it can lead to problems.

ok, i stop here, ask me if something is not clear
