Re: parallelizing crashtest runs (was: minutes of ESC call ...)

2014-11-03 Thread Wols Lists
On 03/11/14 00:28, Markus Mohrhard wrote:
 The new script should scale nearly perfectly. There are still a few
 enhancements on my list, so if anyone is interested in Python tasks
 please talk to me.

I could be completely off, but this makes me think of running an update
on Gentoo: make can restrict itself to x processes at a time (the usual
advice being number of processors plus one), or (and I don't know how
this is done) it can monitor the load and only fire off new processes
while the load is below a target level (again, I'd guess that should
default to the number of processors).

Don't know how practical it would be for someone to try and code that...
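
A minimal Python sketch of that throttling idea (the limits and the
run_throttled() helper are made up for illustration, and os.getloadavg()
is Unix-only):

import os
import subprocess
import time

MAX_JOBS = (os.cpu_count() or 1) + 1      # roughly "number of processors plus one"
MAX_LOAD = float(os.cpu_count() or 1)     # only start new jobs while load is below this

def run_throttled(commands):
    # commands is a list of argv lists, e.g. [["soffice", "--headless", ...], ...]
    running = []
    pending = list(commands)
    while pending or running:
        running = [p for p in running if p.poll() is None]   # drop finished jobs
        if pending and len(running) < MAX_JOBS and os.getloadavg()[0] < MAX_LOAD:
            running.append(subprocess.Popen(pending.pop(0)))
        else:
            time.sleep(1)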

Cheers,
Wol


Re: parallelizing crashtest runs (was: minutes of ESC call ...)

2014-11-02 Thread Markus Mohrhard
Hey,

On Fri, Oct 31, 2014 at 2:45 PM, Christian Lohmaier
lohma...@googlemail.com wrote:
 Hi Markus, *,

 On Fri, Oct 31, 2014 at 2:38 PM, Markus Mohrhard
 markus.mohrh...@googlemail.com wrote:

 The quick and ugly one is to partition the directories into 100-file
 directories. I have a script for that, as I did exactly that for the
 memcheck run on the 70-core Largo server. It is a quick and ugly
 implementation.
 The clean and much better solution is to move away from directory-based
 invocation and partition by files on the fly.

 Yeah, I also thought of keeping the per-directory/filetype processing,
 but instead running multiple dirs at once, or rather dividing the set
 of files of a given dir into as many chunks as there are workers.

 I have a
 proof-of-concept somewhere on my machine and will push a working
 version during the next few days.

 nice :-)



So a working version is currently running on the VM. The version in
the repo will be updated as soon as the script finishes without a
problem. It now parallelizes nearly perfectly, as it divides the work
into 100-file chunks and works through them. This means that after the
last update of the test files we have 641 jobs that are put into a
queue, and we process as many jobs in parallel as we want (5 on the VM
at the moment).
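
(A rough sketch of that chunk-and-queue scheme; test_files() is a
hypothetical placeholder, the authoritative version is the script in the
repo:)

import itertools
from multiprocessing import Pool

CHUNK_SIZE = 100    # files per job
NUM_WORKERS = 5     # parallel jobs on the VM

def make_jobs(all_files):
    # split the complete file list into jobs of at most CHUNK_SIZE files
    it = iter(all_files)
    while True:
        job = list(itertools.islice(it, CHUNK_SIZE))
        if not job:
            return
        yield job

def test_files(job):
    # placeholder: run one import test over this list of files
    pass

def run(all_files):
    with Pool(NUM_WORKERS) as pool:
        pool.map(test_files, make_jobs(all_files))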

Additionally, the updated version of the script no longer hard-codes a
mapping from the file extension to the component and instead queries
LibreOffice to see which component opened the file. That allows us to
remove quite a few mappings and will result in all file types being
imported. The old version only imported file types that were
registered.
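
A sketch of how that query might look over the UNO API - the service
names are the standard ones, but the helper itself is only an
illustration, not necessarily what the script does:

COMPONENTS = [
    ("com.sun.star.text.TextDocument", "writer"),
    ("com.sun.star.sheet.SpreadsheetDocument", "calc"),
    ("com.sun.star.presentation.PresentationDocument", "impress"),
    ("com.sun.star.drawing.DrawingDocument", "draw"),
]

def get_component(doc):
    # doc is the object returned by loadComponentFromURL(); ask it which
    # application service it implements instead of guessing from the file
    # extension (Impress is checked before Draw, since a presentation model
    # may also report the drawing service)
    for service, component in COMPONENTS:
        if doc.supportsService(service):
            return component
    return None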

The new script should scale nearly perfectly. There are still a few
enhancements on my list, so if anyone is interested in Python tasks
please talk to me.

Regards,
Markus


parallelizing crashtest runs (was: minutes of ESC call ...)

2014-10-31 Thread Christian Lohmaier
Hi *,

On Thu, Oct 30, 2014 at 5:39 PM, Michael Meeks
michael.me...@collabora.com wrote:

 * Crashtest futures / automated test scripts (Markus)
 + call on Tuesday; new testing hardware.
 + result - get a Manitu server & leave room in the budget for
   on-demand Amazon instances (with spot pricing) if there is
   special need at some point.
 [...]

When I played with the crashtest setup I noticed some limitations in
its current layout that prevent just using lots of cores/high
parallelism to get faster results.

The problem is that it is parallelized per directory, but the number
of files per directory is not evenly distributed at all. So when the
script happens to start the odt tests last, the whole set of odt files
will be tested in only one thread, leaving the other CPU cores idling
around with nothing to do.

I did add a sorting statement to the script, so it will start with the
directories with the most files [1], but even with that you run into
the problem that towards the end of the test run not all cores will be
used. As the AMD Opterons in the Manitu boxes are less capable per CPU,
this sets a limit on how much you can accelerate the run by just
assigning more cores to it.

Didn't look into the overall setup to know whether just segmenting the
large directories into smaller ones is easy to do or not (i.e. instead
of having one odt dir with 10500+ files, have 20 with ~500 each).

ciao
Christian

[1] I added a sorted() call that uses the number of files in the
directory as the sort key:

import os

def get_numfiles(directory):
    # number of entries in the directory, used as the sort key
    return len(os.listdir(directory))

def get_directories():
    d = '.'
    directories = [o for o in os.listdir(d) if os.path.isdir(os.path.join(d, o))]
    # largest directories first, so the long-running ones start early
    return sorted(directories, key=get_numfiles, reverse=True)


Re: parallelizing crashtest runs (was: minutes of ESC call ...)

2014-10-31 Thread Michael Meeks
Hi Christian,

On Fri, 2014-10-31 at 14:23 +0100, Christian Lohmaier wrote:
 When I played with the crashtest setup I noticed some limitations in
 its current layout that prevent just using lots of cores/high
 parallelism to get faster results.

Oh - these sound a bit silly =)

 The problem is that it is parallelized per directory, but the number
 of files per directory is not evenly distributed at all. So when the
 script happens to start the odt tests last, the whole set of odt files
 will be tested in only one thread, leaving the other CPU cores idling
 around with nothing to do.

Interesting; if we know how many cores we have, surely we can just get
each thread to do a 'readdir' and divide that into N chunks - and
tackle the n'th of those(?) Or is the reason we do it per directory to
make stitching together the reports simpler?
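
(Roughly something like the following hedged sketch; worker_index and
num_workers are just illustrative names, not anything from the script:)

import os

def files_for_worker(directory, worker_index, num_workers):
    # give the n'th worker every num_workers'th file, so each worker gets
    # roughly the same share of the directory without any coordination
    files = sorted(os.listdir(directory))
    return files[worker_index::num_workers]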

 Didn't look into the overall setup to know whether just segmenting the
 large directories into smaller ones is easy to do or not (i.e. instead
 of having one odt dir with 10500+ files, have 20 with ~500 each).

Presumably there is no real reason to do anything odd to the
file-system - we can partition the work in whatever way seems best(?),
with some better but still simple algorithm for partitioning /
reporting? But - honestly, it's no use asking me - this is Markus' baby
- I'm sure he has a plan =)

ATB,

Michael.

-- 
 michael.me...@collabora.com  , Pseudo Engineer, itinerant idiot



Re: parallelizing crashtest runs (was: minutes of ESC call ...)

2014-10-31 Thread Markus Mohrhard
Hey,

On Fri, Oct 31, 2014 at 2:23 PM, Christian Lohmaier
lohma...@googlemail.com wrote:
 Hi *,

 On Thu, Oct 30, 2014 at 5:39 PM, Michael Meeks
 michael.me...@collabora.com wrote:

 * Crashtest futures / automated test scripts (Markus)
 + call on Tuesday; new testing hardware.
 + result - get a Manitu server & leave room in the budget for
   on-demand Amazon instances (with spot pricing) if there is
   special need at some point.
 [...]

 When I played with the crashtest setup I noticed some limitations in
 its current layout that prevent just using lots of cores/high
 parallelism to get faster results.

 The problem is that it is parallelized per directory, but the number
 of files per directory is not evenly distributed at all. So when the
 script happens to start the odt tests last, the whole set of odt files
 will be tested in only one thread, leaving the other CPU cores idling
 around with nothing to do.

 I did add a sorting statement to the script, so it will start with the
 directories with the most files [1], but even with that you run into
 the problem that towards the end of the test run not all cores will be
 used. As the AMD Opterons in the Manitu boxes are less capable per CPU,
 this sets a limit on how much you can accelerate the run by just
 assigning more cores to it.

 Didn't look into the overall setup to know whether just segmenting the
 large directories into smaller ones is easy to do or not (i.e. instead
 of having one odt dir with 10500+ files, have 20 with ~500 each).

 ciao
 Christian

 [1] I added a sorted() call that uses the number of files in the
 directory as the sort key:

 def get_numfiles(directory):
     return len([f for f in os.listdir(directory)])

 def get_directories():
     d = '.'
     directories = [o for o in os.listdir(d) if os.path.isdir(os.path.join(d, o))]
     return sorted(directories, key=get_numfiles, reverse=True)


This is currently a known limitation, but there are two solutions to the problem:

The quick and ugly one is to partition the directories into 100-file
directories. I have a script for that, as I did exactly that for the
memcheck run on the 70-core Largo server. It is a quick and ugly
implementation.

The clean and much better solution is to move away from directory-based
invocation and partition by files on the fly. I have a proof-of-concept
somewhere on my machine and will push a working version during the next
few days. This would even save us about half a day on our current
setup, as ods and odt normally run last, for about half a day longer
than the rest of the script.
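
(For reference, the quick-and-ugly repartitioning could look roughly
like this; the naming of the new directories is made up:)

import os
import shutil

def split_directory(src, chunk_size=100):
    # move the files of src into src-000, src-001, ... with at most
    # chunk_size files each, so per-directory scheduling stays balanced
    files = sorted(f for f in os.listdir(src)
                   if os.path.isfile(os.path.join(src, f)))
    for i in range(0, len(files), chunk_size):
        dest = "%s-%03d" % (src, i // chunk_size)
        os.makedirs(dest, exist_ok=True)
        for name in files[i:i + chunk_size]:
            shutil.move(os.path.join(src, name), os.path.join(dest, name))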

With both solutions this scales perfectly. We have already tested the
approach on the Largo server, where I was able to keep a load of 70 for
exactly a week (with memcheck, but that only affects the overall runtime).

Regards,
Markus


Re: parallelizing crashtest runs (was: minutes of ESC call ...)

2014-10-31 Thread Christian Lohmaier
Hi Markus, *,

On Fri, Oct 31, 2014 at 2:38 PM, Markus Mohrhard
markus.mohrh...@googlemail.com wrote:

 The quick and ugly one is to partition the directories into 100-file
 directories. I have a script for that, as I did exactly that for the
 memcheck run on the 70-core Largo server. It is a quick and ugly
 implementation.
 The clean and much better solution is to move away from directory-based
 invocation and partition by files on the fly.

Yeah, I also thought of keeping the per-directory/filetype processing,
but instead running multiple dirs at once, or rather dividing the set
of files of a given dir into as many chunks as there are workers.

 I have a
 proof-of-concept somewhere on my machine and will push a working
 version during the next few days.

nice :-)

ciao
Christian