Re: Another lintian release for squeeze?

2010-03-20 Thread Raphael Geissert
Raphael Geissert wrote:
> I've been working on Lintian::Command::Simple but got
> stuck with the interface. I should probably push it somewhere and ask for
> comments.
>
> I've also done some work on making t/runtests run multiple jobs in
> parallel (using perl threads, actually). There's just one minor glitch I
> should be able to fix within a few minutes.
> The only downside is that the output is not clean, but unless I buffer it
> (which won't make it really show in what order stuff is being done)
> there's no other way around.
>

I'm attaching both changes. Comments? suggestions?

0007 includes the first set of changes of Lintian::Command::Simple. In the 
.t file I was trying to decide the best way to handle multiple jobs while 
still being able to recognise which one is reaped.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net
From 93630fcb67991bb2c68dc45706b080043298f680 Mon Sep 17 00:00:00 2001
From: Raphael Geissert <atom...@gmail.com>
Date: Sat, 20 Mar 2010 00:14:03 -0600
Subject: [PATCH] Run multiple tests from the testsuite in parallel

Experimental implementation using Perl threads.

Output is messy and the benefit is not _that_ great. Most of the tools
(debhelper, dpkg-*, etc) turn the speed completely CPU-bound.
---
 t/runtests |  204 ++--
 1 files changed, 156 insertions(+), 48 deletions(-)

diff --git a/t/runtests b/t/runtests
index 9f198e9..d29ae62 100755
--- a/t/runtests
+++ b/t/runtests
@@ -32,6 +32,8 @@ use warnings;
 use Data::Dumper;
 use Getopt::Long qw(GetOptions);
 use Text::Template;
+use threads 'exit' => 'threads_only';
+use threads::shared;
 
 BEGIN {
 my $LINTIAN_ROOT = $ENV{'LINTIAN_ROOT'};
@@ -68,13 +70,16 @@ our $STANDARDS_VERSION = '3.8.4';
 
 sub usage {
     print unquote(<<"END");
-:       Usage: $0 [-dkv] <testset-directory> <testing-directory> [<test>]
-:              $0 [-dkv] [-t <tag>] <testset-directory> <testing-directory>
+:       Usage: $0 [-dkv] [-j [<jobs>]] <testset-directory> <testing-directory> [<test>]
+:              $0 [-dkv] [-j [<jobs>]] [-t <tag>] <testset-directory> <testing-directory>
 :
-:         -d          Display additional debugging information
-:         -k          Do not stop after one failed test
-:         -t <tag>    Run only tests for or against <tag>
-:         -v          Be more verbose
+:         -d            Display additional debugging information
+:         -j [<jobs>]   Run up to <jobs> jobs in parallel. Defaults to two.
+:                       If -j is passed without specifying <jobs>, the number
+:                       of jobs started is cpu cores+1 if /proc/cpuinfo is readable.
+:         -k            Do not stop after one failed test
+:         -t <tag>      Run only tests for or against <tag>
+:         -v            Be more verbose
 :
 :   The optional 3rd parameter causes runtests to only run that particular
 :   test.
@@ -88,10 +93,12 @@ our $DEBUG = 0;
 our $VERBOSE = 0;
 our $RUNDIR;
 our $TESTSET;
+our $JOBS = -1;
 
 my ($run_all_tests, $tag);
 Getopt::Long::Configure('bundling');
 GetOptions('d|debug'      => \$DEBUG,
+	   'j|jobs:i'     => \$JOBS,
 	   'k|keep-going' => \$run_all_tests,
 	   't|tag=s'      => \$tag,
 	   'v|verbose'    => \$VERBOSE) or usage;
@@ -110,6 +117,31 @@ unless (-d $TESTSET) {
     fail("test set directory $TESTSET does not exist");
 }
 
+# Getopt::Long assigns 0 as default value if none was specified
+if ($JOBS eq 0 && -r '/proc/cpuinfo') {
+    open(CPU, '<', '/proc/cpuinfo')
+	or fail("failed to open /proc/cpuinfo: $!");
+    while (<CPU>) {
+	next unless m/^cpu cores\s*:\s*(\d+)/;
+	$JOBS += $1;
+    }
+    close(CPU);
+
+    print "Apparent number of cores: $JOBS\n" if $DEBUG;
+
+    # Running up to twice the number of cores usually gets the most out
+    # of the CPUs and disks but it might be too aggressive to be the
+    # default for -j. Only use cores+1 then.
+    $JOBS++;
+}
+
+# No decent number of jobs? set a default
+# Above $JOBS should be set to -1 so that this condition is always met,
+# therefore avoiding duplication.
+if ($JOBS le 0) {
+    $JOBS = 2;
+}
+
 # --- Display output immediately
 
 $| = 1;
@@ -124,9 +156,16 @@ my $status = 0;
 
 # If we don't run any tests, we'll want to warn that we couldn't find
 # anything.
-my $tests_run = 0;
+my $tests_run :shared = 0;
+
+# $JOBS is the limit, $jobs is how many there are left to be started
+my $jobs = $JOBS;
+
+# a stack with the created threads
+my @threads;
+
+my @tests :shared;
 
-my @tests;
 my $prev;
 
 # --- Run all test scripts
@@ -145,7 +184,7 @@ if ($singletest) {
 
 if (@tests) {
     print "Test scripts:\n";
-    if (system('prove', '-r', '-I', "$LINTIAN_ROOT/lib", @tests) != 0) {
+    if (system('prove', '-j', $JOBS, '-r', '-I', "$LINTIAN_ROOT/lib", @tests) != 0) {
 	exit 1 unless $run_all_tests;
 	$status = 1;
 }
@@ -178,14 +217,29 @@ if ($singletest) {
 }
 print "Found the following changes tests: @tests\n" if $DEBUG;
 print "Changes tests:\n" if @tests;
-for (@tests) {
-my $okay = test_changes($_);
-   
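
The job-count logic in the patch depends on how Getopt::Long treats an option with an optional integer value (`:i`): a bare -j stores 0, while an absent -j leaves the initial value untouched. A minimal sketch of that behaviour follows. The helper effective_jobs() and its $cores parameter are hypothetical, standing in for the value the patch reads from /proc/cpuinfo, and numeric comparisons are used here instead of the patch's `eq`/`le`.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: $cores stands in for the core count the patch
# derives from /proc/cpuinfo; @argv stands in for the command line.
sub effective_jobs {
    my ($cores, @argv) = @_;
    my $jobs = -1;                                   # sentinel: -j absent
    GetOptionsFromArray(\@argv, 'j|jobs:i' => \$jobs)
        or die "invalid options\n";
    $jobs = $cores + 1 if $jobs == 0 && $cores > 0;  # bare -j: cores+1
    $jobs = 2 if $jobs <= 0;                         # fallback default
    return $jobs;
}

# Show the result for three command lines on an assumed 4-core machine.
printf "%-8s => %d\n", $_->[0], effective_jobs(4, @{ $_->[1] })
    for (['(none)', []], ['-j', ['-j']], ['-j 3', ['-j', '3']]);
```

Starting from -1 is what lets a single "no decent number of jobs" check cover both the absent option and the case where the core count could not be determined.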

Re: Another lintian release for squeeze?

2010-03-20 Thread Russ Allbery
Raphael Geissert <geiss...@debian.org> writes:

> I'm attaching both changes. Comments? suggestions?

> 0007 includes the first set of changes of Lintian::Command::Simple. In
> the .t file I was trying to decide the best way to handle multiple jobs
> while still being able to recognise which one is reaped.

Is there any way that we can fix the output handling so that at least it
won't intersperse output from multiple threads?  Making failures basically
unreadable is unappealing, and I assume that's the possible result.  Can
we use some sort of locking method so that only one thread is printing
stuff to the terminal at a time and finishes dumping its stuff, including
its possible diff, before letting someone else go?

In parallel mode, we should stop printing partial status (building,
testing, OK) etc. and just print out the complete line to the point that
we got and then the failure results if any all at once.  That will work
better with the output handling.

Lintian::Command::Simple looks like a good idea to me, but please don't
call the system() function exec().  I will keep expecting it to be, well,
exec.  :)

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
To UNSUBSCRIBE, email to debian-lint-maint-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/8739zur7te@windlord.stanford.edu



Re: Another lintian release for squeeze?

2010-03-20 Thread Raphael Geissert
Russ Allbery wrote:

> Raphael Geissert writes:
>
>> I'm attaching both changes. Comments? suggestions?
>>
>> 0007 includes the first set of changes of Lintian::Command::Simple. In
>> the .t file I was trying to decide the best way to handle multiple jobs
>> while still being able to recognise which one is reaped.
>
> Is there any way that we can fix the output handling so that at least it
> won't intersperse output from multiple threads?  Making failures basically
> unreadable is unappealing, and I assume that's the possible result.  Can
> we use some sort of locking method so that only one thread is printing
> stuff to the terminal at a time and finishes dumping its stuff, including
> its possible diff, before letting someone else go?
>
> In parallel mode, we should stop printing partial status (building,
> testing, OK) etc. and just print out the complete line to the point that
> we got and then the failure results if any all at once.  That will work
> better with the output handling.
>
> Lintian::Command::Simple looks like a good idea to me, but please don't
> call the system() function exec().  I will keep expecting it to be, well,
> exec.  :)

Heh, yeah. Those were terribly-chosen names but I lacked imagination that 
day :)
What do you suggest to use as names instead of fork() and exec()? what about 
the interface to reap jobs?

Maybe wait(), when passed a hash ref, should return the value of the hash 
member that was reaped, when called in scalar context. In array context it 
should probably return the key, value pair.

It seems that the only way to achieve what I want requires wait() to:
a) call CORE::wait() to get the pid and $? of the reaped process.
b) call $cmd->pid() for every member of the hash it was passed to see which 
of the processes was the one that finished. Needs to be done this way 
because we could otherwise end up reaping more jobs, if waitpid($pid, 
WNOHANG) was used.
c) tell the $cmd object what the return status was. This requires a getter 
and a setter to be added to the OO interface. The former should probably 
refuse to set the return status if $self->wait() doesn't return -1.
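
Steps a) to c) could be sketched roughly as follows. This is only an illustration under stated assumptions: the My::Cmd package, its background(), pid() and status() methods, and the wait_any() function are hypothetical stand-ins, not the actual Lintian::Command::Simple interface.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for a command object (NOT the real
# Lintian::Command::Simple interface).
package My::Cmd;

sub new {
    my ($class) = @_;
    return bless({ pid => undef, status => undef }, $class);
}

# Start a command in the background and remember its pid.
sub background {
    my ($self, @cmd) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec(@cmd) or POSIX::_exit(127);   # child: reached only if exec fails
    }
    return $self->{pid} = $pid;
}

sub pid { return $_[0]{pid} }

# Combined getter/setter for the raw wait status ($?).
sub status {
    my ($self, @new) = @_;
    $self->{status} = $new[0] if @new;
    return $self->{status};
}

package main;

# Reap one child and map it back to the entry in the given hash ref.
# List context: (key, object). Scalar context: the object only.
sub wait_any {
    my ($jobs) = @_;
    my $pid = CORE::wait();                # (a) reap whichever child exits
    return if $pid == -1;                  # no children left
    my $raw = $?;
    for my $key (keys %$jobs) {            # (b) find the matching object
        my $cmd = $jobs->{$key};
        next unless defined $cmd->pid && $cmd->pid == $pid;
        $cmd->status($raw);                # (c) record its status
        return wantarray ? ($key, $cmd) : $cmd;
    }
    return;                                # reaped a child we do not track
}

my %jobs = (ok => My::Cmd->new, bad => My::Cmd->new);
$jobs{ok}->background($^X, '-e', 'exit 0');
$jobs{bad}->background($^X, '-e', 'exit 3');

my %exit;
while (my ($name, $cmd) = wait_any(\%jobs)) {
    $exit{$name} = $cmd->status >> 8;
    delete $jobs{$name};
    print "$name exited with $exit{$name}\n";
}
```

The order in which the two jobs are reported depends on which child exits first, so it can differ between runs.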

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net



-- 
Archive: http://lists.debian.org/ho3img$sk...@dough.gmane.org



Re: Another lintian release for squeeze?

2010-03-20 Thread Russ Allbery
Raphael Geissert <geiss...@debian.org> writes:

> Heh, yeah. Those were terribly-chosen names but I lacked imagination
> that day :)  What do you suggest to use as names instead of fork() and
> exec()?

background() and run() maybe?

> what about the interface to reap jobs?

wait() seems fine there.  It's doing basically the same thing as
CORE::wait().

> Maybe wait(), when passed a hash ref, should return the value of the
> hash member that was reaped, when called in scalar context. In array
> context it should probably return the key, value pair.

Seems reasonable to me.

> It seems that the only way to achieve what I want requires wait() to:
> a) call CORE::wait() to get the pid and $? of the reaped process.
> b) call $cmd->pid() for every member of the hash it was passed to see which 
> of the processes was the one that finished. Needs to be done this way 
> because we could otherwise end up reaping more jobs, if waitpid($pid, 
> WNOHANG) was used.
> c) tell the $cmd object what the return status was. This requires a getter 
> and a setter to be added to the OO interface. The former should probably 
> refuse to set the return status if $self->wait() doesn't return -1.

Yup, that sounds right.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/877hp6pqcy@windlord.stanford.edu



Re: Another lintian release for squeeze?

2010-03-20 Thread Russ Allbery
Raphael Geissert <geiss...@debian.org> writes:
> Russ Allbery wrote:

>> Is there any way that we can fix the output handling so that at least
>> it won't intersperse output from multiple threads?  Making failures
>> basically unreadable is unappealing, and I assume that's the possible
>> result.  Can we use some sort of locking method so that only one thread
>> is printing stuff to the terminal at a time and finishes dumping its
>> stuff, including its possible diff, before letting someone else go?

> I don't think it's possible to lock the file descriptors.

Yeah, but you don't need to.  You can use a separate variable as mutex
lock.
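
A minimal sketch of that approach, assuming a threads-enabled perl. The run_test() worker and its messages are made up for illustration and are not taken from t/runtests. Each worker collects its output in a buffer and prints it in one go while holding a lock on a shared variable:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared variable used purely as a mutex for the terminal.
my $print_lock :shared;

# Hypothetical per-test worker: gather the whole status line (and any
# failure details) first, then print it while holding the lock.
sub run_test {
    my ($name) = @_;
    my @out = ("Running $name... ", "building... ", "testing... ", "ok.\n");
    {
        lock($print_lock);        # released when this block is left
        print join('', @out);
    }
    return $name;
}

my @threads = map { threads->create(\&run_test, $_) } qw(alpha beta gamma);
my @done    = map { $_->join } @threads;
```

This only serialises what the thread itself prints. Anything a subcommand writes straight to the terminal would first have to be captured into the buffer.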

> Since doing this is going to take some time, is there any objection for
> merging the initial -j option support to at least make prove run
> multiple jobs? (i.e. not merging the 'use threads' part.)

Oh, sure, I have no objections to that.  I don't really have any
objections to merging the support for parallel tests in general, just
don't want to make it the default until we figure out how to handle the
output.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/87k4t6migx@windlord.stanford.edu



Re: Another lintian release for squeeze?

2010-03-20 Thread Raphael Geissert
Russ Allbery wrote:

> Raphael Geissert writes:
>> Russ Allbery wrote:
>>
>>> Is there any way that we can fix the output handling so that at least
>>> it won't intersperse output from multiple threads?  Making failures
>>> basically unreadable is unappealing, and I assume that's the possible
>>> result.  Can we use some sort of locking method so that only one thread
>>> is printing stuff to the terminal at a time and finishes dumping its
>>> stuff, including its possible diff, before letting someone else go?
>>
>> I don't think it's possible to lock the file descriptors.
>
> Yeah, but you don't need to.  You can use a separate variable as mutex
> lock.

Sure. The problem is that the output of subcommands is not under the control 
of the thread and as such it can't lock in case of failure or unexpected 
writes.

>
>> Since doing this is going to take some time, is there any objection for
>> merging the initial -j option support to at least make prove run
>> multiple jobs? (i.e. not merging the 'use threads' part.)
>
> Oh, sure, I have no objections to that.  I don't really have any
> objections to merging the support for parallel tests in general, just
> don't want to make it the default until we figure out how to handle the
> output.

Since it already defaults to using two jobs, I'm going to hold the other 
changes for now.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net



-- 
Archive: http://lists.debian.org/ho479i$th...@dough.gmane.org



Another lintian release for squeeze?

2010-03-19 Thread Raphael Hertzog
Hello,

have you planned another lintian release for squeeze? I would like to see my
debian/source/format related checks (#566820) merged in the lintian
version that will be in squeeze.

Cheers,
-- 
Raphaël Hertzog -+- http://www.ouaza.com

Freexian : des développeurs Debian au service des entreprises
http://www.freexian.com


--
Archive: http://lists.debian.org/20100319080921.ga31...@rivendell



Re: Another lintian release for squeeze?

2010-03-19 Thread Russ Allbery
Raphael Hertzog <raph...@ouaza.com> writes:

> have you planned another lintian release for squeeze? I would like to
> see my debian/source/format related checks (#566820) merged in the
> lintian version that will be in squeeze.

I currently don't have any specific plans just because I haven't had any
time to look at Lintian beyond answer random e-mail in a while, but I plan
on sitting down this weekend and doing a major run through the BTS and try
to apply everything that's pending and fix as many minor bugs as I can.
If all goes according to plan, therefore, expect an upload with that and
many other things Sunday or so.

-- 
Russ Allbery (r...@debian.org)   http://www.eyrie.org/~eagle/


-- 
Archive: http://lists.debian.org/87fx3wjivy@windlord.stanford.edu



Re: Another lintian release for squeeze?

2010-03-19 Thread Raphael Geissert
Russ Allbery wrote:

> Raphael Hertzog writes:
>
>> have you planned another lintian release for squeeze? I would like to
>> see my debian/source/format related checks (#566820) merged in the
>> lintian version that will be in squeeze.
>
> I currently don't have any specific plans just because I haven't had any
> time to look at Lintian beyond answer random e-mail in a while, but I plan
> on sitting down this weekend and doing a major run through the BTS and try
> to apply everything that's pending and fix as many minor bugs as I can.
> If all goes according to plan, therefore, expect an upload with that and
> many other things Sunday or so.
>

Ah, good to know. I've been working on Lintian::Command::Simple but got 
stuck with the interface. I should probably push it somewhere and ask for 
comments.

I've also done some work on making t/runtests run multiple jobs in parallel 
(using perl threads, actually). There's just one minor glitch I should be 
able to fix within a few minutes.
The only downside is that the output is not clean, but unless I buffer it 
(which won't make it really show in what order stuff is being done) there's 
no other way around.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net



-- 
Archive: http://lists.debian.org/ho14h3$d1...@dough.gmane.org