Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-03-02 Thread Holger Levsen
Hi,

On Tuesday, 26 February 2013, Andreas Beckmann wrote:
 I'm primarily concerned about reimplementing a bad piece of code (the
 second half of dwke that creates the .tpl files) in order to build a new
 feature on top of it. The perfectionist in me would like to fix things
 properly first.

yes, but... the imperfect way was used quite successfully with piuparts for a 
long time ;-)
 
 I really do like the approach of reviewing patches before inclusion as
 more eyes may spot more problems

me too, absolutely.

Yet I can also only imagine Dave's frustration trying to get his work in and 
recognized; so far this hasn't happened for this feature, and for quite a long 
time. And I'd like Dave to stay motivated and contributing, and I like the new 
feature also.

So I'm a bit torn, (currently) leaning towards releasing 0.50 soon (now?) and 
then starting 0.51 with the merge of dave/sort-issues-by-rdep - or do you 
think that's premature?


cheers,
Holger


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#698526: Sort known issues by reverse dependency count

2013-02-26 Thread Andreas Beckmann
On 2013-02-25 20:24, Dave Steele wrote:
 On Mon, Feb 25, 2013 at 8:45 AM, Andreas Beckmann a...@debian.org wrote:
 In general I think we should allow the flexibility to have a per-section
 known-problems-directory setting, so each report Section should generate
 its own problem list and not get a global one passed

 OK, but out of scope of the patch set under consideration, which replaces
 the existing detect_well_known_errors with one that sorts by rdep.

In branch report-problem_integration you have
4db22544 piuparts-report - add known Problems class list to Section.

That should be dropped, instead create_problem_list() should be called
with a proper directory argument ...
(That's only for the reporting side of the problem, and the patch that
made me aware of the possibility to have different known_problem sets.)


 As you are saying, if this was designed from scratch for integration with
 piuparts-report, it would lean much more heavily on packagesdb. What is on
 the table is not an integrated solution. It is a replacement for the bash
 script, with issue rdep sorting.

 I understand that you don't like the way that I solved the known_problem
 .conf issues in the patches that come after this submission, and that you
 believe they aren't the right way to add issues to piuparts-report. I am OK

I'm primarily concerned about reimplementing a bad piece of code (the
second half of dwke that creates the .tpl files) in order to build a new
feature on top of it. The perfectionist in me would like to fix things
properly first.

I really do like the approach of reviewing patches before inclusion as
more eyes may spot more problems - and may result in a different
implementation (e.g. recently embedding get_config_value.inl - sourcing
read_config.sh).

Andreas





Bug#698526: Sort known issues by reverse dependency count

2013-02-26 Thread Dave Steele
On Tue, Feb 26, 2013 at 5:50 PM, Andreas Beckmann a...@debian.org wrote:

 In branch report-problem_integration you have
 4db22544 piuparts-report - add known Problems class list to Section.

 That should be dropped, ...

This is where we have been talking past each other for the last week.
That patch is not in sort-issues-by-rdep*. It is not part of the
submission under review.

I know that we are unlikely to agree that fast-report et al are a
valid next step to known_problem conf file integration, and that they
are unlikely to be accepted. If it would help focus the conversation,
I will delete those branches entirely.


 I'm primarily concerned about reimplementing a bad piece of code (the
 second half of dwke that creates the .tpl files) in order to build a new
 feature on top of it. The perfectionist in me would like to fix things
 properly first.


The perfectionist in me also led to this series of patches, which fix
things properly first. You and I are actually on the same sheet of
music when it comes to what this code should ultimately look like. The
only difference, and this is annoyingly minor, is the priority order,
and the resulting path to get there. You are focused on the tpl files
as the next big thing. I focused on linktarget_by_template. That's it.
Given that fact, it's going to be frustrating to toss the code.

The .tpl files come from an impedance mismatch between bash and
python. Consider the current submission of a bash-to-python conversion
a necessary first step on the path to get there.

 I really do like the approach of reviewing patches before inclusion as
 more eyes may spot more problems - and may result in a different
 implementation (e.g. recently embedding get_config_value.inl - sourcing
 read_config.sh).

I agree, and have given you visibility into what I'm working on. I'm
sure you've seen a couple of topics in github that we haven't
discussed yet. That's a good thing.

But, be aware that the current message I am getting is that it is a
bad idea to show you anything beyond small incremental submissions. It
would be immensely helpful to me if you could segment your feedback
based on submission status.





Bug#698526: Sort known issues by reverse dependency count

2013-02-25 Thread Andreas Beckmann
another random bit I just stumbled upon:

-  if (state == "failed-testing" and template[-9:] != "issue.tpl") \
-  or (state == "successfully-tested" and template[-9:] == "issue.tpl"):
+  if (state == "failed-testing" and problem.short_name[-5:] != "issue") \
+  or (state == "successfully-tested" and problem.short_name[-5:] == "issue"):

What's wrong with problem.short_name.endswith("issue") ?


maybe even worse:

  return( logpath[:-4] + KPR_EXT )
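
A minimal sketch of the alternatives to fixed-width slicing that the two remarks above point at. The helper names are hypothetical, and the value of KPR_EXT is an assumption made only for this illustration:

```python
import os.path

KPR_EXT = ".kpr"  # assumed value, for illustration only


def is_issue(short_name):
    """True if the problem name ends in 'issue', with no hardcoded slice length."""
    return short_name.endswith("issue")


def kpr_path(logpath):
    """Swap the log extension for KPR_EXT without assuming it is 4 characters long."""
    root, _ext = os.path.splitext(logpath)
    return root + KPR_EXT
```

Both helpers keep working if a suffix changes length, which is the failure mode of [-5:], [-9:] and [:-4].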



In general I think we should allow the flexibility to have a per-section
known-problems-directory setting, so each report Section should generate
its own problem list and not get a global one passed


I tried to create a reduced version of Dave's sort-issues-by-rdep branch
that only does the .tpl generation, as that is the part I want to look
at right now:

preview/dave-dwke-only-create-tpl

David Steele (9):
  01 Add skeleton for python replacement of detect-well-known-errors
  02 detect_well_known_errors.py - Clean obsolete kpr and bug files.
  03 detect_well_known_errors.py - Add class for handling known problems.
  05 detect_well_known_errors.py - Create missing kpr files.
  06 detect_well_known_errors.py - Create Failure Mgr class to hold kpr fails.
  07 detect_well_known_errors - Create html tpl files.
  16 detect_well_known_errors - Sort known errors/issues by rdep count.
  17 detect_well_known_errors - Display the reverse dependency count.
  20 detect_well_known_errors - Add PTS link to issue/error entries.

reordered, merged, dropped .kpr creation, cleanup of obsolete files, ...
but not tested at all

The primarily interesting one is 07 that should be integrated into
piuparts-report directly to avoid the .tpl intermediate step

The problems I see right now:

* many functions from piuparts-report are either copied
 (e.g. pts_subdir( source ))
* or reimplemented differently, e.g. the variable substitution
  in the templates. I don't know which variant is better, but 
  I don't really want *two* implementations of the same thing

The internal representation of a set of logs is very different
which makes integration into -report difficult

-report uses 
  logs_by_dir = {}
  for vdir in dirs:
      logs_by_dir[vdir] = find_files_with_suffix(vdir, ".log")


-report works with cwd=sectiondir that should make paths shorter and
less difficult

The assumption that there is only $pkgspec.log in (at most) one
subdir is not something I would rely on (although it usually holds)


BTS and PTS URLs should not be embedded in the templates, probably best
to have a function that generates a certain url for a package name
to allow for future extensions, e.g. Ubuntu support.
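
A rough sketch of such a URL helper, so that templates only name the kind of link they need. Everything here is an assumption for illustration: the table name, the function signature, the URL layouts, and the body of pts_subdir (only its name appears in this thread):

```python
def pts_subdir(source):
    """Return the subdirectory for a source package (assumed 'lib' prefix rule)."""
    if source.startswith("lib"):
        return source[:4]
    return source[:1]


# Hypothetical per-distribution URL templates; the layouts are assumptions.
URL_TEMPLATES = {
    "debian": {
        "pts": "https://packages.qa.debian.org/{subdir}/{source}.html",
        "bts": "https://bugs.debian.org/{source}",
    },
}


def package_url(kind, source, distro="debian"):
    """Build a tracker URL of the given kind for a source package name."""
    template = URL_TEMPLATES[distro][kind]
    return template.format(subdir=pts_subdir(source), source=source)
```

Supporting another distribution would then mean adding one entry to the table rather than editing every template.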


Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-25 Thread Dave Steele
On Mon, Feb 25, 2013 at 8:45 AM, Andreas Beckmann a...@debian.org wrote:


 In general I think we should allow the flexibility to have a per-section
 known-problems-directory setting, so each report Section should generate
 its own problem list and not get a global one passed


OK, but out of scope of the patch set under consideration, which replaces
the existing detect_well_known_errors with one that sorts by rdep.



 I tried to create a reduced version of Dave's sort-issues-by-rdep branch
 that only does the .tpl generation, as that is the part I want to look
 at right now:

 preview/dave-dwke-only-create-tpl

 David Steele (9):
   01 Add skeleton for python replacement of detect-well-known-errors
   02 detect_well_known_errors.py - Clean obsolete kpr and bug files.
   03 detect_well_known_errors.py - Add class for handling known problems.
   05 detect_well_known_errors.py - Create missing kpr files.
   06 detect_well_known_errors.py - Create Failure Mgr class to hold kpr fails.
   07 detect_well_known_errors - Create html tpl files.
   16 detect_well_known_errors - Sort known errors/issues by rdep count.
   17 detect_well_known_errors - Display the reverse dependency count.
   20 detect_well_known_errors - Add PTS link to issue/error entries.

 reordered, merged, dropped .kpr creation, cleanup of obsolete files, ...
 but not tested at all


Take a look at skip_kpr. It gives you your tpl-only capability with about a
dozen lines of code. This is part of the piuparts-report work I originally
submitted, which is out of scope for the patch set under consideration.



 The problems I see right now:

 * many functions from piuparts-report are either copied
  (e.g. pts_subdir( source ))
 * or reimplemented differently, e.g. the variable substitution
   in the templates. I don't know which variant is better, but
   I don't really want *two* implementations of the same thing


This is not a change from what it replaces. Elimination of the redundancies
can be added to the scope of a piuparts-report integration task.



 The internal representation of a set of logs is very different
 which makes integration into -report difficult


That depends on what you mean by integration. There is validity to the
claim that it has been integrated, in existing patches outside the current
scope.

As you are saying, if this was designed from scratch for integration with
piuparts-report, it would lean much more heavily on packagesdb. What is on
the table is not an integrated solution. It is a replacement for the bash
script, with issue rdep sorting.



 The assumption that there is only $pkgspec.log in (at most) one
 subdir is not something I would rely on (although it usually holds)


It should be a valid assumption. The only requirement along these lines
should be to avoid crashing in the presence of this error condition.




 BTS and PTS URLs should not be embedded in the templates, probably best
 to have a function that generates a certain url for a package name
 to allow for future extensions, e.g. Ubuntu support.


That is a change that is in scope with the future extensions.


I understand that you don't like the way that I solved the known_problem
.conf issues in the patches that come after this submission, and that you
believe they aren't the right way to add issues to piuparts-report. I am OK
with you taking whatever pieces of this you might feel to be useful and
crafting a more elegant integration. But I ask that you consider what's on
the table within the scope of the problem it solves. Please make your
changes for piuparts-report after this is in.


Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-23 Thread Holger Levsen
Hi Dave,

On Saturday, 23 February 2013, Dave Steele wrote:
 I've reworked based on Andreas' issues related to
 detect_well_known_errors and rdeps. 

thanks! (extra bonus points if you could tell me how many commits there are in 
each branch; due to the rebase it's rather easy for me to find out, but being 
told would be even better ;)

 Comments related to piupartslib
 and piuparts-reports I've deferred as currently out of scope. The
 problems and failures classes in the python script are available for
 future rework.

nice!

I've seen two typos:

a.) unkownsasfailures.sort - I believe you mean unknownasfailures.sort  :)

b.) Packages with failures not yet well known detected in $SECTION - this 
wording might even be from me, today I'd say: Packages with unknown failures 
detected in $SECTION

Regarding merging into develop: yes, I want to. But first I want to finish 
merging Andreas' current bits, then merge that develop into piatti (and run it 
there) and then merge these two branches of yours. 


cheers,
Holger


Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-02-23 Thread Dave Steele
On Sat, Feb 23, 2013 at 5:40 AM, Holger Levsen hol...@layer-acht.org wrote:

 extra bonus points if you could tell me how many commits there are in
 each branch; due to the rebase it's rather easy for me to find out, but being
 told would be even better ;)

As I have been keeping up with the changes in develop this week since
your rebase request, those numbers have changed a couple of times.

 Regarding merging into develop: yes, I want to. But first I want to finish
 merging Andreas' current bits, then merge that develop into piatti (and run
 it there) and then merge these two branches of yours.

Ok. It would have been easier for me if this had been established
before you asked for my rebase.





Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-02-23 Thread Holger Levsen
Hi,

On Saturday, 23 February 2013, Dave Steele wrote:
 Ok. It would have been easier for me if this had been established
 before you asked for my rebase.

I believe Andreas should be done (with his current batch) soon.


cheers,
Holger


Bug#698526: Sort known issues by reverse dependency count

2013-02-22 Thread Dave Steele
On Fri, Feb 22, 2013 at 6:43 AM, Andreas Beckmann a...@debian.org wrote:
 On 2013-02-22 01:58, Dave Steele wrote:
 Concerning what is currently in Holger's queue:



I've reworked based on Andreas' issues related to
detect_well_known_errors and rdeps. Comments related to piupartslib
and piuparts-reports I've deferred as currently out of scope. The
problems and failures classes in the python script are available for
future rework.





Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Andreas Beckmann
[ Hint: While replying to the BTS, delete the [Piuparts-devel] marker
from the subject as well as any duplicate bug numbers. ]

On 2013-02-21 03:09, Dave Steele wrote:
 On Wed, Feb 20, 2013 at 8:42 PM, Dave Steele dste...@gmail.com wrote:
 On Mon, Feb 18, 2013 at 5:44 AM, Holger Levsen hol...@layer-acht.org wrote:
 ...

 these are quite some different changes, can you please isolate the commits for
 Sort known issues by reverse dependency count and rebase them onto current
 develop?!

 The new serial branches sort-issues-by-rdep and
 sort-issues-by-rdep-fast are separated from the rest of the work, and
 rebased to develop.

Hi,

this work looks really promising and I'm curious to try it some day on
my instance.

But as I wrote before there is no need to reimplement the .tpl
generation in python. Instead these intermediate files should go away
and the html generation should be moved directly into piuparts-report.
There will be a package db available.
I think this requirement to generate .tpl externally dates back to the
time when all logfiles were grepped daily, i.e. before we remembered the
results in .kpr.

Even if .kpr generation can be sped up significantly, I don't think I
want to run this from inside piuparts-report. Just like piuparts-analyze
(that takes 30-60 minutes for my instance) this is something that will
continue to be run from the generate-piuparts-report driver script ...
and having it sped up by an order of magnitude will decrease my hesitation to run
it with --recheck-all.
Also if the .tpl files are gone, we can actually run piuparts-report
without running piuparts-analyze or detect_well_known_errors directly
before it.

And about speeding up the grepping - wouldn't it be even faster if we
can run multiple regexes at the same time on the input - either by
'ORing' them together or passing a list to re or ... then we would just
need to figure out which one has matched ... (No, I haven't tried
anything like this, but I'm considering testing this with the multiple
grep calls in detect_piuparts_issues.)
  grep -lE '(foo)|(bar)|(f[o0]{2}bar|baz)'
should be significantly faster than
  grep -l foo
  grep -l bar
  grep -lE 'f[o0]{2}bar|baz'
And there we only care about 'any match' disregarding which matched.
Or am I mistaken here?
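
The same "any match" idea, sketched with Python's re module instead of grep. The pattern list and function names are hypothetical, and whether the combined search is actually faster would need to be measured on real logs:

```python
import re

# Hypothetical pattern list, mirroring the grep example above.
PATTERNS = ["foo", "bar", "f[o0]{2}bar|baz"]

# Wrap each pattern in a non-capturing group so that a top-level '|'
# inside one pattern cannot merge with its neighbours.
ANY_PROBLEM = re.compile("|".join("(?:%s)" % p for p in PATTERNS))


def has_any_match(text):
    """One pass over the text with the combined pattern."""
    return ANY_PROBLEM.search(text) is not None


def has_any_match_separately(text):
    """Reference behaviour: one search per pattern."""
    return any(re.search(p, text) for p in PATTERNS)
```

The second function exists only to check that combining the patterns does not change the result.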

Andreas





Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Andreas Beckmann

+    if self.inc_re.search( logbody, re.MULTILINE ):
+        for line in logbody.splitlines():
+            if self.inc_re.search( line ):
+                if self.exc_re == None \
+                   or not self.exc_re.search(line):
+                    return( True )

That looks inefficient. Why do we have to grep twice to identify
matching lines even if we have no exclusion pattern?

Is it for 'foo.*bar' matching on
  'The food shop\n\nSetting up libbar (08-15) ...'
? Hmm, no, DOTALL is off by default.

Anyway, once you have a match, it shouldn't be too difficult to find the
position and identify the matching line without needing to rematch on
each line individually.
Maybe even extend the pattern internally to

^.*($PATTERN)

to match at BeginOfLine, then add a search for '$' starting from the BoL
to find the corresponding EoL ... and apply the exclusion pattern on the
range found that way.
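
The line-lookup idea above can be sketched in Python as follows. This is only an illustration under assumptions: the function name is hypothetical, and it assumes that a single match never spans a newline. One detail worth knowing is that a compiled pattern's search() takes a start position as its second argument, so flags such as re.MULTILINE have to be passed to re.compile() instead.

```python
import re


def first_unexcluded_line(logbody, inc_re, exc_re=None):
    """Return the first line matching inc_re but not exc_re, or None."""
    for m in inc_re.finditer(logbody):
        # Expand the match to its enclosing line with plain string searches,
        # instead of re-running the regex on every line of the log.
        bol = logbody.rfind("\n", 0, m.start()) + 1
        eol = logbody.find("\n", m.end())
        if eol == -1:
            eol = len(logbody)
        line = logbody[bol:eol]
        if exc_re is None or not exc_re.search(line):
            return line
    return None
```

For logs without any match, which is the common case, this costs a single scan of the log body.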

Disclaimer: I don't really have experience with python re

For combining patterns,
  '(foo)|(bar)'
should return something in $1 or $2 depending on what matched (FSVO $1),
which should make it possible to identify the pattern number; just make sure
to turn all inner parentheses into non-capturing groups, (?:...)
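
In Python the counterpart of $1/$2 is the match object's lastindex attribute. A small sketch, with a hypothetical pattern list in which inner groups are already written as (?:...):

```python
import re

# Hypothetical patterns; any inner group must be non-capturing.
PATTERNS = ["foo", "bar", "f[o0]{2}(?:bar|baz)"]

# One capturing group per pattern, so group N belongs to PATTERNS[N - 1].
COMBINED = re.compile("|".join("(%s)" % p for p in PATTERNS))


def which_pattern(text):
    """Return the index of the pattern behind the first match, or None."""
    m = COMBINED.search(text)
    if m is None:
        return None
    return m.lastindex - 1
```

If a pattern contained its own capturing group, the group numbers would shift and the index would no longer line up with the list.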


Andreas

PS: for reviewing a series of patches I don't really care about the
author's development history but prefer rebased, rewritten and
reordered history to produce an easily readable patch series with small
and self contained patches. (Hint: please fold 'Template HTML format
fix' into the commit it fixes.) Of course rewriting is off limits once
something has been merged into mainline. But I see no gain in merging a
lot of fixup commits into mainline if the development branch could
have been rewritten before the merge.

Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Dave Steele
On Thu, Feb 21, 2013 at 4:24 AM, Andreas Beckmann a...@debian.org wrote:

 this work looks really promising and I'm curious to try it some day on
 my instance.

 But as I wrote before there is no need to reimplement the .tpl
 generation in python. Instead these intermediate files should go away
 and the html generation should be moved directly into piuparts-report.
 There will be a package db available.
 I think this requirement to generate .tpl externally dates back to the
 time when all logfiles were grepped daily, i.e. before we remembered the
 results in .kpr.


I took the least invasive path from mimicking detect_well_known_errors
to sorting by rdep to eliminating linktarget_by_template (where rdep
sorting was the single original goal). I agree that .tpl's are
obsolete, but that wasn't an overriding goal for me, and not necessary
to get issue logic out of piuparts-report. There's no significant
performance issue.

 Even if .kpr generation can be sped up significantly, I don't think I
 want to run this from inside piuparts-report. Just like piuparts-analyze
 (that takes 30-60 minutes for my instance) this is something that will
 continue to be run from the generate-piuparts-report driver script ...
 and having it sped up by an order of magnitude will decrease my hesitation to run
 it with --recheck-all.

OK. A minimally invasive fix would be to add a 'skip kpr creation'
option, used inside piuparts-report, and re-introduce
detect_well_known_errors, which imports known_problems. Interested?

 Also if the .tpl files are gone, we can actually run piuparts-report
 without running piuparts-analyze or detect_well_known_errors directly
 before it.

The above would have the same net effect.

 And about speeding up the grepping - wouldn't it be even faster if we
 can run multiple regexes at the same time on the input - either by
 'ORing' them together or passing a list to re or ... then we would just
 need to figure out which one has matched ... (No, I haven't tried
 anything like this, but I'm considering testing this with the multiple
 grep calls in detect_piuparts_issues.)
   grep -lE '(foo)|(bar)|(f[o0]{2}bar|baz)'
 should be significantly faster than
   grep -l foo
   grep -l bar
   grep -lE 'f[o0]{2}bar|baz'
 And there we only care about 'any match' disregarding which matched.
 Or am I mistaken here?


Interesting idea. I'll give it a try.





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Dave Steele
On Thu, Feb 21, 2013 at 5:02 AM, Andreas Beckmann a...@debian.org wrote:

 +    if self.inc_re.search( logbody, re.MULTILINE ):
 +        for line in logbody.splitlines():
 +            if self.inc_re.search( line ):
 +                if self.exc_re == None \
 +                   or not self.exc_re.search(line):
 +                    return( True )

 That looks inefficient. Why do we have to grep twice to identify
 matching lines even if we have no exclusion pattern?


More than 99% of the tests will return no failure. If the MULTILINE
search is 1% faster than the loop, this is a net win.

 Is it for 'foo.*bar' matching on
   'The food shop\n\nSetting up libbar (08-15) ...'
 ? Hmm, no, DOTALL is off by default.


The MULTILINE search is pure optimization - it can be removed with no
change to the results. DOTALL is off to match grep.

 Anyway, once you have a match, it shouldn't be too difficult to find the
 position and identify the matching line without needing to rematch on
 each line individually.
 Maybe even extend the pattern internally to

 ^.*($PATTERN)

 to match at BeginOfLine, then add a search for '$' starting from the BoL
 to find the corresponding EoL ... and apply the exclusion pattern on the
 range found that way.



Maybe, but to get bang for the buck, focus on the 99%. Your idea to
'look for any problem' in the other thread looks like the right path
to try.

There simply aren't enough failure cases (even in 62 sections :-) ) to
worry too much about the rest.


 PS: for reviewing a series of patches I don't really care about the
 author's development history but prefer rebased, rewritten and
 reordered history to produce an easily readable patch series with small
 and self contained patches. (Hint: please fold 'Template HTML format
 fix' into the commit it fixes.)

There is a point to that commit. I wrote the python replacement to
produce identical output to the shell script, before adding fixes and
features (actually there are caveats, listed in the commit). You can
check out that version to verify. Fixing the HTML format and merging
the templates earlier interferes with that capability.

 Of course rewriting is off limits once
 something has been merged into mainline. But I see no gain in merging a
 lot of fixup commits into mainline if the development branch could
 have been rewritten before the merge.


I wrote of another fixup branch, containing fixes to errors in the
well_known branch I had previously announced. I'm not sure if I should
have gone ahead and rolled the fixes into the announced branch or not.





Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Andreas Beckmann
On 2013-02-21 16:35, Dave Steele wrote:
 The MULTILINE search is pure optimization - it can be removed with no
 change to the results. DOTALL is off to match grep.

OK, I didn't realize that the outer search is just for optimization.

 There simply aren't enough failure cases (even in 62 sections :-) ) to
 worry too much about the rest.

Make that 86. There are only 62 graphs.

 I wrote of another fixup branch, containing fixes to errors in the
 well_known branch I had previously announced. I'm not sure if I should
 have gone ahead and rolled the fixes into the announced branch or not.

Just declare this WIP and fold the fixes into the main branch.


Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-20 Thread Dave Steele
On Mon, Feb 18, 2013 at 5:44 AM, Holger Levsen hol...@layer-acht.org wrote:
...

 these are quite some different changes, can you please isolate the commits for
 Sort known issues by reverse dependency count and rebase them onto current
 develop?!

The new serial branches sort-issues-by-rdep and
sort-issues-by-rdep-fast are separated from the rest of the work, and
rebased to develop.





Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-02-20 Thread Dave Steele
On Wed, Feb 20, 2013 at 8:42 PM, Dave Steele dste...@gmail.com wrote:
 On Mon, Feb 18, 2013 at 5:44 AM, Holger Levsen hol...@layer-acht.org wrote:
 ...

 these are quite some different changes, can you please isolate the commits for
 Sort known issues by reverse dependency count and rebase them onto current
 develop?!

 The new serial branches sort-issues-by-rdep and
 sort-issues-by-rdep-fast are separated from the rest of the work, and
 rebased to develop.


I should mention that there is the line

PROBLEM_DIR = os.environ['HOME'] + '/bin/known_problems'

in the code. I run a stock install - that line is modified in my test branch.

PROBLEM_DIR will need to default to
/usr/share/piuparts/master/known_problems/ to clear the 0.50 TODO
list.
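
One way the default could be combined with an optional override, sketched under assumptions: section_config is a hypothetical plain dict standing in for a parsed config section, and the known-problem-directory key is borrowed from the piatti-problems branch description elsewhere in this thread.

```python
DEFAULT_PROBLEM_DIR = "/usr/share/piuparts/master/known_problems/"


def problem_dir(section_config):
    """Use the configured directory if one is set, else the packaged default."""
    # section_config is a hypothetical dict of config values for one section.
    return section_config.get("known-problem-directory") or DEFAULT_PROBLEM_DIR
```

An empty or missing setting falls back to the packaged directory, so a stock install needs no configuration.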





Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-02-18 Thread Holger Levsen
Hi Dave,

On Sunday, 27 January 2013, Dave Steele wrote:
 The rest of my proposed changes for known problem handling are pushed,
 for review.
 A rebase is needed before merging. I will do this at your request.
 
 
 The following serial branch heads are involved:
 
 well-known - I've added tolerance for missing files and packages, and
 added PTS links
 
 fast-problems - replaced grep shell calls with python re. Per the commit:
 
 Run with full .kpr replacement is 2 1/2 minutes vs. 28 minutes for
 grep, per section, with stale file buffers, and idle slaves.
 Subsequent runs are 15 seconds vs. 60 seconds. Replacing the
 packagesdb rdep sort with an alpha sort reduces that to 5 seconds.
 
 fast-report - detect_well_known_errors is morphed into the piupartslib
 module 'known_problems', and is called from piuparts-report. Report
 runs always include issue and error summaries now.
 
 report_problem_integration - replace linktarget_by_template with
 known_problem module support. All problem definition information is
 encoded in the conf file.
 
 piatti-problems - known_problems uses the packaged dir for the problem
 files. A new known-problem-directory config parameter lets piatti set it
 back to under /org

these are quite some different changes, can you please isolate the commits for 
Sort known issues by reverse dependency count and rebase them onto current 
develop?!


cheers,
Holger





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-27 Thread Dave Steele
The rest of my proposed changes for known problem handling are pushed,
for review.
A rebase is needed before merging. I will do this at your request.


The following serial branch heads are involved:

well-known - I've added tolerance for missing files and packages, and
added PTS links

fast-problems - replaced grep shell calls with python re. Per the commit:

Run with full .kpr replacement is 2 1/2 minutes vs. 28 minutes for
grep, per section, with stale file buffers, and idle slaves.
Subsequent runs are 15 seconds vs. 60 seconds. Replacing the
packagesdb rdep sort with an alpha sort reduces that to 5 seconds.

fast-report - detect_well_known_errors is morphed into the piupartslib module
'known_problems', and is called from piuparts-report. Report
runs always include issue and error summaries now.

report_problem_integration - replace linktarget_by_template with known_problem
module support. All problem definition information is encoded in
the conf file.

piatti-problems - known_problems uses the packaged dir for the problem files. A
new known-problem-directory config parameter lets piatti set it
back to under /org.
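
For readers new to the thread, the ordering being compared in the fast-problems timings (rdep sort versus alpha sort) can be sketched as below. The function name and the rdep_count mapping are hypothetical; in piuparts the counts would come from packagesdb.

```python
def sort_by_rdeps(failures, rdep_count):
    """Order package names by descending rdep count, then alphabetically."""
    # rdep_count is a hypothetical mapping of package name -> number of
    # reverse dependencies; unknown packages count as zero.
    return sorted(failures, key=lambda pkg: (-rdep_count.get(pkg, 0), pkg))
```

With an empty mapping this degrades to the plain alphabetical sort mentioned in the commit message.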

Commits:

piatti-problems
  1b61655 piuparts-report - Add known-problem-directory config for piatti.
report_problem_integration
  a8360ec piuparts-report - Add a special Problem case for unknown failures.
  dc39b89 piuparts-report - replace linktarget_by_template with Problem class.
  4db2254 piuparts-report - add known Problems class list to Section.
  108bbfd Add piuparts-report linktarget_by_template information to known_problems.
fast-report
  bdc0939 Mv detect_well_known_errors to piupartslib - call from piuparts-report.
fast-problems
  4394b8f detect_well_known_errors - Changelog entry for re speedup.
  c398289 Remove COMMAND parameter from known_problems.
  4e4e011 detect_well_known_errors - Generate 'grep' help command from INCLUDE.
  87696e9 detect_well_known_errors - Use python re for fast kpr generation.
  2991e15 known_problems - Add INCLUDE parameters for re-based searching.
well-known
  955a6a2 Close the 698526 python detect_well_known_errors wishlist bug.
  5b61a03 detect_well_known_errors - Add PTS link to issue/error entries.
  fe4e400 detect_well_known_errors - handle having the pkgsdb entry disappear.
  895e035 detect_well_known_errors - Tolerate missing .kpr files.
  967e27d detect_well_known_errors - Tolerate deleted log files.
  427aa41 detect_well_known_errors - restore recheck and recheck-failed options.
  a4553bc Bump the required python version to 2.7.
  a12f676 detect_well_known_errors - display the reverse dependency count.
  500e97f detect_well_known_errors - sort known errors/issues by rdep count.
  7de4eb9 detect_well_known_errors - integrate the package templates.
  d066bb3 detect_well_known_errors - Template HTML format fix.
  76b8ce2 detect_well_known_errors - Copyright notice.
  b8af3e4 detect_well_known_errors.py - move to detect_well_known_errors.
  25b9351 Remove bash detect_well_known_errors.
  ece5e4e detect_well_known_errors.py - change ext's to create kpr and tpl files.
  39837e9 detect_well_known_errors.py - print failures to match bash script
  198c65e detect_well_known_errors - Create html tpl files
  4049338 detect_well_known_errors.py - Create Failure Mgr class to hold kpr fails.
  d04c1bb detect_well_known_errors.py - Create missing kpr files
  1880598 detect_well_known_errors.py - add class for handling known problems
  9b25943 detect_well_known_errors.py - establish the problem file location.
  53df049 detect_well_known_errors.py - clean obsolete kpr and bug files
  cdd8803 Add skeleton for python replacement of detect-well-known-errors
  601e6a7 start with 0.50
  df94975 release as 0.49

In addition, there is a fixup branch that contains changes that need
to be rebased into well-known.

fixup-well
8fc8df2 fixup - fix method arguments for recheck* parameters
b897262 fixup - fix filtered()
a1b381d fixup - delete the comment that .kprn is temporary


Andreas, per your wishlist:

On Sun, Jan 20, 2013 at 7:56 AM, Dave Steele dste...@gmail.com wrote:
 On Sun, Jan 20, 2013 at 6:56 AM, Andreas Beckmann deb...@abeckmann.de wrote:
 ...

 What I'd like to see is (in probable order of implementation)
 * piuparts-report discovering all existing known problem descriptions
 instead of hardcoding them

Done, by pulling in the detect_well_known_errors code as a module, and
using its Problem class.

   - need to add ordering information somehow, perhaps by adding a
 number prefix:  42_foo_not_found_issue.conf
 or by adding a variable with a sort key inside
 (there should be a bug or some todo entries about this)

Done, using a PRIORITY key in the problem files, seeded with the order
of linktarget_by_template in piuparts-report.
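
A minimal sketch of ordering by a PRIORITY key, assuming each problem
file holds single-line KEY=value entries. The helper names are
hypothetical and this is not the parser used in piuparts; multi-line
values are not handled.

```python
def parse_problem(text):
    """Parse single-line KEY=value entries; surrounding quotes are stripped."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip().strip("'\"")
    return fields


def sort_by_priority(problems):
    """Order parsed problems by integer PRIORITY; entries without one go last."""
    return sorted(problems, key=lambda p: int(p.get("PRIORITY", 10**6)))
```

Keeping the sort key inside the file avoids renaming files when the
order changes, which a numeric filename prefix would require.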

   - needs to move title information from piuparts-report to .conf

Done, using a new EXPLAIN field 

Bug#698526: Sort known issues by reverse dependency count

2013-01-20 Thread Andreas Beckmann
On 2013-01-20 04:02, Dave Steele wrote:
 Yes, but it would involve duplicating a bit of code from
 piuparts-report. What are you thinking,  replace e.g.
 pass/python-support_1.0.15.log with pass/python-support_1.0.15, and
 link to the source page instead of the log?

I just want to extend the current format to

state/package_version.log (PTS) (BTS) #123456...

although the ordering may be changed:

PTS BTS LOG #bugs

or whatever seems to be best in a usability way


When analyzing the logs, the PTS access is something I need more often
than the BTS, now I have to go through BTS page first ...

Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-01-20 Thread Holger Levsen
Hi,

On Samstag, 19. Januar 2013, Andreas Beckmann wrote:
 Without having looked at the code yet, I like the idea :-)

same here :)
 
 Now that you have access to the package DB, can you add a PTS link for
 each failing package? These need to be src based ...

I'd prefer this as well...


cheers,
Holger





Bug#698526: Sort known issues by reverse dependency count

2013-01-20 Thread Andreas Beckmann
Hi,

thinking about this again, there are currently two tasks performed by
detect_well_known_errors:

1. generating .kpr files
2. generating .tpl files

(1) is the really time-consuming part and needs to be run independently
from piuparts-report from time to time (with the recheck options ...),
so it needs to stay in a separate script.
On the other hand, I think (2) should better be integrated with
piuparts-report - making the intermediate .tpl file superfluous while
reusing the packagedb with dependency counts that is already there.

A known problem specification is currently something like
* a set of patterns (grep foo | grep bar | grep -v baz | grep -v blah)
  (processing them with re instead of repeated grep calls sounds like a
  good longterm goal)
* header, description (in .conf), title (in piuparts-report)
* ordering information (in piuparts-report)
* an indication where to look (error or issue) (repeated three times:
  *_{error,issue}.conf, WHERE, ISSUE)

Then we repeat most of them a second time with slightly changed
header/title and error/issue exchanged ...
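
A minimal sketch of how the grep pipeline in the first bullet could map
onto python re, assuming line-based matching: a log counts as a hit if
some line matches every include pattern and no exclude pattern. The
function name and the include/exclude arguments are illustrative, not
names taken from piuparts.

```python
import re


def log_matches(lines, include, exclude=()):
    """Emulate `grep A | grep B | grep -v C` on an iterable of log lines."""
    # Compile once so the same patterns can be reused across many lines.
    inc = [re.compile(p) for p in include]
    exc = [re.compile(p) for p in exclude]
    for line in lines:
        if all(r.search(line) for r in inc) and not any(r.search(line) for r in exc):
            return True
    return False
```

Each log is then read once and tested in-process, instead of spawning
one grep per pattern per problem.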

There is a little special case: the unknown failures.

What I'd like to see is (in probable order of implementation)
* piuparts-report discovering all existing known problem descriptions
instead of hardcoding them
  - need to add ordering information somehow, perhaps by adding a
number prefix:  42_foo_not_found_issue.conf
or by adding a variable with a sort key inside
(there should be a bug or some todo entries about this)
  - needs to move title information from piuparts-report to .conf
* piuparts-report generating the known problem reports, allowing access
  to packagedb etc. for better reports, making .tpl files obsolete
* getting rid of error/issue redundancies
* computing the .kpr with python re instead of grep
* adjusting the .conf and .kpr formats to what is actually needed

For performance reasons directory content should be cached heavily (e.g.
use listdir() exactly once, avoid exists() etc., maybe LogDB can be
reused). Be aware that files (especially logfiles) may disappear at any
point in time - catch and ignore.
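
A minimal sketch of that access pattern, assuming logs end in .log and
live in one directory. The helper names are hypothetical; LogDB is not
used here.

```python
import io
import os


def list_logs(directory):
    """Call os.listdir() once and keep the names for later membership tests."""
    try:
        return set(name for name in os.listdir(directory) if name.endswith(".log"))
    except OSError:
        # The directory itself may be missing; treat that as "no logs".
        return set()


def read_log(directory, name):
    """Return the log text, or None if the file vanished after it was listed."""
    path = os.path.join(directory, name)
    try:
        with io.open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except (IOError, OSError):
        return None
```

Callers skip any log for which read_log() returns None, rather than
checking exists() first, since the file can disappear between the two
calls anyway.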

Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-20 Thread Dave Steele
On Sun, Jan 20, 2013 at 6:56 AM, Andreas Beckmann deb...@abeckmann.de wrote:
...

 What I'd like to see is (in probable order of implementation)
 * piuparts-report discovering all existing known problem descriptions
 instead of hardcoding them
   - need to add ordering information somehow, perhaps by adding a
 number prefix:  42_foo_not_found_issue.conf
 or by adding a variable with a sort key inside
 (there should be a bug or some todo entries about this)
   - needs to move title information from piuparts-report to .conf
 * piuparts-report generating the known problem reports, allowing access
   to packagedb etc. for better reports, making .tpl files obsolete
 * getting rid of error/issue redundancies
 * computing the .kpr with python re instead of grep
 * adjusting the .conf and .kpr formats to what is actually needed


I would prioritize python re. The results could affect the strategy
for the rest.





Bug#698526: Sort known issues by reverse dependency count

2013-01-19 Thread Dave Steele
Package: piuparts
Severity: wishlist
Tags: patch
thanks

Packages with high reverse dependency counts can cause known problem
issue lists to balloon. Providing rdep visibility in the issue report
can highlight these problems, making it much easier to pare the list
down.
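
A minimal sketch of the ordering, assuming the rdep counts are already
available as a plain dict keyed by package name. The function name and
the dict are illustrative; the real counts come from the package
database.

```python
def sort_by_rdep_count(packages, rdep_counts):
    """Sort package names by descending rdep count, ties broken by name."""
    # Packages missing from the dict are treated as having no rdeps.
    return sorted(packages, key=lambda pkg: (-rdep_counts.get(pkg, 0), pkg))


counts = {"aspell": 1107, "python-support": 1701, "dictionaries-common": 1343}
ordered = sort_by_rdep_count(list(counts), counts)
```

Defaulting a missing count to zero keeps the sort from failing when a
package has dropped out of the database between runs.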

The well-known git branch implements a version of
detect_well_known_errors to accomplish this. The script is ported from
bash to python, to take advantage of the rdep capability of
piupartsdb. It was developed alongside the bash script to support
side-by-side testing.

https://github.com/davesteele/piuparts/commits/well-known

427aa41 detect_well_known_errors - restore recheck and recheck-failed options.
a4553bc Bump the required python version to 2.7.
a12f676 detect_well_known_errors - display the reverse dependency count.
500e97f detect_well_known_errors - sort known errors/issues by rdep count.
7de4eb9 detect_well_known_errors - integrate the package templates.
d066bb3 detect_well_known_errors - Template HTML format fix.
76b8ce2 detect_well_known_errors - Copyright notice.
b8af3e4 detect_well_known_errors.py - move to detect_well_known_errors.
25b9351 Remove bash detect_well_known_errors.
ece5e4e detect_well_known_errors.py - change ext's to create kpr and tpl files.
39837e9 detect_well_known_errors.py - print failures to match bash script
198c65e detect_well_known_errors - Create html tpl files
4049338 detect_well_known_errors.py - Create Failure Mgr class to hold kpr fails
d04c1bb detect_well_known_errors.py - Create missing kpr files
1880598 detect_well_known_errors.py - add class for handling known problems
9b25943 detect_well_known_errors.py - establish the problem file location.
53df049 detect_well_known_errors.py - clean obsolete kpr and bug files
cdd8803 Add skeleton for python replacement of detect-well-known-errors
601e6a7 start with 0.50
df94975 release as 0.49


Here's a partial text dump of the resulting broken_symlinks issue html
output (note the extra rdep parameter for each package):

===

Packages which have logs with the string Broken symlinks in sid,
sorted by reverse dependency count.

This is clearly an error, but as there are too many of this kind,
piuparts can be configured to not fail if it detects broken symlinks.
Another option is not to test for broken symlinks. See the piuparts
manpage for details.

The commandline to find these logs is:

COMMAND='grep -E (WARN|FAIL): Broken symlink'

Please file bugs!

pass/python-support_1.0.15.log (1701) (BTS)
pass/dictionaries-common_1.12.10.log (1343) (BTS)
pass/aspell_0.60.7~20110707-1.log (1107) (BTS)
pass/aspell-en_7.1-0-1.log (1043) (BTS)
pass/libenchant1c2a_1.6.0-7.log (1042) (BTS)
pass/python-numpy_1:1.6.2-1.log (765) (BTS)
pass/vlc-data_2.0.5-1.log (615) (BTS)
pass/libvlccore5_2.0.5-1.log (614) (BTS)
pass/libvlc5_2.0.5-1.log (612) (BTS)
pass/vlc-nox_2.0.5-1.log (604) (BTS)
pass/phonon-backend-vlc_0.6.0-1.log (585) (BTS)
pass/phonon_4:4.6.0.0-2.log (583) (BTS)
pass/python-cairo_1.8.8-1+b2.log (562) (BTS)
pass/python-gtk2_2.24.0-3.log (509) (BTS)
pass/kate-data_4:4.8.4-1.log (486) (BTS)
pass/katepart_4:4.8.4-1.log (485) (BTS)
pass/kde-runtime-data_4:4.8.4-2.log (484) (BTS)
pass/kdelibs5-plugins_4:4.8.4-4.log (484) (BTS)
pass/kde-runtime_4:4.8.4-2.log (483) (BTS)
pass/libgs9-common_9.05~dfsg-6.3.log (336) (BTS)
pass/libgs9_9.05~dfsg-6.3.log (335) (BTS)
pass/tex-common_3.15.log (292) (BTS)
pass/texlive-common_2012.20120611-5.log (234) (BTS)
pass/python-simplejson_2.6.2-1.log (219) (BTS)
pass/texlive-doc-base_2012.20120611-1.log (216) (BTS)
pass/texlive-binaries_2012.20120628-4.log (198) (BTS)
pass/luatex_0.70.1.20120524-3.log (193) (BTS)
pass/texlive-base_2012.20120611-5.log (192) (BTS)
pass/libkresources4_4:4.8.4-2.log (183) (BTS)
pass/libkabc4_4:4.8.4-2.log (182) (BTS)
pass/libkcal4_4:4.8.4-2.log (180) (BTS)



As a side effect, .tpl file creation is about 30X faster, down from
about 1 1/2 minutes. Much of that improvement is used up by the rdep
sort calculation in piupartsdb. .kpr creation is still very slow. Some
serious additional speed improvements can be achieved by 1) replacing
the subprocess shell calls for grep with re, and 2) incorporating
detect_well_known_errors into piuparts-report, allowing a piupartsdb
object to be shared between the two.





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-19 Thread Andreas Beckmann
On 2013-01-19 22:06, Dave Steele wrote:
 The well-known git branch implements a version of
 detect_well_known_errors to accomplish this. The script is ported from
 bash to python, to take advantage of the rdep capability of
 piupartsdb. It was developed alongside the bash script to support
 side-by-side testing.

Without having looked at the code yet, I like the idea :-)

Now that you have access to the package DB, can you add a PTS link for
each failing package? These need to be src based ...

Andreas





Bug#698526: [Piuparts-devel] Bug#698526: Bug#698526: Sort known issues by reverse dependency count

2013-01-19 Thread Dave Steele
On Sat, Jan 19, 2013 at 5:19 PM, Andreas Beckmann deb...@abeckmann.de wrote:
 On 2013-01-19 22:06, Dave Steele wrote:
...

 Now that you have access to the package DB, can you add a PTS link for
 each failing package? These need to be src based ...


Yes, but it would involve duplicating a bit of code from
piuparts-report. What are you thinking,  replace e.g.
pass/python-support_1.0.15.log with pass/python-support_1.0.15, and
link to the source page instead of the log?

