Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-03-02 Thread Holger Levsen
Hi,

On Tuesday, 26 February 2013, Andreas Beckmann wrote:
 I'm primarily concerned about reimplementing a bad piece of code (the
 second half of dwke that creates the .tpl files) in order to build a new
 feature on top of it. The perfectionist in me would like to fix things
 properly first.

yes, but... the imperfect way was used quite successfully with piuparts for a 
long time ;-)
 
 I really do like the approach of reviewing patches before inclusion as
 more eyes may spot more problems

me too, absolutely.

Yet I can only imagine Dave's frustration trying to get his work in and 
recognized; so far this hasn't happened for this feature, and it has been quite 
a long time. And I'd like Dave to stay motivated and contributing, and I like 
the new feature too.

So I'm a bit torn, (currently) leaning towards releasing 0.50 soon (now?) and 
then starting 0.51 with the merge of dave/sort-issues-by-rdep - or do you 
think that's premature?


cheers,
Holger


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-25 Thread Dave Steele
On Mon, Feb 25, 2013 at 8:45 AM, Andreas Beckmann a...@debian.org wrote:


 In general I think we should allow the flexibility to have a per-section
 known-problems-directory setting, so each report Section should generate
 its own problem list and not get a global one passed


OK, but out of scope of the patch set under consideration, which replaces
the existing detect_well_known_errors with one that sorts by rdep.



 I tried to create a reduced version of Dave's sort-issues-by-rdep branch
 that only does the .tpl generation, as that is the part I want to look
 at right now:

 preview/dave-dwke-only-create-tpl

 David Steele (9):
   01 Add skeleton for python replacement of detect-well-known-errors
   02 detect_well_known_errors.py - Clean obsolete kpr and bug files.
   03 detect_well_known_errors.py - Add class for handling known
 problems.
   05 detect_well_known_errors.py - Create missing kpr files.
   06 detect_well_known_errors.py - Create Failure Mgr class to hold
 kpr fails.
   07 detect_well_known_errors - Create html tpl files.
   16 detect_well_known_errors - Sort known errors/issues by rdep count.
   17 detect_well_known_errors - Display the reverse dependency count.
   20 detect_well_known_errors - Add PTS link to issue/error entries.

 reordered, merged, dropped .kpr creation, cleanup of obsolete files, ...
 but not tested at all


Take a look at skip_kpr. It gives you your tpl-only capability with about a
dozen lines of code. This is part of the piuparts-report work I originally
submitted, which is out of scope for the patch set under consideration.



 The problems I see right now:

 * many functions from piuparts-report are either copied
  (e.g. pts_subdir( source ))
 * or reimplemented differently, e.g. the variable substitution
   in the templates. I don't know which variant is better, but
   I don't really want *two* implementations of the same thing


This is not a change from what it replaces. Elimination of the redundancies
can be added to the scope of a piuparts-report integration task.



 The internal representation of a set of logs is very different
 which makes integration into -report difficult


That depends on what you mean by integration. There is validity to the
claim that it has been integrated, in existing patches outside the current
scope.

As you are saying, if this was designed from scratch for integration with
piuparts-report, it would lean much more heavily on packagesdb. What is on
the table is not an integrated solution. It is a replacement for the bash
script, with issue rdep sorting.



 The assumption that there is only $pkgspec.log in (at most) one
 subdir is nothing I would rely on (although it usually is)


It should be a valid assumption. The only requirement along these lines
should be to avoid crashing in the presence of this error condition.




 BTS and PTS URLs should not be embedded in the templates, probably best
 to have a function that generates a certain url for a package name
 to allow for future extensions, e.g. Ubuntu support.


That is a change that is in scope with the future extensions.
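
As an illustration of the "function that generates a url for a package name" idea quoted above, here is a minimal sketch. Only the name pts_subdir comes from the thread; its body, the URL templates, and the other names are assumptions made for the example, not the piuparts-report code.

```python
def pts_subdir(source):
    """Assumed PTS layout: one letter, or 'lib' plus one letter for lib*."""
    if source.startswith("lib"):
        return source[:4]
    return source[:1]

# Assumed URL templates, keyed by distribution. Only "debian" is filled
# in; further keys (e.g. for Ubuntu) would be the future extension.
URL_TEMPLATES = {
    "debian": {
        "pts": "https://packages.qa.debian.org/{subdir}/{source}.html",
        "bts": "https://bugs.debian.org/{package}",
    },
}

def pts_url(source, distro="debian"):
    """Build the tracker URL for a source package (hypothetical helper)."""
    return URL_TEMPLATES[distro]["pts"].format(
        subdir=pts_subdir(source), source=source)

def bts_url(package, distro="debian"):
    """Build the bug tracker URL for a package (hypothetical helper)."""
    return URL_TEMPLATES[distro]["bts"].format(package=package)
```

With the URLs produced by functions like these, the templates would only need a placeholder for the finished link.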


I understand that you don't like the way that I solved the known_problem
.conf issues in the patches that come after this submission, and that you
believe they aren't the right way to add issues to piuparts-report. I am OK
with you taking whatever pieces of this you might feel to be useful and
crafting a more elegant integration. But I ask that you consider what's on
the table within the scope of the problem it solves. Please make your
changes for piuparts-report after this is in.


Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-23 Thread Holger Levsen
Hi Dave,

On Saturday, 23 February 2013, Dave Steele wrote:
 I've reworked based on Andreas' issues related to
 detect_well_known_errors and rdeps. 

thanks! (extra bonus points if you could tell me how many commits there are in 
each branch; due to the rebase it's rather easy for me to find out, but being 
told would be even better ;)

 Comments related to piupartslib
 and piuparts-reports I've deferred as currently out of scope. The
 problems and failures classes in the python script are available for
 future rework.

nice!

I've seen two typos:

a.) unkownsasfailures.sort - I believe you mean unknownasfailures.sort  :)

b.) Packages with failures not yet well known detected in $SECTION - this 
wording might even be from me, today I'd say: Packages with unknown failures 
detected in $SECTION

Regarding merging into develop: yes, I want to. But first I want to finish 
merging Andreas' current bits, then merge that develop into piatti (and run it 
there) and then merge these two branches of yours. 


cheers,
Holger


Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Dave Steele
On Thu, Feb 21, 2013 at 4:24 AM, Andreas Beckmann a...@debian.org wrote:

 this work looks really promising and I'm curious to try it some day on
 my instance.

 But as I wrote before there is no need to reimplement the .tpl
 generation in python. Instead these intermediate files should go away
 and the html generation should be moved directly into piuparts-report.
 There will be a package db available.
 I think this requirement to generate .tpl externally dates back to the
 time when all logfiles were grepped daily, i.e. before we remembered the
 results in .kpr.


I took the least invasive path from mimicking detect_well_known_errors
to sorting by rdep to eliminating linktarget_by_template (where rdep
sorting was the single original goal). I agree that .tpl's are
obsolete, but that wasn't an overriding goal for me, and not necessary
to get issue logic out of piuparts-report. There's no significant
performance issue.

 Even if .kpr generation can be sped up significantly, I don't think I
 want to run this from inside piuparts-report. Just like piuparts-analyze
 (that takes 30-60 minutes for my instance) this is something that will
 continue to be run from the generate-piuparts-report driver script ...
 and having it sped up by an order of magnitude will decrease my hesitation to run
 it with --recheck-all.

OK. A minimally invasive fix would be to add a 'skip kpr creation'
option, used inside piuparts-report, and re-introduce
detect_well_known_errors, which imports known_problems. Interested?

 Also if the .tpl files are gone, we can actually run piuparts-report
 without running piuparts-analyze or detect_well_known_errors directly
 before it.

The above would have the same net effect.

 And about speeding up the grepping - wouldn't it be even faster if we
 can run multiple regexes at the same time on the input - either by
 'ORing' them together or passing a list to re or ... then we would just
 need to figure out which one has matched ... (No, I haven't tried
 anything like this, but I'm considering testing this with the multiple
 grep calls in detect_piuparts_issues.)
   grep -lE '(foo)|(bar)|(f[o0]{2}bar|baz)'
 should be significantly faster than
   grep -l foo
   grep -l bar
   grep -lE 'f[o0]{2}bar|baz'
 And there we only care about 'any match' disregarding which matched.
 Or am I mistaken here?


Interesting idea. I'll give it a try.
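
A minimal sketch of the 'ORing' idea in Python re, not the piuparts code. Each pattern is wrapped in a named group so that a single pass over the input answers both "did anything match" and "which pattern matched". The group names p0, p1, ... and both helper functions are made up for illustration, and the sketch assumes the individual patterns contain no capturing groups of their own.

```python
import re

def combine(patterns):
    """Join several regexes into one alternation of named groups p0, p1, ..."""
    return re.compile("|".join(
        "(?P<p%d>%s)" % (i, p) for i, p in enumerate(patterns)))

def first_match(combined, text):
    """Return the index of the pattern behind the leftmost match, or None."""
    m = combined.search(text)
    if m is None:
        return None
    # With no inner capturing groups, lastgroup is the pN group that matched.
    return int(m.lastgroup[1:])

# The three patterns from the grep example above.
patterns = ["foo", "bar", "f[o0]{2}bar|baz"]
combined = combine(patterns)
```

A single search only reports the leftmost match, so this fits the 'any match' case best; recording a result for every pattern would still need one search per pattern.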





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-21 Thread Dave Steele
On Thu, Feb 21, 2013 at 5:02 AM, Andreas Beckmann a...@debian.org wrote:

 +if self.inc_re.search( logbody, re.MULTILINE ):
 +    for line in logbody.splitlines():
 +        if self.inc_re.search( line ):
 +            if self.exc_re == None \
 +               or not self.exc_re.search(line):
 +                return( True )

 That looks inefficient. Why do we have to grep twice to identify
 matching lines even if we have no exclusion pattern?


More than 99% of the tests will return no failure. If the MULTILINE
search is 1% faster than the loop, this is a net win.

 Is it for 'foo.*bar' matching on
   'The food shop\n\nSetting up libbar (08-15) ...'
 ? Hmm, no, DOTALL is off by default.


The MULTILINE search is pure optimization - it can be removed with no
change to the results. DOTALL is off to match grep.
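
The two-stage check under discussion can be sketched as a self-contained example. This is a simplified stand-in, not the code from the patch: the class layout and the method name has_problem are assumptions, and it presumes grep-style patterns that never match across a newline. The flags are given to re.compile() here, because the second argument of a compiled pattern's search() is a start position rather than a flags value.

```python
import re

class Problem:
    """Hypothetical stand-in for the known-problem class in the patch."""

    def __init__(self, include, exclude=None):
        # MULTILINE lets ^ and $ match at every line of the whole log.
        self.inc_re = re.compile(include, re.MULTILINE)
        self.exc_re = re.compile(exclude, re.MULTILINE) if exclude else None

    def has_problem(self, logbody):
        # Cheap prefilter: most logs contain no match at all.
        if not self.inc_re.search(logbody):
            return False
        # Slow path: look for a matching line that is not excluded.
        for line in logbody.splitlines():
            if self.inc_re.search(line):
                if self.exc_re is None or not self.exc_re.search(line):
                    return True
        return False
```

Because DOTALL is off, 'foo.*bar' does not match the two-line example quoted above, so the prefilter already rejects that log.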

 Anyway, once you have a match, it shouldn't be too difficult to find the
 position and identify the matching line without needing to rematch on
 each line individually.
 Maybe even extend the pattern internally to

 ^.*($PATTERN)

 to match at BeginOfLine, then add a search for '$' starting from the BoL
 to find the corresponding EoL ... and apply the exclusion pattern on the
 range found that way.



Maybe, but to get bang for the buck, focus on the 99%. Your idea to
'look for any problem' in the other thread looks like the right path
to try.

There simply aren't enough failure cases (even in 62 sections :-) ) to
worry too much about the rest.
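
A sketch of the 'find the position and identify the matching line' suggestion quoted above, for reference. It searches the whole log, widens each hit to its enclosing line using the match offsets, and applies the exclusion pattern to that line only. The function name is made up, and the sketch assumes patterns that never match across a newline.

```python
import re

def first_unexcluded_line(inc_re, exc_re, logbody):
    """Return the first line matching inc_re but not exc_re, or None."""
    pos = 0
    while pos <= len(logbody):
        m = inc_re.search(logbody, pos)
        if m is None:
            return None
        # Widen the match to the enclosing line (BoL .. EoL).
        bol = logbody.rfind("\n", 0, m.start()) + 1
        eol = logbody.find("\n", m.end())
        if eol == -1:
            eol = len(logbody)
        line = logbody[bol:eol]
        if exc_re is None or not exc_re.search(line):
            return line
        # This line is excluded: resume searching on the next line.
        pos = eol + 1
    return None
```

Whether this beats the per-line loop depends on how often the slow path is reached at all, which is the point made in the reply above.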


 PS: for reviewing a series of patches I don't really care about the
 author's development history but prefer rebased, rewritten and
 reordered history to produce an easily readable patch series with small
 and self contained patches. (Hint: please fold 'Template HTML format
 fix' into the commit it fixes.)

There is a point to that commit. I wrote the python replacement to
produce identical output to the shell script, before adding fixes and
features (actually there are caveats, listed in the commit). You can
check out that version to verify. Fixing the HTML format and merging
the templates earlier interferes with that capability.

 Of course rewriting is off limits once
 something has been merged into mainline. But I see no gain in merging a
 lot of fixup commits into mainline if the development branch could
 have been rewritten before the merge.


I wrote of another fixup branch, containing fixes to errors in the
well_known branch I had previously announced. I'm not sure if I should
have gone ahead and rolled the fixes into the announced branch or not.





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-02-20 Thread Dave Steele
On Mon, Feb 18, 2013 at 5:44 AM, Holger Levsen hol...@layer-acht.org wrote:
...

 these are quite some different changes, can you please isolate the commits for
 Sort known issues by reverse dependency count and rebase them onto current
 develop?!

The new serial branches sort-issues-by-rdep and
sort-issues-by-rdep-fast are separated from the rest of the work, and
rebased to develop.





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-27 Thread Dave Steele
The rest of my proposed changes for known problem handling are pushed,
for review.
A rebase is needed before merging. I will do this at your request.


The following serial branch heads are involved:

well-known - I've added tolerance for missing files and packages, and
added PTS links

fast-problems - replaced grep shell calls with python re. Per the commit:

Run with full .kpr replacement is 2 1/2 minutes vs. 28 minutes for
grep, per section, with stale file buffers, and idle slaves.
Subsequent runs are 15 seconds vs. 60 seconds. Replacing the
packagesdb rdep sort with an alpha sort reduces that to 5 seconds.

fast-report - detect_well_known_errors is morphed into the piupartslib module
'known_problems', and is called from piuparts-report. Report
runs always include issues and error summaries now.

report_problem_integration - replace linktarget_by_template with known_problem
module support. All problem definition information is encoded in
the conf file.

piatti-problems - known_problems uses the packaged dir for the problem files. A
new known-problem-directory config parameter lets piatti set it
back to under /org.
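
To illustrate the two orderings mentioned in the fast-problems commit message above (rdep sort vs. alpha sort), here is a small sketch. The real script takes reverse dependency counts from packagesdb; the plain dict, the package names, and the function names below are invented for the example.

```python
def sort_by_rdep(failures, rdep_count):
    """Order failing packages by reverse dependency count, highest first."""
    # Ties fall back to alphabetical order so the output is stable.
    return sorted(failures, key=lambda pkg: (-rdep_count.get(pkg, 0), pkg))

def sort_alpha(failures):
    """The cheaper alternative: plain alphabetical order."""
    return sorted(failures)

# Invented data, standing in for a packagesdb lookup.
rdep_count = {"libfoo1": 120, "bar-utils": 3, "baz": 0}
failures = ["baz", "bar-utils", "libfoo1", "newpkg"]
```

The sort itself is cheap either way; the cost difference quoted in the commit message would come from obtaining the counts.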

Commits:

piatti-problems
1b61655 piuparts-report - Add known-problem-directory config for piatti.
report_problem_integration
a8360ec piuparts-report - Add a special Problem case for unknown failures.
dc39b89 piuparts-report - replace linktarget_by_template with Problem class.
4db2254 piuparts-report - add known Problems class list to Section.
108bbfd Add piuparts-report linktarget_by_template information to
known_problems.
fast-report
bdc0939 Mv detect_well_known_errors to piupartslib - call from
piuparts-report.
fast-problems
4394b8f detect_well_known_errors - Changelog entry for re speedup.
c398289 Remove COMMAND parameter from known_problems.
4e4e011 detect_well_known_errors - Generate 'grep' help command
from INCLUDE.
87696e9 detect_well_known_errors - Use python re for fast kpr generation.
2991e15 known_problems - Add INCLUDE parameters for re-based searching.
well-known
955a6a2 Close the 698526 python detect_well_known_errors wishlist bug.
5b61a03 detect_well_known_errors - Add PTS link to issue/error entries.
fe4e400 detect_well_known_errors - handle having the pkgsdb entry disappear.
895e035 detect_well_known_errors - Tolerate missing .kpr files.
967e27d detect_well_known_errors - Tolerate deleted log files.
427aa41 detect_well_known_errors - restore recheck and
recheck-failed options.
a4553bc Bump the required python version to 2.7.
a12f676 detect_well_known_errors - display the reverse dependency count.
500e97f detect_well_known_errors - sort known errors/issues by rdep count.
7de4eb9 detect_well_known_errors - integrate the package templates.
d066bb3 detect_well_known_errors - Template HTML format fix.
76b8ce2 detect_well_known_errors - Copyright notice.
b8af3e4 detect_well_known_errors.py - move to detect_well_known_errors.
25b9351 Remove bash detect_well_known_errors.
ece5e4e detect_well_known_errors.py - change ext's to create kpr
and tpl files.
39837e9 detect_well_known_errors.py - print failures to match bash script
198c65e detect_well_known_errors - Create html tpl files
4049338 detect_well_known_errors.py - Create Failure Mgr class to
hold kpr fails.
d04c1bb detect_well_known_errors.py - Create missing kpr files
1880598 detect_well_known_errors.py - add class for handling known problems
9b25943 detect_well_known_errors.py - establish the problem file location.
53df049 detect_well_known_errors.py - clean obsolete kpr and bug files
cdd8803 Add skeleton for python replacement of detect-well-known-errors
601e6a7 start with 0.50
df94975 release as 0.49

In addition, there is a fixup branch that contains changes that need
to be rebased into well-known.

fixup-well
8fc8df2 fixup - fix method arguments for recheck* parameters
b897262 fixup - fix filtered()
a1b381d fixup - delete the comment that .kprn is temporary


Andreas, per your wishlist:

On Sun, Jan 20, 2013 at 7:56 AM, Dave Steele dste...@gmail.com wrote:
 On Sun, Jan 20, 2013 at 6:56 AM, Andreas Beckmann deb...@abeckmann.de wrote:
 ...

 What I'd like to see is (in probable order of implementation)
 * piuparts-report discovering all existing known problem descriptions
 instead of hardcoding them

Done, by pulling in the detect_well_known_errors code as a module, and
using its Problem class.

   - need to add ordering information somehow, perhaps by adding a
 number prefix:  42_foo_not_found_issue.conf
 or by adding a variable with a sort key inside
 (there should be a bug or some todo entries about this)

Done, using a PRIORITY key in the problem files, seeded with the order
of linktarget_by_template in piuparts-report.
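
A sketch of how ordering by such a key could work. Only the PRIORITY key name comes from the description above; the KEY=value parsing, the TITLE key, the file names, and the "lowest first" direction are simplified assumptions, not the actual known_problems format.

```python
def parse_conf(text):
    """Parse simple KEY=value lines; quotes around values are stripped."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip().strip("'\"")
    return conf

def order_problems(confs):
    """Sort problem file names by integer PRIORITY, lowest first."""
    # Files without a PRIORITY sort last (an arbitrary choice here).
    return sorted(confs, key=lambda name: (
        int(confs[name].get("PRIORITY", 10**6)), name))

# Invented problem files for illustration.
confs = {
    "foo_not_found_issue.conf": parse_conf("PRIORITY=42\nTITLE='foo not found'"),
    "bar_error.conf": parse_conf("# comment\nPRIORITY=7"),
    "unsorted.conf": parse_conf("TITLE='no priority'"),
}
```

Compared with a number prefix in the file name, a key inside the file keeps the names stable when the order changes.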

   - needs to move title information from piuparts-report to .conf

Done, using a new EXPLAIN field 

Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-20 Thread Dave Steele
On Sun, Jan 20, 2013 at 6:56 AM, Andreas Beckmann deb...@abeckmann.de wrote:
...

 What I'd like to see is (in probable order of implementation)
 * piuparts-report discovering all existing known problem descriptions
 instead of hardcoding them
   - need to add ordering information somehow, perhaps by adding a
 number prefix:  42_foo_not_found_issue.conf
 or by adding a variable with a sort key inside
 (there should be a bug or some todo entries about this)
   - needs to move title information from piuparts-report to .conf
 * piuparts-report generating the known problem reports, allowing access
   to packagedb etc. for better reports, making .tpl files obsolete
 * getting rid of error/issue redundancies
 * computing the .kpr with python re instead of grep
 * adjusting the .conf and .kpr formats to what is actually needed


I would prioritize python re. The results could affect the strategy
for the rest.





Bug#698526: [Piuparts-devel] Bug#698526: Sort known issues by reverse dependency count

2013-01-19 Thread Andreas Beckmann
On 2013-01-19 22:06, Dave Steele wrote:
 The well-known git branch implements a version of
 detect_well_known_errors to accomplish this. The script is ported from
 bash to python, to take advantage of the rdep capability of
 piupartsdb. It was developed alongside the bash script to support
 side-by-side testing.

Without having looked at the code yet, I like the idea :-)

Now that you have access to the package DB, can you add a PTS link for
each failing package? These need to be src based ...

Andreas

