Mike Kupfer <mike.kupfer at sun.com> writes:

> I just finished going over the lists in Bugzilla (bugs.grommit.com) and
> Bugster.  Stephen and a couple other senior engineers are particularly
> interested in what needs to be fixed before we can think about moving
> SFW, so my comments are biased in that direction.  There appear to be
> several clumps of bugs:
>
>   - Mercurial Issues
>   - gpyfm
>   - Recommit
>   - Checks and Gate Hooks
>   - Incremental Merge
>   - Webrev
>   - SCCS Keywords
>   - Old Style, Copyright, etc Issues
>   - Miscellaneous
>
> so I've organized my comments along those lines.  
>
> Bugs tagged with "#" do not have an owner.
>
> The bugs in Bugster are all marked with the "hg_trans" keyword.
>
> * Mercurial Issues
>
>   There are several bugs fixed in 0.9.5, plus there are other issues
>   that are stalled waiting for 0.9.5.  Rich and Danek hooked up over IRC
>   on Friday, and it looked like they will be moving forward on getting
>   0.9.5 into SFW, after Danek finishes up with p7zip.
>
>   Rich mentioned that changing to 0.9.5 may cause some problems for the
>   Xen folks, but I'm still a little unclear on the details.  I'm hoping
>   it just means they need to use matching versions of Cadmium and
>   Mercurial.  Cadmium will issue a warning if it detects a version of
>   Mercurial that it's not sure about.  The bug for the Cadmium changes
>   is
>
>     281  P3  cdm needs to be updated to work under Mercurial 0.9.5

I have this, obviously, as well as the changes to SFW that I discussed
with yourself and Danek. 
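
For illustration only, the version warning Mike describes could look
roughly like the sketch below.  This is not the actual cdm code:
KNOWN_GOOD, parse_version and check_hg_version are names made up here,
and the set of "known good" versions is an assumption.

```python
import re

# Hypothetical: the Mercurial versions this copy of Cadmium was tested with.
KNOWN_GOOD = {(0, 9, 5)}


def parse_version(text):
    """Turn a version string such as '0.9.5' into a tuple of ints.

    Returns None if the string does not start with a dotted number
    (for example 'unknown').
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def check_hg_version(text, known_good=KNOWN_GOOD):
    """Return a warning string, or None if the version is known to work."""
    version = parse_version(text)
    if version in known_good:
        return None
    return "warning: untested Mercurial version %r; proceed with care" % text
```

The check only warns rather than refusing to run, which matches the
behaviour described above; whether an unknown version should be fatal
is a separate decision.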

>   We'll be able to mark these issues as resolved once 0.9.5 is in place:
>
>     283# P1  Mercurial issue 612: "rename of a directory with built ob...
>     354# P1  mercurial issue762: git format diffs are confused by sour...
>     246# P2  Mercurial issue 222: [extensions] in a repo's .hg/hgrc fi...
>     301# P2  Mercurial issue 636: merge may use the wrong file content...
>     272# P3  Mercurial Issue 589: "undelete" sequence leads to crash
>     242# P3  Mercurial issue 455: Mismerge involving two concurrently ...
>     243# P3  Mercurial issue 498: unreliable tag computation when tags...
>     325# P3  Mercurial issue 672: hg merge doesn't find renames after ...
>     252# P3  Mercurial issue 522: hg st of a merge changeset shows pha...
>     323# P3  Mercurial issue 650: Unmodified file considered modified
>     249# P3  Mercurial issue 269: hg mv should handle newly added files
>
>   Until then, you can filter them out of the bug list by telling
>   Bugzilla to ignore bugs with the "fixesupstream" keyword.
>
>   This leaves 6 unresolved Mercurial issues.  The first 3 involve
>   the possibility of repo corruption:
>
>     241# P3  Mercurial issue132: hg should revalidate its data after l...
>     244# P3  Mercurial issue199: detect corrupted dirstate
>     251# P3  Mercurial issue 413: hg add mishandles subprojects
>
>   241 is marked as "in-progress", the other 2 are still being discussed
>   (status "chatting").  251 is not relevant to SFW.  241 and 244 are
>   relevant but relatively unlikely to be triggered.
>
>   The next 2 are both marked "chatting".
>
>     248# P3  Mercurial issue 122: commit succeeds if merge fails
>     324# P3  Mercurial issue 664: hg process may exit with an apparent...
>
>   Bug 248 is P3 because it makes mismerges easier.  See "Incremental
>   Merge" below.
>
>   Bug 324 is P3 because of the risk that our tools will misbehave,
>   particularly on large putbacks.  
>
>   If we want these 5 issues fixed any time soon, we will probably need
>   to work on them, particularly the ones that are marked "chatting".
>
>   The last Mercurial bug is
>
>     347  P1 recommit can strip local changes in such a way as to corr...
>
>   which is Mercurial issue 764, which has status "chatting".  I'm not
>   sure we need to spend time on this--it may go away when we switch to
>   Rich's new recommit code (see "Recommit" below).

It won't.  I've made several changes that make it less likely, but it
won't go away.  When I was talking to the Mercurial guys about this,
one of them came up with a script (attached to their bug) that breaks
it in every combination and permutation of revision order without doing
anything particularly unusual (no use of strip to cause the problem,
etc).

If we want this fixed, we should either try to fix it ourselves or
continue trying to talk them into fixing it.  I've made attempts, but
it's generally problematic to do well.  (My attempts regressed
performance in the common case by more orders of magnitude than I care
to recall.)

> * gpyfm
>
>   There is at least one serious gpyfm bug that we need to fix, unless we
>   expect people to use filemerge for graphical merges:
>
>     271  P1 gpyfm gets seriously confused by merges of near identical...
>
>   There are a couple issues with gpyfm's user interface:
>
>     203  P3 gpyfm should ignore identical changes in parent and child
>     346# P3 multiple Accept&Next clicks don't work

It's unclear that #346 is still present in the last wad I saw from
Dave, as the button itself went away.  The performance in the large
file case was reportedly markedly improved, so if this bug was the
same as bug #238, it *should* also be fixed in that wad (though I
haven't verified).

As discussed with Mike a few days ago, I intend to either move gpyfm
out of our gate to a separate gate in our project, or disconnect it
entirely from the build (whichever is easier for *me*) at the earliest
opportunity, unless someone either tells me to do otherwise or
puts back a reviewed and substantially fixed wad.  The current one is
too dangerous to recommend for use, or even make easy to use.

People may use filemerge, or whatever else, in the interim.  (There is
a note on the wiki docs page about the use of filemerge with Hg.  I
have tried it, and it works, although it is very slightly more
annoying than the Teamware usage.)

>   I'm not sure 203 needs to be P3.  I think gpyfm handles identical
>   changes correctly if you click on the merge button.  But it does make
>   it more tedious if you walk through the conflicts one at a time.

It's possible it was initially P3 due to confusion with #271, or the
fact we weren't being uniform about priority at that point.  Feel free
to adjust it (I believe I filed it).
>
>     238  P2 gpyfm's responsiveness decreases as file size increases
>
>   This bug appears to be mostly a user feedback issue--there's no
>   indication (e.g., hourglass cursor) that the tool is busy.  I
>   question why it's a P2: the bug indicates a 1-2 second delay for a
>   28K-line file.  I don't think 1-2 seconds is unmanageable, and I've
>   found few files in ON that are larger (counting number of lines).  P4
>   may be more appropriate.  OTOH, this could also be the root cause of
>   346 (above), which has delays significantly longer than 1-2 seconds.

Because the lack of feedback is such that it confuses users, I still
maintain that the #346 you filed is in fact a duplicate of this.  If it
confused you, it'll confuse everyone.  Thus, P2, and blocking integration.

> * Recommit
>
>   The current recommit code has some bugs, and my understanding is that
>   it's a bit complicated and hard to get right.  It also tickles a bug
>   in Mercurial (see bug #347 above).  Rich has a simpler
>   reimplementation in mind that should be more reliable.
>
>     333  P1 cdm managed to turn a rename into a copy

That is bandaided in the latest wad, such that you are prompted to
clobber the files that would otherwise be problematic.  It is also, as
you said, not a problem with newreci (which clobbers the files anyway).

>     356  P3 cdm recommit needs to cope with the tags it may be
>          trashing

I have changes here, but I'm still not sure the semantics are entirely correct.

>     368  P3 hg recommit with no actual changes confuses the mortals
>     377  P3 cdm should not allow you to putback a null changeset.

These two are up for review, as you say, though if we have senior
engineers and such asking questions, perhaps they should answer the
questions I asked re: #377 in that mail (<x7ejf44qmu.fsf at richlowe.net>).
     
>   The fix for 333 is the reimplementation.  
>
>   356 will still need to be addressed, even under the new
>   implementation.  I think 368 and 377 will still be needed, too.  Rich
>   has fixes for 368 and 377 in code review.
>
>   There's one more bug that is related to recommit:
>
>     259  P3 cdm needs a 'wx reset' equivalent.

That's an RFE that I think Vlad is working on, though it's also
slightly more complex than it first appears.  It's not recommit
related, it's merely user convenience.

>   I think all of these bugs (356, 368, 377, and 259) are high P3s.  That
>   is, if we don't fix them for SFW, we'll definitely want to document
>   them.
>
> * Checks and Gate Hooks
>
>   These two do not appear to be interesting for SFW:
>
>     303  P3 cdm needs .NOT file equivalents

That's somewhat useful, in that they don't care about the vast
majority of things pbchk and nits will complain about.

>     373  P3 308 didn't do enough for pbchk and nits

That is perhaps fairly minor to them, but I have the fix for that
in review too.

>   I'm expecting that SFW may want RTI enforcement as well as
>   sanity-checking the permissions on a file.  So these bugs are all
>   relevant to SFW:
>
>     352# P2 Need gate-side hooks to replace the gate pbcheck.

That is the (separate) set of hooks that Ken was working on for them,
rather than the mercurial hooks themselves.  RTI hooks are, of course,
the responsibility of someone other than me, per several of our
conversations and one conversation with Stephen where I suggested
setting it on fire and dancing around it chanting and celebrating.

>     261  P2 Need a permchk (Permissions Check) check module

Vijay has done this, and I've reviewed all iterations of it; I was
under the impression he was going to put it back.

>     306  P3 cdm should whine at people about checks as early as is se...

This is a complete and utter pain in my ass, for various reasons.  The
moment I can find a way to deal with it that doesn't leave some group
or other angry, I will, but don't hold your breath, right now.

>     369# P3 cadmium check output is garbled with -q

Per our conversation, I'm still not certain whether this is garbled
or correct.

>     370# P3 RTI checks should tell me I can't connect, or otherwise f...

I can't fix that either, given that my prior adventures in that code
have been so disastrous, thanks to my not being able to run it.  It
(and the XML thing we'd talked about) needs to be done by somebody
else.  I'm happy to review it, however.

>   If we don't fix 370 for SFW, we'll want to document it.

It should be fixed.  Documenting things as broken when we could fix
them with the same degree of ease is bad and wrong.

>   And there are a couple checks I expect every gate will want:
>
>     355# P3 scm hooks should prevent the creation of new tags or bran...

I have this in my Python gate wad (which is far from complete); it's
trivial to add to pbchk and such too.  It has the same issues as
#306 when hooked on the client side.
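
As a rough sketch of the kind of check bug 355 asks for (not the code
in the gate wad), the decision can be kept separate from Mercurial
itself.  The assumptions here: "branches" means named branches, a new
tag shows up as a change to .hgtags, and the caller has already pulled
the branch name and file list out of each incoming changeset.
find_violations and its arguments are hypothetical names.

```python
def find_violations(changesets, allowed_branches=("default",)):
    """Return a list of human-readable complaints, empty if all is well.

    `changesets` is an iterable of (rev_id, branch_name, changed_files)
    tuples.  How they are extracted from the repository is deliberately
    left out of this sketch.
    """
    problems = []
    for rev_id, branch, files in changesets:
        # Assumption: only branches listed in allowed_branches may be pushed.
        if branch not in allowed_branches:
            problems.append("%s: on disallowed branch '%s'" % (rev_id, branch))
        # Assumption: any change to .hgtags means a tag was added or moved.
        if ".hgtags" in files:
            problems.append("%s: touches .hgtags" % rev_id)
    return problems
```

A gate-side hook or a pbchk-style client check would call this and
refuse the putback when the returned list is non-empty.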

>     351# P2 gate automation/gk tools need to understand mercurial.

That, and as I understood it, checking of the SFW gate is something Ken
is doing via parsing mail, and all that fun.

>   355 probably ought to be P2.

I guess I'll do that (for pbchk/nits) on the client then.  See above
regarding their status as gate enforcement. 

>   I couldn't find a bugzilla login for Ken, so 351 and 352 are currently
>   marked unowned.  But I believe they're both his.

#352 is something stevel and then myself were working on, at least to
demonstrate the concept for ON.  What Mike and Ken agreed was right
for SFW was not what Dave agrees is right for ON.

I've been keeping my set of changes for that alive partly for that
reason (and the fact it contains a couple of other fixes).

>   There's also a bug in the existing gate checks that is relevant to SFW:
>
>     6598784# P3 pushing to locked repositories should result in sane error[...]
>
>   If we don't fix it for SFW, we'll want to document it.

I offered to fix that if someone would give me the info I needed; I
got code but not information.  As such, someone on the tonic team must
fix it.  My comment regarding documenting trivial bugs rather than
fixing them applies here too.

> * Incremental Merge
>
>   The primary Mercurial merge model is that you merge all conflicts in
>   one session.  This will be a problem for large merges, particularly
>   ones where it's not feasible for a project gatekeeper to do all the
>   merges.
>
>     269# P2 support for interrupting a merge session
>
>   We haven't decided how to handle this issue.  There's an extension
>   called IMerge that was created when we pointed out this problem on the
>   Mercurial project list.  It seems to be usable, though not ideal.  (Of
>   course, we can always work with IMerge's author to get it more to our
>   liking.)  Another option would be to rely on hgr.

Making it work slightly more to our liking may be easy, and is something
we could (and should!) do.

>     309  P3 integrate dep's hgr (text mode 'resolve'-like merge tool)
>
>   One concern that we have about both approaches is how to make
>   accidental mismerges hard.
>
>   I think we need some sort of incremental merge capability for SFW.
>   I don't think protection against accidental mismerges is a stopper for
>   SFW.

I disagree, and as I've said before, I will not integrate hgr without
some form of protection against that.  The question is merely
regarding its veracity.

> * Webrev
>
>   These do not appear to be stoppers for SFW:
>
>     374  P3 setting CODEMGR_WS confuses webrev

I'll look at this soon, if I really have to.

>     372  P3 webrev does not display executable bit change

It never did for Teamware, which also propagated these.  This is P3 why? 

>     326  P3 webrev could shake off its wx heritage

This is the xVM team wanting the ability to use blank lines in putback
comments.

There are several other things regarding webrev formatting that are
similarly irritating when they come up against people using entirely
free-form comments.  My current feeling is that if wx would shaft them
for it, we can too, though as yet we don't.

>   There's also a bug in Bugster:
>
>     6446689  P3 webrev(1) needs Mercurial support
>
>   I don't know whether it's redundant.  I'll ping the RE (Darren Moffat)
>   about it.

It's redundant; assign it to yourself and mark it hg_trans.

>   This one is fixed in our current workspace:
>
>     6524249  P5 webrev doesn't appropriately quote arguments to print(1)

Several solaris/consolidation/os-net-* bugs are fixed in our
workspace, via various means.  I have a list, somewhere, but you'll
also see the 'fixesupstream' keyword used to indicate which of our
bugs do so, and a list of upstream IDs in the comments, suitable to
regenerate that list from.

> * SCCS Keywords
>
>   There is an item in the task list (which is reachable via the project
>   web page) for this.  That item should go away, as we are tracking the
>   keywords issues with bugs.  Most of the bugs are in Bugster.  
>
>   Issues that may be relevant to SFW are
>
>     350# P3 usr/src/tools contains user-visible SCCS keywords.

A fine thing for a volunteer to do!
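
A first pass at that evaluation could be a simple scan for unexpanded
SCCS ID keywords.  The sketch below is only that: find_sccs_keywords
is a made-up name, and the pattern will also flag things that merely
look like keywords (a strftime format such as "%H%M%S", say), so hits
still need a human to look at them.

```python
import re

# The single-letter SCCS ID keywords (%M%, %I%, %W%, and so on).
SCCS_KEYWORD = re.compile(r"%[ABCDEFGHILMPQRSTUWYZ]%")


def find_sccs_keywords(text):
    """Return (line_number, keyword) pairs for each keyword in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SCCS_KEYWORD.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Deciding which hits are user-visible (version strings, usage messages)
as opposed to ident comments is the part that cannot be automated this
way.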

>     264  P3 revisit nightly -O and friends for Mercurial

You were doing that, right, Mike?  (I know I can't.)

>     268# P2 class action scripts that rely on #ident may do evil things

I did bits and pieces here before slowly losing my mind.  I'll dig
them out and attach them, if I didn't at the time (they were bad, but
better, if that makes sense).

>   Nobody has done the evaluation of 350.  I'm guessing most of the
>   issues are related to displaying version information.  But we probably
>   want to at least evaluate the bug for SFW.
>
>   264 addresses multiple issues, but the only one that is relevant to
>   SFW is the version string for nightly itself that gets put into the
>   mail message.
>
>   We did some analysis for 268 for ON.  If SFW doesn't have any class
>   action scripts, this would obviously not be a stopper for them.

Last I looked, SFW had a handful, mostly duplicated from ON (where
"duplicated" is meant loosely, in that they were copied from older
versions and not updated as their original was).  Various scripts seem
to be in nearly every consolidation we have, and slightly different in
each...

>   None of the remaining bugs is relevant to SFW.  However, we have not
>   done any analysis on what the impact would be if ON moves and the bug
>   is not yet fixed.
>
>     6560813  P3 USB drivers should not use SCCS keywords in user-visible[...]
>     6560843  P3 asm sources should not rely on .file "%M%" for naming STT[...]
>     6560847  P3 rcm_daemon should not use SCCS keywords in rcm_mod_info strings
>     6576310  P3 postprint uses SCCS keywords for creation and version headers
>
>     6560794# P3 inet modules should not use SCCS keywords in user-visible[...]
>     6560807# P3 common drivers should not use SCCS keywords in user-visibl[...]
>     6560816# P3 x86 drivers should not use SCCS keywords in user-visible [...]
>     6560842# P3 sparc drivers should not use SCCS keywords in user-visible[...]
>     6560957# P3 sendmail should not use SCCS keywords in version info[...]
>     6560958# P3 Solaris:: perl modules should not use SCCS keywords in ver[...]
>     6576312# P3 OBP bootblocks use SCCS keywords in user-visible ways
>     6576316# P3 xntpd uses SCCS keywords to form its version strings.
>     4758439# P4 some files use "current date" sccs keywords
>
> * Old Style, Copyright, etc Issues
>
>   There are several bugs that were opened in the past against wx or the
>   various checking programs.  These will be fixed as part of the SCM
>   Migration work, so they're marked with hg_trans, P4.  They are not
>   needed by SFW.
>
>     4633617  P4 wx copyright is even slower
>     4678979  P4 wx nits doesn't check copyrights in the forthdebug files
>     4739416  P4 copyright checking needs work
>     6252813  P4 wx copyright matching: quoting is hard
>     6313877  P4 wx should check for mixed copyrights
>     6465682  P4 wx should check that PSARC case number matches description

I thought we had more than that.

>
> * Miscellaneous
>
>   This one is not needed by SFW:
>
>     4865419  P2 bfu on i386 needs /ws/on10-gate/public mounted
>
>   Strictly speaking, it's not required for ON, either.  But we had
>   several putbacks in S10 that assumed that BFU users could run scripts
>   in /ws/on10-gate.  So far we've avoided that in Nevada, but I
>   attribute that mostly to luck.

We have had several instances where it would be preferable
to be able to run the update_* scripts but have been unable to.  I
agree that the fact bfu was not adjusted to try and run them is
largely luck (though I'm sure in some cases people realized what would
happen, and chose to do the right thing).

>   This one is definitely needed by SFW:
>
>     6597716# P3 sfwnv should not change tool file permissions in the gate.

I have a fix for this, still.  We were going to do it when 0.9.5 went
in, in that same wad.  Though I just realized I neglected to send it
to Danek, I did send it to you (at some time in the past).  I'll merge
this one, too, and pass it along.

> Comments, anyone?

Mostly inline.  I guess the biggest comment I have is that if people
are pushing to get this done at some kind of breakneck pace, they
should volunteer to help so that is possible.

-- Rich
