Yes, the intention is to have something maintainable that gets run as part of the regular testing/merging/etc. process -- which means eventually rewriting in python. I wrote this in Java because we had a java lib for automatically creating bugs in the issue tracker, and I wanted to get it up and running quickly. We should have a python API for doing the same thing.
I can easily change the script so it marks DEFERs as P3s. However, currently, the script doesn't alter any entries in the file that already have bugs -- meaning something like "BUG1234 DEFER : ..." or "WONTFIX DEFER : ..." won't change. Is this the correct behavior? I'm planning on running the script later today with the most recent version of test_expectations.txt -- there will be 200+ new bugs. Let me know if I should hold off on doing so. Ojan, I'll send you the review for test_expecations.txt. Regards, Glenn On Fri, Apr 10, 2009 at 4:36 PM, Ojan Vafai <[email protected]> wrote: > I hate to ask this, but any chance we could rewrite it in python? It would > be a lot less code in python (one script file) and would match the rest of > our codebase (which currently does all scripting in python). I'm mainly > worried about usability and maintainability here. > That all said, I think you should go ahead and run this script as a one off > so we can get the test lists having bug IDs ASAP as that's kind of urgent to > us being able to track the current state of layout tests. Then we can worry > about checking in a python version later if we decide it's necessary. > > Pam, Darin, Jon, Mark: You might want to confirm that my request below > makes sense with respect to future handling/triaging of layout tests bugs. > > Also for the initial one-off version, can you make any tests that are > currently marked as DEFER be P3s? Then when you spit out the results to the > test_expectations file, I don't think there's any need to include the DEFER > anymore as we'll rely entirely on bug priorities to decide which tests need > fixing for the next release (i.e. all P1 tests and maybe some P2 tests). > That way there will be less initial triaging to do in order to make sure we > keep the tree shippable. Eventually we'll have to go through all the P3s in > the "Untriaged" state and up some of them to P2s, but there is no pressing > need to do so right away. > > Ojan > > On Fri, Apr 10, 2009 at 3:42 PM, Glenn Wilson <[email protected]>wrote: > >> (Ack...resending) >> Ok, I have the CL ready, if anyone with Java readability would be willing >> to do a review, please let me know. >> more inline... >> >> On Fri, Apr 10, 2009 at 2:07 PM, Pam Greene <[email protected]> wrote: >> >>> >>> >>> On Thu, Apr 9, 2009 at 6:49 PM, Ojan Vafai <[email protected]> wrote: >>> >>>> At a quick glance, this looks great. I didn't look over every bug, but >>>> the ones I did look at look good. >>> >>> >>> Yep. You'll want some sort of default description for the ones that have >>> none. >>> >> >> I ran the script on a small test file...an example bug is here: >> http://code.google.com/p/chromium/issues/detail?id=9991 >> >> There's at least a small bit of indication of where it came from. I also >> added the line numbers from test_expectations.txt that generated the bug. >> Additionally, I also wrote the script to then add the created bug numbers >> to the file. This ends up re-arranging some of the flags (DEBUG DEFER ... >> etc. --> DEFER DEBUG ...), but all data is retained. >> >> >>> 200+ bugs is certainly too many, but that's no reason not to file them. >>> (Sorry, couldn't resist. Seriously, yes, definitely file them. Better in the >>> issue tracker than getting lost.) >>> >> >> There are probably some improvements to be had to better group some of the >> bugs, but it's probably not an order of magnitude improvement. I just >> didn't want to create tons of bugs that were not useful. >> >> >>> >>> >>>> It would be great to check in a version of this script that we could run >>>> when a number of tests fail (e.g. when doing a bad webkit merge). That way, >>>> we can add them all to the local test_expectations.txt file and have it >>>> spit >>>> out the new results. >>>> >>> >>> Fixing the file we have and moving forward are slightly different use >>> cases, but yes. In the long run, we shouldn't need any script to fix an >>> existing bug-less expectations file, only the part that adds bugs for newly >>> added tests. I'm not sure whether that part should be controlled by adding >>> the tests to the file and having the script file bugs, or making it a fully >>> interactive "app": give it a list of files and a description, and it both >>> changes the file and creates a bug. >>> >> >> There may be a short-term solution and a long-term solution. The short >> term solution seems to be for someone (assumedly the merger) to add the >> failing tests to the file, run the script, then commit the file. In the >> long term, this could be one step, or perhaps be driven right from a "roll >> DEPS to new version of WebKit" script...right? >> >> >>> >>>> Really, it would be great if the script filed bugs and then just >>>> modified test_expectations.txt directly (without committing it). Also, the >>>> script should remove any comments it moves into bug descriptions. We should >>>> get to a point where all the comments about layout tests are in the bug >>>> descriptions themselves instead of in this file. >>>> >>> >>> I disagree with this last part. I'd prefer a brief description to remain >>> in the file, with any details in the bug. Certainly that's needed for >>> WONTFIX bugs, where we may not have a bug since there's no work to be >>> tracked, but it's helpful for others too. I've found it frustrating and >>> time-consuming to track down when I see big blocks of failures with no >>> explanations at all. Think of it like the svn checkin comment: enough to >>> have an idea what's going on right there where you need it, with more detail >>> in the bug for when you're really digging. >>> >> >> Yep, I modified the script so that it will change test_expectations.txt to >> have the bug number (you'd just need to commit it afterwards). It's pretty >> easy to for the script to clip out the comments, but I'm reluctant to, >> because the script may be over/under zealous in what comments it associates >> with a layout test. Maybe I can add that behavior as another command-line >> flag. >> >> >>> >>> - Pam >>> >>> >>>> I think it would be good to get the script checked in first and then run >>>> it on the existing test_expectations.txt file. >>>> >>>> Ojan >>>> >>>> >>>> On Thu, Apr 9, 2009 at 6:42 PM, Glenn Wilson <[email protected]> >>>> wrote: >>>> >>>>> Hi Pam & Ojan, >>>>> I wrote a script that would extract all of the layout tests from >>>>> test_expectations.txt that we haven't marked as WONTFIX and don't have a >>>>> bug >>>>> number. I also tried a simple heuristic to get the context of the layout >>>>> test via nearby comments....it's not perfect, and I'll have to change some >>>>> of them by hand, but many of the merge comments are getting picked up. >>>>> >>>>> I've also hooked up our library for creating demetrius bugs, so getting >>>>> bugs made for these should be a matter of running the script with >>>>> different >>>>> arguments (I hope). >>>>> >>>>> What are your thoughts? Is these as descriptive/accurate as we need? >>>>> Is 200+ bugs too many? >>>>> >>>>> Thanks! >>>>> Glenn >>>>> >>>>> >>>>> >>>> >> > --~--~---------~--~----~------------~-------~--~----~ Chromium Developers mailing list: [email protected] View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev -~----------~----~----~----~------~----~------~--~---
