Re: [webkit-dev] On returning mutable pointers from const methods
On Oct 26, 2012, at 7:21 PM, Rik Cabanier caban...@gmail.com wrote: On Fri, Oct 26, 2012 at 9:06 AM, Peter Kasting pkast...@google.com wrote: On Fri, Oct 26, 2012 at 8:27 AM, Rik Cabanier caban...@gmail.com wrote: It is valid for a const method to return you a new object ie a const factory object. In that case, const-ness would not be desired. Not really. The point of this thread is that such functions may not modify an object's state themselves, but they vend access that can be used by the caller to modify it. Consider for example: Child* Parent::getNewChild() const; Assuming the Parent doesn't have a list of its children (questionable), we can implement this without mutable pointers. But then a caller can do: Child* child = parent-getNewChild(); child-parent-mutate(); this would only be possible if that parent object is casting away a 'const' somewhere or accessing a global non-const object. Maybe there should be a rule that 'mutable' or 'const_cast' should not be allowed. I don't think that would be a good rule. The approach we try to take to 'const' is logical constness - a method can be const if it has no observable side effect. mutable is extremely useful for state that is not precomputed, but that is worth caching. Filling a cache to give an answer faster next time doesn't logically alter the object's state, even though on a literal level it alters internal state. You shouldn't need a non-const reference to an object just to call a getter that happens to cache its result. The fact that there is a cache at all is a hidden implementation detail. So in cases like this, it's useful to use 'mutable'. Regards, Maciej ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote: I’m sure Antti, Alexey, and others who have worked on the loader and other parts of WebKit are happy to write those tests or list the kind of things they want to test. Heck, I don’t mind writing those tests if someone could make a list. I totally sympathize with the sentiment to reduce the test flakiness but loader and cache code have historically been under-tested, and we’ve had a number of bugs detected only by running non-loader tests consecutively. On the contrary, we’ve had this DRT behavior for ages. Is there any reason we can’t wait for another couple of weeks or months until we add more loader cache tests before making the behavior change? I think the nature of loader and cache code is that it's very hard to make tests which always fail deterministically when regressions are introduced, as opposed to randomly. The reason for this is that bugs in these areas are often timing-dependent. I think it's likely this tendency to fail randomly will be the case whether or not the tests are trying to explicitly test the cache or are just incidentally doing so in the course of other things. Unfortunately, it's very tempting when a test is failing randomly to blame the test rather than to investigate whether there is an actual regression affecting it. And sometimes it really is the test's fault. But sometimes it is a genuine bug in the code. On the other hand, nondetermisitic test failures make it harder to use test infrastructure in general. These are difficult things to reconcile. The original philosophy of WebKit tests is to test end-to-end under relatively realistic conditions, but at the same time unpredictability makes it hard to stay at zero regressions. I think making different ports do testing under different conditions makes it more likely that some contributors will introduce regressions without noticing, leaving it for others to clean up. So it's regrettable if we go that way because we are unable to reach consensus. Creating some special opt-in --antti mode would be even worse, as it's almost certain that failures would creep into a mode that nobody runs. What I personally would most wish for is good tools to catch when a test starts failing nondeterministically, and to identify the revision where the failures began. The reason we hate random failures is that they are hard to track down and diagnose. But some types of bugs are unlikely to manifest in a purely deterministic way. It would be good if we had a reliable and useful way to catch those types of bugs. Regards, Maciej ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
We could clear the cache between tests but run each test twice in a row. Second run will then happen with deterministically pre-populated cache. That would both make things more predictable and improve our test coverage for cached cases. Unfortunately it would also slow down testing significantly, though less than 2x. antti ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
We can live in one of two worlds: 1) LayoutTests that concern themselves with specific network/loading concerns need to use unique URLs to refer to static data; or 2) DRT clears JS-visible state between tests. The pros/cons seem clear to me: Pro#1: loading/caching code is coincidentally tested by (unknown) tests that reuse URLs among themselves. Con#1: requires additional cognitive load for all webkit developers; the only way to write a test that won't be affected by future addition of unrelated tests is to use unique URLs Pro#2: principle of least-surprise is maintained; understanding DRT reading a test (and not every other test) is enough to understand its behavior Con#2: loading/caching code needs to be tested explicitly. IMO (Pro#2 + -Con#1) (Pro#1 + -Con#2). Are you saying you believe the inequality goes a different way, or am I missing some other feature of your thesis? Yes, this is a fair description. I'm going to assume you mean that yes, you believe the inequality goes the other way: (Pro#2 + -Con#1) (Pro#1 + -Con#2) This accidental testing is not something to be neglected I'm not neglecting it, I'm evaluating its benefit to be less than its cost. To make concrete the cost/benefit tradeoff, would you add a random sleep() into DRT execution to detect timing-related bugs? It seems like a crazy thing to do, to me, but it would certainly catch timing-related bugs quite effectively. If you don't think we should do that, can you describe how you're evaluating cost/benefit in each of the cases and why you arrive at different conclusions? (of course, adding such random sleeps under default-disabled flag control for bug investigation could make a lot of sense; but here I'm talking about what we do on the bots by default) It's not humanly possible to have tests for everything in advance. Of course. But we should at least make it humanly possible to understand our tests as written :) Making understanding our tests not humanly possible isn't the way to make up for the not-humanly-possible nature of testing everything in every way. It just means we push off not knowing how much coverage we really have, and derive a false sense of security from the fact that bugs have been found in the past. I completely agree with Maciej's idea that we should think about ways to make non-deterministic failures easier to work with, so that they would lead to discovering the root cause more directly, and without the costs currently associated with it. I have no problem with that, but I'm not sure how it relates to this thread unless one takes an XOR approach, in which case I guess I have low faith that the bigger problem Maciej highlights will be solved in a reasonable timeframe (weeks/months). Memory allocator state. Computer's real time clock. Hard drive's head position if you have a spinning hard drive, or SSD controller state if you have an SSD. HTTP cookies. Should I continue the list? These things are all outside of webkit. Yes, they are outside WebKit, but not outside WebKit control, if needed. Did you intend that to be an objection? I imagine Balazs was pointing out that you included items that are not JS-visible in an answer to my question about things that are JS-visible. But that was part of an earlier fork of this thread that went nowhere, so let's let it go. Cheers, -a ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] On returning mutable pointers from const methods
On Sun, Oct 28, 2012 at 6:12 AM, Maciej Stachowiak m...@apple.com wrote: I am not sure a blanket rule is correct. If the Foo* is logically related to the object with the foo() method and effectively would give access to mutable internal state, then what you say is definitely correct. But if the const object is a mere container that happens to hold pointers, there's no particular reason it should turn them into const pointers. For example, it is fine for const methods of HashSet or Vector to return non-const pointers if that happens to be the template parameter type. In such cases, constness of the container should only prevent you from mutating the container itself, not from mutating anything held in the container, or else const containers of non-const pointers (or non-const types in general) would be useless. IMO const containers that vend non-const pointers _are_ nearly useless. I consider logical constness to include not only this statement has no observable side effect but also this statement does not allow me to issue a subsequent statement with observable side effects. A const Vector that allows to to obtain a non-const element pointer, which you subsequently mutate, means that you can mutate the vector. (Consider for example an element destructor that removes the element from the vector.) If you allow const methods to return non-const pointers, it's almost always possible to construct some chain of events that leads to mutation of the const object or of state it can observe. It is far easier to simply make simple rules like const methods shall not vend non-const pointers than try to allow it but guarantee that all the code everyone writes is safe. PK ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote: I’m sure Antti, Alexey, and others who have worked on the loader and other parts of WebKit are happy to write those tests or list the kind of things they want to test. Heck, I don’t mind writing those tests if someone could make a list. I totally sympathize with the sentiment to reduce the test flakiness but loader and cache code have historically been under-tested, and we’ve had a number of bugs detected only by running non-loader tests consecutively. On the contrary, we’ve had this DRT behavior for ages. Is there any reason we can’t wait for another couple of weeks or months until we add more loader cache tests before making the behavior change? Please correct me if I'm misinformed, but it's been three months since this issue was first raised, and it doesn't sound like they've been writing those tests or are happy to do so, and despite people asking on this thread, they haven't been listing the kinds of tests they think they need. Have we actually made any progress here, or was the issue dropped until Ami raised it again? It seems like the latter to me ... again, please correct me if this is being actively worked on, because that would change the whole tenor of this debate. On Sun, Oct 28, 2012 at 6:32 AM, Maciej Stachowiak m...@apple.com wrote: I think the nature of loader and cache code is that it's very hard to make tests which always fail deterministically when regressions are introduced, as opposed to randomly. The reason for this is that bugs in these areas are often timing-dependent. I think it's likely this tendency to fail randomly will be the case whether or not the tests are trying to explicitly test the cache or are just incidentally doing so in the course of other things. I am not familiar with the loader and caching code in webkit, but I know enough about similar problem spaces to be puzzled by why it's impossible to write tests that can adequately test the code. Is the caching disk-based, and maybe running tests in parallel screwing with things? If so, then maybe the fact that we now run tests in parallel is why this is a problem now and hasn't been before? Or maybe the fact that a given process doesn't always see the same tests in the same order is the problem? Unfortunately, it's very tempting when a test is failing randomly to blame the test rather than to investigate whether there is an actual regression affecting it. And sometimes it really is the test's fault. But sometimes it is a genuine bug in the code. On the other hand, nondetermisitic test failures make it harder to use test infrastructure in general. These are difficult things to reconcile. The original philosophy of WebKit tests is to test end-to-end under relatively realistic conditions, but at the same time unpredictability makes it hard to stay at zero regressions. Exactly. Personally, the cost of unpredictability in the test infrastructure is so much higher than the value we're getting (implicitly) that this is a no-brainer to me. There are some tradeoffs (like running tests in parallel) that are worth it, but this isn't one of them. I am happy to explain further my thinking and standards if there's interest. Hopefully that partially answers Alexey's questions about where we should draw the line in trying to make our tests deterministic and hermetic: do everything you reasonably can. We're not picking on caching here. I think making different ports do testing under different conditions makes it more likely that some contributors will introduce regressions without noticing, leaving it for others to clean up. So it's regrettable if we go that way because we are unable to reach consensus. I agree that it is bad to have different ports behaving differently, and I would like to avoid that as well. I don't want any port suffering from flaky tests, but I also don't think it's reasonable to have one group force that on everyone else indefinitely, either. I am also fine with having some way to test systems more non-deterministically in a way to expose more bugs, but that needs to be clearly separated from the other testing we do; it is an unfair cost to impose on the rest of the system otherwise and should be tolerated only if we have no other choice. We have other choices. Creating some special opt-in --antti mode would be even worse, as it's almost certain that failures would creep into a mode that nobody runs. This comment (and Antti's suggestion, below) makes me think that you didn't understand my virtual test suite suggestion; that's not surprising, since Apple doesn't actual use this feature of NRWT yet. A virtual test suite is a way of saying (re-)run the tests under directory X with command-line flags Y and Z, and put the results in a new directory. For example, Chromium runs all of the tests in fast/canvas twice, once normally using the regular software code path, and once with a
Re: [webkit-dev] On returning mutable pointers from const methods
It is useful to say that getting a pointer to T from an object of type S does not change S's state. -F On Oct 28, 2012, at 2:09 PM, Peter Kasting pkast...@chromium.org wrote: On Sun, Oct 28, 2012 at 6:12 AM, Maciej Stachowiak m...@apple.com wrote: I am not sure a blanket rule is correct. If the Foo* is logically related to the object with the foo() method and effectively would give access to mutable internal state, then what you say is definitely correct. But if the const object is a mere container that happens to hold pointers, there's no particular reason it should turn them into const pointers. For example, it is fine for const methods of HashSet or Vector to return non-const pointers if that happens to be the template parameter type. In such cases, constness of the container should only prevent you from mutating the container itself, not from mutating anything held in the container, or else const containers of non-const pointers (or non-const types in general) would be useless. IMO const containers that vend non-const pointers _are_ nearly useless. I consider logical constness to include not only this statement has no observable side effect but also this statement does not allow me to issue a subsequent statement with observable side effects. A const Vector that allows to to obtain a non-const element pointer, which you subsequently mutate, means that you can mutate the vector. (Consider for example an element destructor that removes the element from the vector.) If you allow const methods to return non-const pointers, it's almost always possible to construct some chain of events that leads to mutation of the const object or of state it can observe. It is far easier to simply make simple rules like const methods shall not vend non-const pointers than try to allow it but guarantee that all the code everyone writes is safe. PK ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
On 10/28/2012 08:25 PM, Ami Fischman wrote: We can live in one of two worlds: 1) LayoutTests that concern themselves with specific network/loading concerns need to use unique URLs to refer to static data; or 2) DRT clears JS-visible state between tests. The pros/cons seem clear to me: Pro#1: loading/caching code is coincidentally tested by (unknown) tests that reuse URLs among themselves. Con#1: requires additional cognitive load for all webkit developers; the only way to write a test that won't be affected by future addition of unrelated tests is to use unique URLs Pro#2: principle of least-surprise is maintained; understanding DRT reading a test (and not every other test) is enough to understand its behavior Con#2: loading/caching code needs to be tested explicitly. IMO (Pro#2 + -Con#1) (Pro#1 + -Con#2). Are you saying you believe the inequality goes a different way, or am I missing some other feature of your thesis? Yes, this is a fair description. I'm going to assume you mean that yes, you believe the inequality goes the other way: (Pro#2 + -Con#1) (Pro#1 + -Con#2) This accidental testing is not something to be neglected I'm not neglecting it, I'm evaluating its benefit to be less than its cost. To make concrete the cost/benefit tradeoff, would you add a random sleep() into DRT execution to detect timing-related bugs? It seems like a crazy thing to do, to me, but it would certainly catch timing-related bugs quite effectively. If you don't think we should do that, can you describe how you're evaluating cost/benefit in each of the cases and why you arrive at different conclusions? (of course, adding such random sleeps under default-disabled flag control for bug investigation could make a lot of sense; but here I'm talking about what we do on the bots by default) It's not humanly possible to have tests for everything in advance. Of course. But we should at least make it humanly possible to understand our tests as written :) Making understanding our tests not humanly possible isn't the way to make up for the not-humanly-possible nature of testing everything in every way. It just means we push off not knowing how much coverage we really have, and derive a false sense of security from the fact that bugs have been found in the past. I completely agree with Maciej's idea that we should think about ways to make non-deterministic failures easier to work with, so that they would lead to discovering the root cause more directly, and without the costs currently associated with it. I have no problem with that, but I'm not sure how it relates to this thread unless one takes an XOR approach, in which case I guess I have low faith that the bigger problem Maciej highlights will be solved in a reasonable timeframe (weeks/months). Memory allocator state. Computer's real time clock. Hard drive's head position if you have a spinning hard drive, or SSD controller state if you have an SSD. HTTP cookies. Should I continue the list? These things are all outside of webkit. Yes, they are outside WebKit, but not outside WebKit control, if needed. Did you intend that to be an objection? I imagine Balazs was pointing out that you included items that are not JS-visible in an answer to my question about things that are JS-visible. But that was part of an earlier fork of this thread that went nowhere, so let's let it go. I was just meaning that it is not feasible to force every external dependency to reset it's state, neither we want it. We just trust in them. But the cache is in WebKit, and we can reset it's state. So either resetting the cache is a good or a bad idea, I think it has nothing to do with the fact that we cannot reset the OS and the hardware (and external libs of course). ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
[webkit-dev] Some stderr output missing when using run-webkit-tests
Hi webkit-dev, When I include fprintf(stderr, ...) statements in WebKit code that I expect to be executed when running a set of layout tests, the summary page of run-webkit-tests will sometimes only show a subset of these statements. However, when I add a CRASH() somewhere in the code, the missing stderr output will appear on the summary page. Has anyone else experienced this issue? Is there a way to force run-webkit-tests to display all stderr output without needing to force a crash at a particular point in the code? Terry ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
On Sun, Oct 28, 2012 at 2:47 PM, Ryosuke Niwa rn...@webkit.org wrote: On Sun, Oct 28, 2012 at 2:09 PM, Dirk Pranke dpra...@chromium.org wrote: On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote: I’m sure Antti, Alexey, and others who have worked on the loader and other parts of WebKit are happy to write those tests or list the kind of things they want to test. Heck, I don’t mind writing those tests if someone could make a list. I totally sympathize with the sentiment to reduce the test flakiness but loader and cache code have historically been under-tested, and we’ve had a number of bugs detected only by running non-loader tests consecutively. On the contrary, we’ve had this DRT behavior for ages. Is there any reason we can’t wait for another couple of weeks or months until we add more loader cache tests before making the behavior change? Please correct me if I'm misinformed, but it's been three months since this issue was first raised, and it doesn't sound like they've been writing those tests or are happy to do so, and despite people asking on this thread, they haven't been listing the kinds of tests they think they need. I don't think anyone else had suggested adding tests as an option or set a deadline until I suggested yesterday (or when I did in my original reply to the thread). In fact, since Ami posted his reply on October 26th 1:20AM (PST), many contributors from non-PST timezones haven't even had a chance to read his post during normal business hours. Given that I'd think it's totally unreasonable to land the patch as is without giving people reasonable amount of time (~one week) to respond to this thread. Both you and Eric U suggesting adding new tests for this in the original thread on 8/9; in fact, this whole issue got a fair amount of discussion then, and this round hasn't really added anything new. I'm happy to wait a little longer if others want to come up with some other suggestions; I apologize if my previous response sounded like I was throwing down a gauntlet or otherwise not open to ideas; that was definitely not my intent. Rather, I was attempting to say that unless someone else has other ideas, the right path forward seems fairly clear to me and that I intended to proceed down it. Does that seem more reasonable to you? -- Dirk ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] Some stderr output missing when using run-webkit-tests
On 10/29/2012 12:29 AM, Terry Anderson wrote: Hi webkit-dev, When I include fprintf(stderr, ...) statements in WebKit code that I expect to be executed when running a set of layout tests, the summary page of run-webkit-tests will sometimes only show a subset of these statements. However, when I add a CRASH() somewhere in the code, the missing stderr output will appear on the summary page. Has anyone else experienced this issue? Is there a way to force run-webkit-tests to display all stderr output without needing to force a crash at a particular point in the code? Terry ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev The summary page only shows the stderr output of the failed tests. I also annoyed me sometimes, probably this should be changed. kbalazs ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] DRT/WTR should clear the cache at the beginning of each test?
On Sun, Oct 28, 2012 at 4:37 PM, Dirk Pranke dpra...@chromium.org wrote: On Sun, Oct 28, 2012 at 2:47 PM, Ryosuke Niwa rn...@webkit.org wrote: On Sun, Oct 28, 2012 at 2:09 PM, Dirk Pranke dpra...@chromium.org wrote: On Oct 26, 2012, at 11:11 PM, Ryosuke Niwa rn...@webkit.org wrote: I’m sure Antti, Alexey, and others who have worked on the loader and other parts of WebKit are happy to write those tests or list the kind of things they want to test. Heck, I don’t mind writing those tests if someone could make a list. I totally sympathize with the sentiment to reduce the test flakiness but loader and cache code have historically been under-tested, and we’ve had a number of bugs detected only by running non-loader tests consecutively. On the contrary, we’ve had this DRT behavior for ages. Is there any reason we can’t wait for another couple of weeks or months until we add more loader cache tests before making the behavior change? Please correct me if I'm misinformed, but it's been three months since this issue was first raised, and it doesn't sound like they've been writing those tests or are happy to do so, and despite people asking on this thread, they haven't been listing the kinds of tests they think they need. I don't think anyone else had suggested adding tests as an option or set a deadline until I suggested yesterday (or when I did in my original reply to the thread). In fact, since Ami posted his reply on October 26th 1:20AM (PST), many contributors from non-PST timezones haven't even had a chance to read his post during normal business hours. Given that I'd think it's totally unreasonable to land the patch as is without giving people reasonable amount of time (~one week) to respond to this thread. Both you and Eric U suggesting adding new tests for this in the original thread on 8/9; in fact, this whole issue got a fair amount of discussion then, and this round hasn't really added anything new. Yeah, but I don't think it got much traction back then. Also, we didn't have any deadlines like weeks or months. I'm happy to wait a little longer if others want to come up with some other suggestions; I apologize if my previous response sounded like I was throwing down a gauntlet or otherwise not open to ideas; that was definitely not my intent. Rather, I was attempting to say that unless someone else has other ideas, the right path forward seems fairly clear to me and that I intended to proceed down it. Does that seem more reasonable to you? Yes. - Ryosuke ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo/webkit-dev