Bob, thanks for your reply.

I wasn't implying we should try to explain anything away. All of these are
valid concerns; I just wanted to get a better understanding of where the bit
flips from +0 to -1 and, subsequently, how to address that boundary. Ideally
we can just fix all of the things you mention, but I think it is important to
understand them in detail, which is why I was going into them.

Ultimately, I want to understand what we need to do to ship 1.2.0.

On Feb 26, 2012, at 21:22 , Bob Dionne wrote:

> Jan,
>
> I'm -1 based on all of my evaluation. I've spent a few hours on this release
> now yesterday and today. It doesn't really pass what I would call the "smoke
> test". Almost everything I've run into has an explanation:
>
> 1. crashes out of the box - that's R15B, you need to recompile SSL and Erlang
> (we'll note on release notes)

Have we spent any time on figuring out what the trouble here is?

> 2. etaps hang running make check. Known issue. Our etap code is out of date,
> recent versions of etap don't even run their own unit tests

I have seen the etap hang as well, and I wasn't diligent enough to report
it in JIRA. I have done so now (COUCHDB-1424).

> 3. Futon tests fail. Some are known bugs (attachment ranges in Chrome). Both
> Chrome and Safari also hang

Do you have more details on where Chrome and Safari hang? Can you try their
private browsing features, double/triple check that caches are empty? Can you
get to a situation where you get all tests succeeding across all browsers,
even if individual ones fail on one or two others?

> 4. standalone JS tests fail. Again most of these run when run by themselves

Which ones?

> 5. performance. I used real production data *because* Stefan on user reported
> performance degradation on his data set. Any numbers are meaningless for a
> single test. I also ran scripts that BobN and Jason Smith posted that show a
> difference between 1.1.x and 1.2.x

You are conflating an IRC discussion we've had into this thread.
The performance regression reported is a good reason to look into other
scenarios where we can show slowdowns. But we need to understand what's
happening. Just from looking at dev@, all I see is some handwaving about some
reports some people have done (not to discourage any work that has been done
on IRC and user@, but for the sake of a release vote thread, this related
information needs to be on this mailing list).

As I said on IRC, I'm happy to get my hands dirty to understand the
regression at hand. But we need to know where we'd draw a line and say this
isn't acceptable for a 1.2.0.

> 6. Reviewed patch pointed to by Jason that may be the cause but it's hard to
> say without knowing the code analysis that went into the changes. You can see
> obvious local optimizations that make good sense but those are often the ones
> that get you, without knowing the call counts.

That is a point that wasn't included in your previous mail. It's great that
there is progress, thanks for looking into this!

> Many of these issues can be explained away, but I think end users will be
> less forgiving. I think we already struggle with view performance. I'm
> interested to see how others evaluate this regression.
> I'll try this seatoncouch tool you mention later to see if I can construct
> some more definitive tests.

Again, I'm not trying to explain anything away. I want to get a shared
understanding of the issues you raised and where we stand on solving them
squared against the ongoing 1.2.0 release.

And again: Thanks for doing this thorough review and looking into the
performance issue. I hope with your help we can understand all these things a
lot better very soon :)

Cheers
Jan
--

>
> Best,
>
> Bob
> On Feb 26, 2012, at 2:29 PM, Jan Lehnardt wrote:
>
>>
>> On Feb 26, 2012, at 13:58 , Bob Dionne wrote:
>>
>>> -1
>>>
>>> R15B on OS X Lion
>>>
>>> I rebuilt OTP with an older SSL and that gets past all the crashes (thanks
>>> Filipe).
>>> I still see hangs when running make check, though any particular
>>> etap that hangs will run ok by itself. The Futon tests never run to
>>> completion in Chrome without hanging and the standalone JS tests also have
>>> fails.
>>
>> What part of this do you consider the -1? Can you try running the JS tests
>> in Firefox and/or Safari? Can you get all tests to pass at least once across
>> all browsers? The cli JS suite isn't supposed to work, so that isn't a
>> criterion. I've seen the hang in make check for R15B while individual tests
>> run as well, but I don't consider this blocking. While I understand and
>> support the notion that tests shouldn't fail, period, we gotta work with
>> what we have and master already has significant improvements. What would you
>> like to see changed to not -1 this release?
>>
>>> I tested the performance of view indexing, using a modest 200K doc db with
>>> a large complex view and there's a clear regression between 1.1.x and 1.2.x
>>> Others report similar results
>>
>> What is a large complex view? The complexity of the map/reduce functions is
>> rarely an indicator of performance, it's usually input doc size and
>> output/emit()/reduce data size. How big are the docs in your test and how
>> big is the returned data? I understand the changes for 1.2.x will improve
>> larger-data scenarios more significantly.
>>
>> Cheers
>> Jan
>> --
>>
>>>
>>> On Feb 23, 2012, at 5:25 PM, Bob Dionne wrote:
>>>
>>>> sorry Noah, I'm in debug mode today so I don't care to start mucking with
>>>> my stack, recompiling erlang, etc...
>>>>
>>>> I did try using that build repeatedly and it crashes all the time. I find
>>>> it very odd and I had seen those before as I said on my older macbook.
>>>>
>>>> I do see the hangs Jan describes in the etaps, they have been there right
>>>> along, so I'm confident this is just the SSL issue. Why it only happens on
>>>> the build is puzzling, any source build of any branch works just peachy.
>>>>
>>>> So I'd say I'm +1 based on my use of the 1.2.x branch but I'd like to hear
>>>> from Stefan, who reported the severe performance regression. BobN seems to
>>>> think we can ignore that, it's something flaky in that fellow's
>>>> environment. I tend to agree but I'm conservative
>>>>
>>>> On Feb 23, 2012, at 1:23 PM, Noah Slater wrote:
>>>>
>>>>> Can someone convince me this bus error stuff and segfaults is not a
>>>>> blocking issue.
>>>>>
>>>>> Bob tells me that he's followed the steps above and he's still
>>>>> experiencing the issues.
>>>>>
>>>>> Bob, you did follow the steps to install your own SSL right?
>>>>>
>>>>> On Thu, Feb 23, 2012 at 5:09 PM, Jan Lehnardt <j...@apache.org> wrote:
>>>>>
>>>>>>
>>>>>> On Feb 23, 2012, at 00:28 , Noah Slater wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I would like to call a vote for the Apache CouchDB 1.2.0 release,
>>>>>>> second round.
>>>>>>>
>>>>>>> We encourage the whole community to download and test these
>>>>>>> release artifacts so that any critical issues can be resolved before the
>>>>>>> release is made. Everyone is free to vote on this release, so get stuck
>>>>>>> in!
>>>>>>>
>>>>>>> We are voting on the following release artifacts:
>>>>>>>
>>>>>>> http://people.apache.org/~nslater/dist/1.2.0/
>>>>>>>
>>>>>>>
>>>>>>> These artifacts have been built from the following tree-ish in Git:
>>>>>>>
>>>>>>> 4cd60f3d1683a3445c3248f48ae064fb573db2a1
>>>>>>>
>>>>>>>
>>>>>>> Please follow the test procedure before voting:
>>>>>>>
>>>>>>> http://wiki.apache.org/couchdb/Test_procedure
>>>>>>>
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Happy voting,
>>>>>>
>>>>>> Signature and hashes check out.
>>>>>>
>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.0, Erlang R14B04: make check
>>>>>> works fine, browser tests in Safari work fine.
>>>>>>
>>>>>> Mac OS X 10.7.3, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check
>>>>>> works fine, browser tests in Safari work fine.
>>>>>>
>>>>>> FreeBSD 9.0, 64bit, SpiderMonkey 1.7.0, Erlang R14B04: make check works
>>>>>> fine, browser tests in Safari work fine.
>>>>>>
>>>>>> CentOS 6.2, 64bit, SpiderMonkey 1.8.5, Erlang R14B04: make check works
>>>>>> fine, browser tests in Firefox work fine.
>>>>>>
>>>>>> Ubuntu 11.4, 64bit, SpiderMonkey 1.8.5, Erlang R14B02: make check works
>>>>>> fine, browser tests in Firefox work fine.
>>>>>>
>>>>>> Ubuntu 10.4, 32bit, SpiderMonkey 1.8.0, Erlang R13B03: make check fails
>>>>>> in
>>>>>> - 076-file-compression.t: https://gist.github.com/1893373
>>>>>> - 220-compaction-daemon.t: https://gist.github.com/1893387
>>>>>> This one runs in a VM and is 32bit, so I don't know if there's anything in
>>>>>> the tests that rely on 64bittyness or the R14B03. Filipe, I think you
>>>>>> worked on both features, do you have an idea?
>>>>>>
>>>>>> I tried running it all through Erlang R15B on Mac OS X 10.7.3, but a good
>>>>>> way into `make check` the tests would just stop and hang. The last time,
>>>>>> repeatedly in 160-vhosts.t, but when run alone, that test finished in
>>>>>> under five seconds. I'm not sure what the issue is here.
>>>>>>
>>>>>> Despite the things above, I'm happy to give this a +1 if we put a warning
>>>>>> about R15B on the download page.
>>>>>>
>>>>>> Great work all!
>>>>>>
>>>>>> Cheers
>>>>>> Jan
>>>>>> --