Hi!
So how is the regression issue coming along?
Glad you asked!
Earlier in the week Jay found that the bitset patch caused a
significant performance regression. Since then we have been able to
find issues with it, but we have reached no conclusion on how to solve
it via a new design, container, etc.
So what are we doing?
Earlier this week I refactored the interface to the Table objects to
create, well, an interface!
It is not perfect, but it encapsulates a large chunk of the code. Monty
today put the original MY_BITMAP back in, but did so behind this
interface. Jay has run the numbers and declared that the regression
can no longer be found.
What will happen from here?
We are going to work on the interface some more. Basically, make it so
that we can change the back end behind the bitmap without changing a
lot of code.
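To make the idea concrete, here is a minimal sketch of that kind of
encapsulation. It is not the actual Drizzle interface: the class name
FieldSet and its methods are hypothetical, and std::vector<bool> is
only a stand-in for whichever back end ends up being chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper (not the real Drizzle Table interface).
// Callers refer to fields by index and never see the storage type.
class FieldSet {
public:
  explicit FieldSet(std::size_t field_count) : bits_(field_count, false) {}

  // Mark one field as set.
  void set(std::size_t field_index) {
    assert(field_index < bits_.size());
    bits_[field_index] = true;
  }

  // Mark one field as not set.
  void clear(std::size_t field_index) {
    assert(field_index < bits_.size());
    bits_[field_index] = false;
  }

  // Query one field.
  bool isSet(std::size_t field_index) const {
    assert(field_index < bits_.size());
    return bits_[field_index];
  }

  // Reset every field to not set, keeping the size.
  void clearAll() { bits_.assign(bits_.size(), false); }

  // Number of fields tracked.
  std::size_t size() const { return bits_.size(); }

  // Number of fields currently set.
  std::size_t count() const {
    return static_cast<std::size_t>(
        std::count(bits_.begin(), bits_.end(), true));
  }

private:
  // Back end detail (an assumption for this sketch). Swapping it for
  // another container only touches this class, not its callers.
  std::vector<bool> bits_;
};
```

Code that only calls set(), isSet() and friends does not need to
change when the private member is replaced, which is the property we
are after.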
Right now we have a few ideas on how to solve the problem (I favor a
bool in the Field object, Monty wants to look at vector<bool>, Mats
has a bitvector). We will test each of these and find a solution that
in the end gives us better code with no regression.
To solve this issue we had to look at a number of things: our methods,
the outcome of a tree rollback, and whether performance in this case
mattered. The problem wasn't simple, and all solutions had drawbacks.
The main thing we were not going to do was push code that caused a
regression that we would then "find a solution for in the future".
That was not acceptable. Rolling back the tree? We could have done
this, and I favored it if we had no other solution, but we determined
that we could patch up the tree without causing this sort of
disruption.
And?
Encapsulating the interface gives us room to find a new solution.
So what came out of all of this?
We are moving to a staging tree.
As of now we have an lp:drizzle/staging tree. This tree should not be
pulled from. We will send code here before it is sent to trunk. If
code fails the performance regression testing then we will pull it back.
So what does this mean? No more pushes to the main tree that bypass staging.
We have pointed the automatic regression testing at this tree. I am
going to be suggesting that we point Hudson and Buildbot at it as
well. If a tree can pass here then it will be moved to the main tree.
And what are the thoughts on regression for the future?
Jay asked me today "what do we mean by regression?". To me, we can
calculate regression pretty easily. We look at the standard deviation
of all previous runs and compare the current tree's result against it.
If we find that we are within norms then the new code is fine (and I
suspect we will refine this formula in the future). This was my
suggestion.
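As a rough sketch of that check, and not the formula the automated
testing actually uses: the names RunStats, summarize and isRegression
are made up for illustration, "higher is better" (for example
transactions per second) is an assumption, and so is the default
tolerance of one standard deviation below the mean.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and sample standard deviation of earlier benchmark results.
struct RunStats {
  double mean;
  double stddev;
};

// Summarize previous runs. Needs at least two runs, otherwise the
// sample standard deviation is undefined.
RunStats summarize(const std::vector<double>& previous_runs) {
  assert(previous_runs.size() >= 2);
  const double n = static_cast<double>(previous_runs.size());

  double sum = 0.0;
  for (double r : previous_runs) sum += r;
  const double mean = sum / n;

  double squared_diffs = 0.0;
  for (double r : previous_runs) squared_diffs += (r - mean) * (r - mean);
  const double stddev = std::sqrt(squared_diffs / (n - 1.0));

  return {mean, stddev};
}

// Assumption: higher numbers are better. The current run counts as a
// regression when it falls more than `tolerance` standard deviations
// below the mean of the previous runs.
bool isRegression(const std::vector<double>& previous_runs,
                  double current_run,
                  double tolerance = 1.0) {
  const RunStats stats = summarize(previous_runs);
  return current_run < stats.mean - tolerance * stats.stddev;
}
```

With previous runs of 100, 102, 98, 101 and 99, the mean is 100 and
the sample standard deviation is about 1.58, so a run of 99 is within
norms and a run of 95 would be flagged.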
But what if regression happens and there is an argument for letting it
happen?
Then we talk about it on the mailing list. Right now most of us have
seen the numbers showing that 5.4 is faster than Drizzle at 16
concurrent connections. We have been looking into this, but we may
find that some of the decisions that let us scale out to more
connections/processors contributed to this. That is ok. Our target is
not the 16-connection sites, it is the sites that need mass numbers
of connections/threads/processors. If we find a change that hurts us
at 1 to N connections, where N is a small number, that may be ok.
What will we do when we are confronted by this? We talk about it on IRC
and we will send the information to the mailing list. More eyeballs is
a good thing.
Thanks to everyone who has been working on this!
-Brian
_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to : drizzle-discuss@lists.launchpad.net