Hello everyone, Thank you Elena, I really appreciate that you took the time to make a thorough review of the code. Answers to your comments are inline. Conclusion at the end : )
1. Structure > - When I started coding, I had no idea how the project would fit in the whole system, so, after asking a bit, I wrote a script much like Sergei's example: Only with the simulations and statistics in mind and not considering the implementation phase. I tried to make it a bit more practical to use, but I did not have in mind using this code as part of the implementation. Only the lessons learned from it. - If you think it's a better idea, we can define exactly what the interfaces of my project should be, so that I can write something with views on the implementation, rather than a pure research-oriented script. - Regarding the multi-mode logic: I decided to keep the logic from all modes because it's only a constant overhead (it doesn't affect the complexity of the original algorithm significantly), and at first I was considering all failures, not only those from tests that ran. I can easily add the code to run only the mode of interest. 2. Cleanup > > if you call several simulations from run_basic_simulations.py, only the > very first one will use the correct data and get real results. All > consequent ones will use the data modified by the previous ones, and the > results will be totally irrelevant. > Yup, this is a huge mistake. When I implemented the series of simulations, I completely forgot about the metrics, and thought there was no extra cleanup necessary; but I was wrong. This is the first thing I'll fix. 3. Modes > > These flaws should be easy to fix by doing proper cleanup before each > simulation. But there are also other fragments of code where, for example, > logic for 'standard' mode is supposed to be always run and is relied upon, > even if the desired mode is different. > > In fact, you build all queues every time. It would be an understandable > trade-off to save the time on simulations, but you re-run them separately > anyway, and only return the requested queue. > I decided to keep the logic of all modes because it does not affect the complexity of the algorithm... and yes, I relied on the standard mode because it always runs. This is not a mistake, but rather it is a 'cheat' that I was aware of, but indulged in. This is easy to fix - and I don't think it is too serious of a problem. If I were to code something with views towards the implementation, I would make sure that only strictly necessary code would run. 4. Failed tests vs executed tests > - if we say we can afford running 500 tests, we'd rather run 500 than 20, > even if some of them never failed before. This will also help us break the > bubble, especially if we randomize the "tail" (tests with the minimal > priority that we add to fill the queue). If some of them fail, they'll get > a proper metric and will migrate to the meaningful part of the queue. > I don't understand what you mean by bubble, but I see your point. You are right. I will work on making sure that we run RUNNING_SET tests, even when the priority queue contains less than that. > To populate the queue, You don't really need the information which tests > had ever been run; you only need to know which ones MTR *wants* to run, if > the running set is unlimited. If we assume that it passes the list to you, > and you iterate through it, you can use your metrics for tests that failed > or were edited before, and a default minimal metric for other tests. Then, > if the calculated tests are not enough to fill the queue, you'll randomly > choose from the rest. It won't completely solve the problem of tests that > never failed and were never edited, but at least it will make it less > critical. > This brings us back to the question of research-only vs preparing-for-implementation, yes, this can be done. 6. Full / non-full simulation mode > > I couldn't understand what the *non*-full simulation mode is for, can you > explain this? > This is from the beginning of the implementation, when I was considering all failures - even from tests that were not supposed to run. I can remove this already. > > 7. Matching logic (get_test_file_change_history) > > The logic where you are trying to match result file names to test names is > not quite correct. There are some highlights: > > There can also be subsuites. Consider the example: > ./mysql-test/suite/engines/iuds/r/delete_decimal.result > > The result file can live not only in /r dir, but also in /t dir, together > with the test file. It's not cool, but it happens, see for example > mysql-test/suite/mtr/t/ > > Here are some other possible patterns for engine/plugin suites: > ./storage/tokudb/mysql-test/suite/tokudb/r/rows-32m-1.result > ./storage/innodb_plugin/mysql-test/innodb.result > Also, in release builds they can be in mysql-test/plugin folder: > mysql-test/plugin/example/mtr/t > When I coded that logic, I used the website that you had suggested, and thought that I had covered all possibilities. I see that I didn't. I will work on that too. > > Be aware that the logic where you compare branch names doesn't currently > work as expected. Your list of "fail branches" consists of clean names > only, e.g. "10.0", while row[BRANCH] can be like > "lp:~maria-captains/maria/10.0". I'm not sure yet why it is sometimes > stored this way, but it is. Yes, I am aware of this; but when I looked at the data, there were very, very few cases like this, so I decided to just ignore them. All the other cases (>27 thousand) are presented as clean names. I believe this is negligible, but if you think otherwise, I can add the logic to parse these few cases. MariaDB [kokiri_apr23]> select branch, count(*) from changes where branch like 'lp%' group by 1 order by 2; +------------------------------------------------------+----------+ | branch | count(*) | +------------------------------------------------------+----------+ | lp:~maria-captains/maria/5.3-knielsen | 1 | | lp:~maria-captains/maria/5.5 | 1 | | lp:~maria-captains/maria/mariadb-5.1-mwl132 | 1 | | lp:~maria-captains/maria/mariadb-5.1-mwl163 | 1 | | lp:~maria-captains/maria/maria-5.1-table-elimination | 3 | | lp:~maria-captains/maria/10.0-FusionIO | 7 | | lp:~maria-captains/maria/5.1-converting | 9 | | lp:~maria-captains/maria/10.0-connect | 15 | | lp:~maria-captains/maria/10.0 | 45 | +------------------------------------------------------+----------+ Lets get back to it (both of them) after the logic with dependent > simulations is fixed, after that we'll review it and see why it doesn't > work if it still doesn't. Right now any effect that file edition might have > is rather coincidental, possibly the other one is also broken. > Yup! > - Humans are probably a lot better at predicting first failures than >> the >> current strategy. >> > > This is true, unfortunately it's a full time job which we can't afford to > waste a human resource on. > I was not suggesting to hire a person for this : ). What I meant is that a developer would be a lot better at choosing how to test their own changes, rather than a script that just accounts for recent failures. This is information that a developer can easily leverage on his own, along with his code changes, etc... I was trying to say that considering all this, I need to improve the algorithm. In conclusion, this is what I will work on now: - Add cleanup for every run - I just added (and pushed) this. Right now I am running tests with this logic. - Get new results, graph and report them ASAP. - Add randomized running of tests that have not ran or been modified - I will make this optional. - Fix the logic that converts test result file names to test names Questions to consider: - Should we define precisely what the implementation should look like, so that I can code the core, wrapper, etc? Or should I focus on finishing the experiments and the 'research' first? - I believe the logic to compare branch names should be good enough, but if you consider that it should be 100% fixed, I can do that. What do you think? Regards Pablo
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp

