Re: [sqlite] Reporting two more interesting queries with performance regression
Hello, We are also trying to automatically narrow down the root cause of the performance regression by using traditional statistical debugging technique. (e.g., http://people.cs.uchicago.edu/~shanlu/preprint/oopsla161-song.pdf) From the process, statistical debugger returns file/function name which is closely related with the problem. To support your analysis, we would like to share the result that we acquired from the debugger. We do analysis on queries from our second/third report: http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083868.html http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083970.html 1) 6.sql (second report) - bisect commit: 57eb2abd5b - stat-dbg result: + file: select.c + function: sqlite3Select ==> especially places near /* Do the constant propagation optimization */ causes slow down 2) 19.sql (second report) - bisect commit: 7d9072b027 - stat-dbg result: + file: expr.c + function: codeApplyAffinity ==> removal of this function introduces slow down 3) 10002.sql (third report) - bisect commit: 9fb646f29c - stat-dbg result: + file: expr.c + function: sqlite3VdbeJumpHere, sqlite3VdbeAddOp0 4) 10052.sql (third report) - bisect commit: 7153552bac - stat-dbg result: + file: expr.c + function: sqlite3VdbeJumpHere, sqlite3VdbeAddOp0 Hope this help for your further analysis. Best regards, Jinho Jung On Sat, Apr 6, 2019 at 9:47 AM Jinho Jung wrote: > > Hello, > > We are submitting the third report with two more interesting cases with > bisecting result using "SQL-Perf-Fuzzer". > > This is the timeline of our activities: > > [Apr 1, 2019] Submitted 1st report (3 queries, 1 bisected commits) > ( > http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083863.html > ) > ==> confirmed the problem with correct bisect result (Apr/1) > > [Apr 2, 2019] Submitted 2nd report: 2 queries, 2 unique bisected commits > ( > http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083868.html > ) > ==> not yet confirmed > > [Apr 5, 2019] Submitted 3rd report (2 queries, 2 unique bisected commits) > ==> this report > > > Here are the steps for reproducing our observations. All steps are same > except for the link to download new test-cases: > > [Our test environment] > * Ubuntu 16.04 machine "Linux sludge 4.4.0-116-generic #140-Ubuntu SMP Mon > Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux" > * Database: TPC-C benchmark > > [Setup Test Environment] > > 1. build SQLite 3.27.2 (verion of Feb 2019) > $ wget https://www.sqlite.org/2019/sqlite-src-3270200.zip > $ unzip sqlite-src-3270200.zip > $ mv sqlite-src-3270200 sqlite327 > $ cd sqlite327 > $ ./configure > $ make > $ cd .. > > 2. build SQLite 3.23.0 (verion of Apr 2018) > $ wget https://www.sqlite.org/2018/sqlite-src-323.zip > $ unzip sqlite-src-323.zip > $ mv sqlite-src-323 sqlite323 > $ cd sqlite323 > $ ./configure > $ make > $ cd .. > > 3. download tpc-c for sqlite3 (scale-factor of 1) > $ mkdir testcase > $ cd testcase > > $ wget https://gts3.org/~/jjung/sqlite/tpcc_sqlite.tar.gz > $ tar xzvf tpcc_sqlite.tar.gz > > ; download regression queries > $ wget https://gts3.org/~/jjung/sqlite/report3.tar.gz > $ tar xzvf report1.tar.gz > $ cd .. > > 4. launch two SQLites > - start > $ sqlite327/sqlite3 testcase/test.db > $ sqlite323/sqlite3 testcase/test.db > > - for each DB, set up timer > sqlite> .timer on > > - copy and paste extracted queries > > > Here’s the time taken to execute four SQL queries on old (v3.23) and newer > version (v3.27.2) of SQLite (in milliseconds). We also try bisecting to > know which commit activate the regression. > > +--++ > | Query| Time | > +--++ > | 10002.sql (v3.23)|789 | > | 10002.sql (v3.27.2) | 1270 | > +--++ > | 10052.sql (v3.23)| 3094 | > | 10052.sql (v3.27.2) | 4478 | > +--++ > > 1) 10002.sql shows 60% performance regression > - bisect fossil commit: > === 2018-12-31 === > [9fb646f29c] *MERGE* Merge enhancements and bug fixes from trunk. (user: > drh tags: reuse- >subqueries) > > 2) 10052.sql shows 40% performance regression > - bisect fossil commit: > === 2018-12-24 === > [7153552bac] Improvements to EXPLAIN QUERY PLAN formatting. The > MULTI-INDEX OR now >shows a separate "INDEX" subtree for each index. SCALAR SUBQUERY > en
[sqlite] Reporting two more interesting queries with performance regression
Hello, We are submitting the third report with two more interesting cases with bisecting result using "SQL-Perf-Fuzzer". This is the timeline of our activities: [Apr 1, 2019] Submitted 1st report (3 queries, 1 bisected commits) ( http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083863.html ) ==> confirmed the problem with correct bisect result (Apr/1) [Apr 2, 2019] Submitted 2nd report: 2 queries, 2 unique bisected commits ( http://mailinglists.sqlite.org/cgi-bin/mailman/private/sqlite-users/2019-April/083868.html ) ==> not yet confirmed [Apr 5, 2019] Submitted 3rd report (2 queries, 2 unique bisected commits) ==> this report Here are the steps for reproducing our observations. All steps are same except for the link to download new test-cases: [Our test environment] * Ubuntu 16.04 machine "Linux sludge 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux" * Database: TPC-C benchmark [Setup Test Environment] 1. build SQLite 3.27.2 (verion of Feb 2019) $ wget https://www.sqlite.org/2019/sqlite-src-3270200.zip $ unzip sqlite-src-3270200.zip $ mv sqlite-src-3270200 sqlite327 $ cd sqlite327 $ ./configure $ make $ cd .. 2. build SQLite 3.23.0 (verion of Apr 2018) $ wget https://www.sqlite.org/2018/sqlite-src-323.zip $ unzip sqlite-src-323.zip $ mv sqlite-src-323 sqlite323 $ cd sqlite323 $ ./configure $ make $ cd .. 3. download tpc-c for sqlite3 (scale-factor of 1) $ mkdir testcase $ cd testcase $ wget https://gts3.org/~/jjung/sqlite/tpcc_sqlite.tar.gz $ tar xzvf tpcc_sqlite.tar.gz ; download regression queries $ wget https://gts3.org/~/jjung/sqlite/report3.tar.gz $ tar xzvf report1.tar.gz $ cd .. 4. launch two SQLites - start $ sqlite327/sqlite3 testcase/test.db $ sqlite323/sqlite3 testcase/test.db - for each DB, set up timer sqlite> .timer on - copy and paste extracted queries Here’s the time taken to execute four SQL queries on old (v3.23) and newer version (v3.27.2) of SQLite (in milliseconds). We also try bisecting to know which commit activate the regression. +--++ | Query| Time | +--++ | 10002.sql (v3.23)|789 | | 10002.sql (v3.27.2) | 1270 | +--++ | 10052.sql (v3.23)| 3094 | | 10052.sql (v3.27.2) | 4478 | +--++ 1) 10002.sql shows 60% performance regression - bisect fossil commit: === 2018-12-31 === [9fb646f29c] *MERGE* Merge enhancements and bug fixes from trunk. (user: drh tags: reuse- subqueries) 2) 10052.sql shows 40% performance regression - bisect fossil commit: === 2018-12-24 === [7153552bac] Improvements to EXPLAIN QUERY PLAN formatting. The MULTI-INDEX OR now shows a separate "INDEX" subtree for each index. SCALAR SUBQUERY entries provide a subquery number that is related back to the .selecttrace output. (user: drh tags: reuse- subqueries) Thanks for your checking and support. Best regards, Jinho Jung ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Reporting two queries with performance regression
Hello, Thanks for the previous advice. We are reporting two interesting cases with different bisecting result using "sql-perf-fuzz". Here are the steps for reproducing our observations. All steps are same except for the link for downloading new test-cases: [Our test environment] * Ubuntu 16.04 machine "Linux sludge 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux" * Database: TPC-C benchmark [Setup Test Environment] 1. build SQLite 3.27.2 (verion of Feb 2019) $ wget https://www.sqlite.org/2019/sqlite-src-3270200.zip $ unzip sqlite-src-3270200.zip $ mv sqlite-src-3270200 sqlite327 $ cd sqlite327 $ ./configure $ make $ cd .. 2. build SQLite 3.23.0 (verion of Apr 2018) $ wget https://www.sqlite.org/2018/sqlite-src-323.zip $ unzip sqlite-src-323.zip $ mv sqlite-src-323 sqlite323 $ cd sqlite323 $ ./configure $ make $ cd .. 3. download tpc-c for sqlite3 (scale-factor of 1) $ mkdir testcase $ cd testcase $ wget https://gts3.org/~/jjung/sqlite/tpcc_sqlite.tar.gz $ tar xzvf tpcc_sqlite.tar.gz ; download regression queries $ wget https://gts3.org/~/jjung/sqlite/report2.tar.gz $ tar xzvf report1.tar.gz $ cd .. 4. launch two SQLites - start $ sqlite327/sqlite3 testcase/test.db $ sqlite323/sqlite3 testcase/test.db - for each DB, set up timer sqlite> .timer on - copy and paste extracted queries Here’s the time taken to execute four SQL queries on old (v3.23) and newer version (v3.27.2) of SQLite (in milliseconds). We also try bisecting to know which commit activate the regression. +--++ | Query| Time | +--++ | 6.sql (v3.23)| 3 | | 6.sql (v3.27.2) |156 | +--++ | 19.sql (v3.23) |720 | | 19.sql (v3.27.2) | 1747 | +--++ 1) 6.sql shows x52 slow query execution. - bisect fossil commit: === 2018-07-26 === [57eb2abd5b] Generalize the constant propagation optimization so that it applies on every WHERE close, not just those that contain a subquery. This then demonstrates that the current implementation is inadequate since it does not take into account collating sequences. (user: drh tags: propagate-const-opt) 2) 19.sql shows x2.5 slow query execution - bisect fossil commit: === 2018-08-04 === [7d9072b027] Further logic simplifications that flow out of the omission of the column cache. (user: drh tags: omit-column-cache) Thanks for your support. Best regards, Jinho Jung ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Need advice: Some queries show performance regression
Thanks for the quick response. 1) It seems our bisecting is wrong. We manually moved commit by commit by using "fossil checkout" command. Next time, we will use "fossil bisect" command. 2) Sorry for the misleading. Our first report contains three queries. We will submit another cases with correct bisect later. Jinho Jung On Mon, Apr 1, 2019 at 1:32 PM Richard Hipp wrote: > On 4/1/19, Jinho Jung wrote: > > Hello, > > > > We are developing a tool called sqlfuzz for automatically finding > > performance regressions in SQLite. sqlfuzz performs mutational fuzzing to > > generate SQL queries that take more time to execute on the latest version > > of SQLite compared to prior versions. We hope that these queries would > help > > further increase the utility of the regression test suite. > > Thanks for the report. > > Since there are already a bazillion fuzzers for SQLite, may I suggest > that you choose a more a more specific and descriptive name for your > fuzzer? Perhaps "sql-perf-fuzz" or something similar - so that we > know that your fuzzer is targeting performance regressions? > > > > > We are sharing four SQL queries that exhibit regressions in this report. > > Here’s an illustrative query: > > I only got 3 SQL queries. What am I missing? > > Also, I got bisect results for all three problem that are different > from the results you report. When I run bisect, I get the same result > for all three test cases: > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__sqlite.org_src_timeline-3Fbid-3Dy736b53f57fnbd49a8271dycb1511065dy6c6fb1c6eay30f08d5888y507c43537fy2c8769c69fn4371a0c46en4d0a949fd9n79c073878dy93386a7c97nd840e9bb02=DwIFaQ=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI=Eb5d36HrPu-wBrtI5PLOoA=bTwdIvNq94lBWxZ0Jw0Hony2eN54rEUuJrZ2CNiSxYg=W8KSOERSMGlpnn6BItMu8zuy4oCZlGBlFUDrZGVwBzw= > > So it was apparently a bug-fix that caused the performance decrease. > I have not looked into the details yet. Perhaps there is an > alternative fix for the bug that does not cause unnecessary > performance loss. > > -- > D. Richard Hipp > d...@sqlite.org > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Need advice: Some queries show performance regression
Hello, We are developing a tool called sqlfuzz for automatically finding performance regressions in SQLite. sqlfuzz performs mutational fuzzing to generate SQL queries that take more time to execute on the latest version of SQLite compared to prior versions. We hope that these queries would help further increase the utility of the regression test suite. We are sharing four SQL queries that exhibit regressions in this report. Here’s an illustrative query: EXAMPLE: select ref_0.NO_O_ID from main.NEW_ORDER as ref_0 where ref_0.NO_W_ID is not NULL and (EXISTS ( select 1 from ( select 1 from main.STOCK as ref_2 left join main.CUSTOMER as ref_3 on (ref_2.S_YTD = ref_3.C_ID) where ref_3.C_ZIP is not NULL) as subq_0 where (ref_0.NO_W_ID is not NULL) ) ); Time taken on SQLite v3.23.0: 73 (milliseconds) Time taken on SQLite v3.27.2: 4955 (milliseconds) Here are the steps for reproducing our observations: [Our test environment] * Ubuntu 16.04 machine "Linux sludge 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux" * Database: TPC-C benchmark [Setup Test Environment] 1. build SQLite 3.27.2 (verion of Feb 2019) $ wget https://www.sqlite.org/2019/sqlite-src-3270200.zip $ unzip sqlite-src-3270200.zip $ mv sqlite-src-3270200 sqlite327 $ cd sqlite327 $ ./configure $ make $ cd .. 2. build SQLite 3.23.0 (verion of Apr 2018) $ wget https://www.sqlite.org/2018/sqlite-src-323.zip $ unzip sqlite-src-323.zip $ mv sqlite-src-323 sqlite323 $ cd sqlite323 $ ./configure $ make $ cd .. 3. download tpc-c for sqlite3 (scale-factor of 1) $ mkdir testcase $ cd testcase $ wget https://gts3.org/~/jjung/sqlite/tpcc_sqlite.tar.gz $ tar xzvf tpcc_sqlite.tar.gz ; download regression queries $ wget https://gts3.org/~/jjung/sqlite/report1.tar.gz $ tar xzvf report1.tar.gz $ cd .. 4. launch two SQLites - start $ sqlite327/sqlite3 testcase/test.db $ sqlite323/sqlite3 testcase/test.db - for each DB, set up timer sqlite> .timer on - copy and paste extracted queries Here’s the time taken to execute four SQL queries on old (v3.23) and newer version (v3.27.2) of SQLite (in milliseconds). We also try bisecting to know which commit activate the regression. +--++ | Query| Time | +--++ | 1.sql (v3.23)|148 | | 1.sql (v3.27.2) |314 | +--++ | 1-1.sql (v3.23) | 1 | | 1-1.sql (v3.27.2)| > 5min | +--++ | 338.sql (v3.23) | 73 | | 338.sql (v3.27.2)| 4955 | +--++ 1) 1.sql shows x2.12 slow query execution. - bisect fossil commit: 4978ee8b54 2) 1-1.sql is slight modification from 1.sql. However, we observed much slower execution speed. - bisect fossil commit: X (we couldn't bisect due to the long execution time) 3) 338.sql shows x67 show query execution. - bisect fossil commit: 4978ee8b54 We would greatly appreciate feedback from the community regarding these queries and are looking forward to improving the tool based on the community’s feedback. Thanks. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users