Re: [sqlite] Virtual table API performance
On 05.03.2014 11:02, RSmith wrote:

On 2014/03/05 10:41, Dominique Devienne wrote: On Tue, Mar 4, 2014 at 9:41 PM, Elefterios Stamatogiannakis wrote: One thing that IMHO long term might improve the situation would be if SQLite's own "native" tables would use the same Virtual Table API [...] Of course, the above is a "naive" abstract reflection which ignores the realities of VDBE, so it may sound rubbish to actual SQLite developers. Apologies for that. --DD

I don't think it is rubbish at all, but maybe idealistic. The biggest problem I can see, from the API maker's point of view, is that you can at any time alter, update, or change the way SQLite (or any other API) works internally, the basic check being that the input values produce the same (or maybe more-correct) results. Once you let the VT use the same API, any change is a potential change to how other people's programmed interfaces need to talk to - or get data from - the SQLite engine. This cannot simply change on a whim, so the levels of separation remain needed.

That said, I'm all for making a more efficient VT API, but it would probably need to be "new" functionality, since I cannot see how the existing interface could implement any of the mentioned enhancements without breaking existing behaviour. The OP's xNextRow suggestion seems a good idea, but it opens up a whole can of what-ifs which other posters have alluded to; still, something to that effect might be worthwhile if the efficiency bonus is significant.

The whole thread so far is based on the OP's observations in mixed C/PyPy and apsw/CPython environments (as has been said already, we are suffering from the noise in both cases). To be helpful to the SQLite team, before proposing any changes, let someone first show a well designed, pure C vtable implementation demonstrating the possible vtable interface inefficiency.

Let's remember that all xNextRow/xNextPage optimizations are applicable only to "select *" cases, not to the general "select f(x), y" case.

Kind regards,
Alek
Re: [sqlite] Virtual table API performance
On 2014/03/05 10:41, Dominique Devienne wrote: On Tue, Mar 4, 2014 at 9:41 PM, Elefterios Stamatogiannakis wrote: One thing that IMHO long term might improve the situation would be if SQLite's own "native" tables would use the same Virtual Table API [...] Of course, the above is a "naive" abstract reflection which ignores the realities of VDBE, so it may sound rubbish to actual SQLite developers. Apologies for that. --DD

I don't think it is rubbish at all, but maybe idealistic. The biggest problem I can see, from the API maker's point of view, is that you can at any time alter, update, or change the way SQLite (or any other API) works internally, the basic check being that the input values produce the same (or maybe more-correct) results. Once you let the VT use the same API, any change is a potential change to how other people's programmed interfaces need to talk to - or get data from - the SQLite engine. This cannot simply change on a whim, so the levels of separation remain needed.

That said, I'm all for making a more efficient VT API, but it would probably need to be "new" functionality, since I cannot see how the existing interface could implement any of the mentioned enhancements without breaking existing behaviour. The OP's xNextRow suggestion seems a good idea, but it opens up a whole can of what-ifs which other posters have alluded to; still, something to that effect might be worthwhile if the efficiency bonus is significant.

Cheers,
Ryan
Re: [sqlite] Virtual table API performance
On Tue, Mar 4, 2014 at 9:41 PM, Elefterios Stamatogiannakis wrote:
> [...] Nevertheless, having people do crazy/clever hacks like that to avoid the inefficiencies of an API gives a hint that something might be problematic.
>
> In a previous email of mine, i had said half seriously/half joking about the cost in Watts of SQLite's worldwide usage. For something that is so widely used, even some % of efficiency improvement really makes a difference. It is not an "SQLite destroys/saves civilization" kind of difference, but IMHO it would be measurable in G/M Watts.

The SQLite developers clearly care about performance. But they also care about the "lite"-ness aspects (and so does part of its community, judging by this list). They try to find the right balance between them, with ease of implementation/testing/maintenance as the last variable of this equation.

One thing that IMHO might improve the situation long term would be if SQLite's own "native" tables used the same Virtual Table API, because then any bottleneck would apply to all "tables", not just the VT ones, and would be more likely to be removed. And as recently discussed, the VT API would be forced to acquire ways to communicate with Explain Query Plan (for example) to notice Full Scans or Covering Indexes, not just for the "native" tables.

VTs came after the "native" tables, so the current situation is normal; but if all table accesses, "native" or otherwise, could go through the same API in the future (in SQLite3 or SQLite4), it would further decouple the "front-end" from the "back-end" and ensure no performance differences between "native" and "virtual" tables.

Of course, the above is a "naive" abstract reflection which ignores the realities of VDBE, so it may sound rubbish to actual SQLite developers. Apologies for that.

--DD
Re: [sqlite] Virtual table API performance
Also, i want to note the pretty clever "hack" from Hick Gunter that works by passing a "proxy" for the row (instead of the whole row's values) to SQLite. This proxy gets expanded using the xColumn API by a virtual table outside the whole query. In this way a query using multiple VTs needs to incur the xColumn performance penalty only once.

Nevertheless, having people do crazy/clever hacks like that to avoid the inefficiencies of an API gives a hint that something might be problematic.

In a previous email of mine, i had said half seriously/half joking about the cost in Watts of SQLite's worldwide usage. For something that is so widely used, even some % of efficiency improvement really makes a difference. It is not an "SQLite destroys/saves civilization" kind of difference, but IMHO it would be measurable in G/M Watts.

l.

On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?

In the link that you posted above, look for all the sqlite3_bind_x functions. In SQLite the bind API is used to pass parameters to prepared statements. The way the bind API works is that you have a statement parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API "way" the program calls SQLite to fill a row's values.

Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or a standalone sqlite add-on loaded by PyPy FFI)?

How do you expect this model (managing a per-cell count of scalar allocations during the query) to perform in comparison with passing encoded row pages (memory chunks) between sqlite and the script engine, especially when it is not PyPy or LuaJIT?

Regards,
Alek
Re: [sqlite] Virtual table API performance
On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?

In the link that you posted above, look for all the sqlite3_bind_x functions. In SQLite the bind API is used to pass parameters to prepared statements. The way the bind API works is that you have a statement parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API "way" the program calls SQLite to fill a row's values.

Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or a standalone sqlite add-on loaded by PyPy FFI)?

Yes, these things can be done. We have written our own APSW-compatible FFI for PyPy, in which we have tried techniques like the one you describe. The end result wasn't that much faster. Because no matter how many wrappers there are above the VT API, its efficiency characteristics are still there.

How do you expect this model (managing a per-cell count of scalar allocations during the query) to perform in comparison with passing encoded row pages (memory chunks) between sqlite and the script engine, especially when it is not PyPy or LuaJIT?

Essentially, in my first post about the VT API's efficiency, the comparison was between the bind API (similar to your "encoded pages" idea, but for a single row) and the current xColumn based VT API. In our tests with Python, the difference between the two APIs was 3x. I've seen another post here by somebody else who was using a native compiled VT implementation, saying that the difference was 2x.

Both of these observations say that there is some inefficiency in the current situation with the VT API. If this API is also used within SQLite's VM, maybe adding an xNextRow opcode would speed up SQLite's VM too?

Best,
l.

Regards,
Alek
Re: [sqlite] Virtual table API performance
On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?

In the link that you posted above, look for all the sqlite3_bind_x functions. In SQLite the bind API is used to pass parameters to prepared statements. The way the bind API works is that you have a statement parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API "way" the program calls SQLite to fill a row's values.

Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or a standalone sqlite add-on loaded by PyPy FFI)?

How do you expect this model (managing a per-cell count of scalar allocations during the query) to perform in comparison with passing encoded row pages (memory chunks) between sqlite and the script engine, especially when it is not PyPy or LuaJIT?

Regards,
Alek
Re: [sqlite] Virtual table API performance
On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?

In the link that you posted above, look for all the sqlite3_bind_x functions. In SQLite the bind API is used to pass parameters to prepared statements. The way the bind API works is that you have a statement parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API "way" the program calls SQLite to fill a row's values.

Regards.
l.

For a paging API (which IMHO is too complex ATM), the bind API could be extended with a row number parameter.
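To make the flow above concrete, here is a minimal sketch of what such an xNextRow cursor method could look like. The sqlite3_vtab_row type and the sqlite3_vtab_row_bind_* functions are hypothetical (nothing like them exists in the real SQLite API), as are the my_cursor/my_fetch module internals; the sketch only mirrors the bind-style row filling described above:

```c
/* HYPOTHETICAL sketch: sqlite3_vtab_row and sqlite3_vtab_row_bind_* do
 * not exist in SQLite; they mirror the sqlite3_bind_* family. */
#include <sqlite3.h>

typedef struct sqlite3_vtab_row sqlite3_vtab_row;  /* opaque row, owned by SQLite */
int sqlite3_vtab_row_bind_int(sqlite3_vtab_row*, int iCol, int v);
int sqlite3_vtab_row_bind_double(sqlite3_vtab_row*, int iCol, double v);
int sqlite3_vtab_row_bind_text(sqlite3_vtab_row*, int iCol,
                               const char *z, int n, void (*xDel)(void*));

typedef struct my_cursor {          /* assumed module-private cursor */
  sqlite3_vtab_cursor base;
  int f1; double f2; const char *f3;
} my_cursor;

static int my_fetch(my_cursor *c);  /* assumed: advances the source, returns 0 at EOF */

/* One call pushes the whole row, instead of one xColumn callback per column. */
static int myNextRow(sqlite3_vtab_cursor *pCur, sqlite3_vtab_row *pRow, int *pEof){
  my_cursor *c = (my_cursor*)pCur;
  if( !my_fetch(c) ){ *pEof = 1; return SQLITE_OK; }
  *pEof = 0;
  sqlite3_vtab_row_bind_int(pRow, 1, c->f1);     /* "column 1 = int"   */
  sqlite3_vtab_row_bind_double(pRow, 2, c->f2);  /* "column 2 = float" */
  sqlite3_vtab_row_bind_text(pRow, 3, c->f3, -1, 0);
  return SQLITE_OK;
}
```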
Re: [sqlite] Virtual table API performance
On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

For a paging API (which IMHO is too complex ATM), the bind API could be extended with a row number parameter.

It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?
Re: [sqlite] Virtual table API performance
Thank you for the explanation. Answers inline.

On 04/03/14 16:16, Hick Gunter wrote:

Your VT1 table already has an xColumn implementation, possibly doing something like:

switch( p_column )
{
  case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
  ...
  case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
}

This needs to have two cases added:

  case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec );
  case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );

where

  static int func( p_rec, p_ctx, p_column );

calls

  xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its xColumn implementation it calls:

  v_rec = (void *)sqlite3_column_int64( v_stmt, 0 );  // this can be stored and cleared in the xNext function
  v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // this can be stored
  v_func( v_rec, p_ctx, p_column );

I see, so you do a similar trick as what we do with passing Python's generators as values in SQLite.

As for your second example, as written it does not suffer from the effect, because you are already selecting c1, c2 and c3 at the bottom level. Rewritten as:

  select processrow(c1,c2,c3) from VT2(select * from VT1);

Without knowing what VT2 will do, I don't think that this rewriting can happen. For example, "processrow" might return generators (nested tables) that get expanded by VT2. If you moved it outside VT2, then the generators would not be expanded.

Regards,
l.

it results in the VT1 xColumn function getting called (via the VT2 xColumn function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that in your xFilter implementation; and then "select __rec from VT1" in your xNext implementation to have sqlite3_result_int64() called half as often.

HTH

-Ursprüngliche Nachricht-
Von: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Gesendet: Dienstag, 04. März 2014 14:15
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

Could you explain some more your solution?

Does it work in the following case:

  select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

  select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,
l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference to the "current record" and the entrypoint of a wrapper around the xColumn function. That way only two fields get passed upwards through the virtual table stack, and the top level virtual table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and re-preparation of statements on schema changes.

-Ursprüngliche Nachricht-
Von: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Gesendet: Sonntag, 02. März 2014 20:39
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another machine. This virtual table is called "OUTPUT". And another virtual table that takes as input data from another port and forwards it into SQLite. Let's call it "INPUT".

A query that uses these two virtual tables would look like this in madIS:

  OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like the above (actually we don't do it directly to ports but to buffered named pipes that are then forwarded via netcat) to run distributed queries on clusters, connecting all the local SQLite/madIS instances on the different machines together.

The main point that i want to make with the above explanation is that we don't view SQLite only as a traditional database. We also view it as a data stream processing machine that doesn't have the requirement for the data to be stored on a hard disk.

Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.

estama

On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis <est...@gmail.com> wrote:

Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These ca [...]
Re: [sqlite] Virtual table API performance
In our tests the bind API can acquire from the Python side more than 20 values in a single call, in the same time that xColumn acquires 2 values. Most of the cost is in the callback and not in submitting a row's values through the bind API. So, with the exception of queries that need only 1 column, IMHO everything else should go through the xNextRow API.

To keep the complexity to the lowest minimum, my proposal is to use the xNextRow API only for queries that only "scan" over a virtual table (no filtering).

l.

On 04/03/14 18:23, Hick Gunter wrote:

My guess: Yes. It would require implementing a new opcode, either only for virtual tables or for native tables too, that accepts a list of field numbers (currently only 5 parameters are possible for an opcode and some of them have fixed meanings). And the logic to generate these opcodes, based on the capabilities of the loaded table module combined with the requirements of the subject query (fields required for a JOIN are fetched separately from those required for the result set) and the result of the xBestIndex calls (where it is possible to set the "omit" flag to suppress generation of a comparison).

This also adds to the complexity of register allocation. Take for example a join that needs 3 fields for the comparison, 2 of which are also required for the result set of 7 fields total. Do you request all 10 fields at once (wasting up to 9 fields' worth of effort if the JOIN is not met)? Or the 3 fields first and the 5 others only if the join matches (must allocate consecutive registers to build a result set)? Or 3 first and then 7 (which approximates the current behaviour, as the 2 common fields are fetched twice on a match)?

And a set of new sqlite3_result routines that specify which of the various requested fields' value is being set.

-Ursprüngliche Nachricht-
Von: J. Merrill [mailto:j.merr...@enlyton.com]
Gesendet: Dienstag, 04. März 2014 16:23
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote:
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that SQLite issue a single more-complex xMultiColumns (a sample name) callback request, with a way for multiple results to be returned, rather than many xColumn callbacks? This would reduce the number of calls across the VM boundary. Applications that don't implement xMultiColumns (and request its use) would see no change; those that do would get the performance boost.

J. Merrill

---
Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 - 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at
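As a side note, the "scan only, no filtering" case this proposal targets is already recognizable through the existing API: in xBestIndex a pure scan arrives with no usable constraints and no ORDER BY terms. A small sketch using only real sqlite3_index_info fields (the idxNum values are a private convention of this sketch, and the xNextRow path it refers to is the hypothetical one discussed in this thread):

```c
#include <sqlite3.h>

/* Sketch: flag pure table scans in xBestIndex, so xFilter could later
 * choose a bulk row-at-a-time transfer path for them. */
static int vtBestIndex(sqlite3_vtab *pVTab, sqlite3_index_info *pInfo){
  int i, nUsable = 0;
  for(i = 0; i < pInfo->nConstraint; i++){
    if( pInfo->aConstraint[i].usable ) nUsable++;
  }
  if( nUsable == 0 && pInfo->nOrderBy == 0 ){
    pInfo->idxNum = 1;   /* plain scan: bulk row transfer would apply */
  }else{
    pInfo->idxNum = 0;   /* filtered/ordered: keep the xColumn path   */
  }
  return SQLITE_OK;
}
```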
Re: [sqlite] Virtual table API performance
My guess: Yes. It would require implementing a new opcode, either only for virtual tables or for native tables too, that accepts a list of field numbers (currently only 5 parameters are possible for an opcode and some of them have fixed meanings). And the logic to generate these opcodes, based on the capabilities of the loaded table module combined with the requirements of the subject query (fields required for a JOIN are fetched separately from those required for the result set) and the result of the xBestIndex calls (where it is possible to set the "omit" flag to suppress generation of a comparison).

This also adds to the complexity of register allocation. Take for example a join that needs 3 fields for the comparison, 2 of which are also required for the result set of 7 fields total. Do you request all 10 fields at once (wasting up to 9 fields' worth of effort if the JOIN is not met)? Or the 3 fields first and the 5 others only if the join matches (must allocate consecutive registers to build a result set)? Or 3 first and then 7 (which approximates the current behaviour, as the 2 common fields are fetched twice on a match)?

And a set of new sqlite3_result routines that specify which of the various requested fields' value is being set.

-Ursprüngliche Nachricht-
Von: J. Merrill [mailto:j.merr...@enlyton.com]
Gesendet: Dienstag, 04. März 2014 16:23
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote:
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that SQLite issue a single more-complex xMultiColumns (a sample name) callback request, with a way for multiple results to be returned, rather than many xColumn callbacks? This would reduce the number of calls across the VM boundary. Applications that don't implement xMultiColumns (and request its use) would see no change; those that do would get the performance boost.

J. Merrill

---
Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 - 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at
Re: [sqlite] Virtual table API performance
Eleytherios Stamatogiannakis wrote:
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that SQLite issue a single more-complex xMultiColumns (a sample name) callback request, with a way for multiple results to be returned, rather than many xColumn callbacks? This would reduce the number of calls across the VM boundary. Applications that don't implement xMultiColumns (and request its use) would see no change; those that do would get the performance boost.

J. Merrill
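Sketched out, the suggestion might look roughly like the following. The xMultiColumns entry point, its signature, and the set_result helper are all invented here purely to visualize one callback filling several columns; none of them exist in SQLite:

```c
#include <sqlite3.h>

typedef struct my_cursor my_cursor;  /* assumed module cursor type */
/* assumed: the per-column logic an xColumn switch would otherwise do */
static void set_result(my_cursor *c, sqlite3_context *ctx, int iCol);

/* HYPOTHETICAL: SQLite passes one result context per requested column,
 * so n columns cost one VM-boundary crossing instead of n. */
static int myMultiColumns(sqlite3_vtab_cursor *pCur,
                          sqlite3_context **apCtx,  /* one per column */
                          const int *aiCol,         /* column numbers */
                          int nCol){
  my_cursor *c = (my_cursor*)pCur;
  for(int i = 0; i < nCol; i++){
    set_result(c, apCtx[i], aiCol[i]);
  }
  return SQLITE_OK;
}
```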
Re: [sqlite] Virtual table API performance
Your VT1 table already has an xColumn implementation, possibly doing something like:

switch( p_column )
{
  case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
  ...
  case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
}

This needs to have two cases added:

  case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec );
  case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );

where

  static int func( p_rec, p_ctx, p_column );

calls

  xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its xColumn implementation it calls:

  v_rec = (void *)sqlite3_column_int64( v_stmt, 0 );  // this can be stored and cleared in the xNext function
  v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // this can be stored
  v_func( v_rec, p_ctx, p_column );

As for your second example, as written it does not suffer from the effect, because you are already selecting c1, c2 and c3 at the bottom level. Rewritten as

  select processrow(c1,c2,c3) from VT2(select * from VT1);

it results in the VT1 xColumn function getting called (via the VT2 xColumn function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that in your xFilter implementation; and then "select __rec from VT1" in your xNext implementation to have sqlite3_result_int64() called half as often.

HTH

-Ursprüngliche Nachricht-
Von: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Gesendet: Dienstag, 04. März 2014 14:15
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

Could you explain some more your solution?

Does it work in the following case:

  select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

  select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,
l.

On 03/03/14 14:17, Hick Gunter wrote:
> We have gotten around this problem by defining "virtual" fields that contain a reference to the "current record" and the entrypoint of a wrapper around the xColumn function. That way only two fields get passed upwards through the virtual table stack, and the top level virtual table's xColumn implementation calls straight through to the bottom layer's wrapper.
>
> It does take some care to avoid sorting in between the layers and re-preparation of statements on schema changes.
>
> -Ursprüngliche Nachricht-
> Von: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
> Gesendet: Sonntag, 02. März 2014 20:39
> An: sqlite-users@sqlite.org
> Betreff: Re: [sqlite] Virtual table API performance
>
> We have both input and output virtual tables that avoid hitting the hard disk and are also able to compress the incoming and outgoing data.
>
> We have a virtual table that takes as input a query and sends the data to a port on another machine. This virtual table is called "OUTPUT". And another virtual table that takes as input data from another port and forwards it into SQLite. Let's call it "INPUT". A query that uses these two virtual tables would look like this in madIS:
>
>   OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');
>
> We actually use queries like the above (actually we don't do it directly to ports but to buffered named pipes that are then forwarded via netcat) to run distributed queries on clusters, connecting all the local SQLite/madIS instances on the different machines together.
>
> The main point that i want to make with the above explanation is that we don't view SQLite only as a traditional database. We also view it as a data stream processing machine that doesn't have the requirement for the data to be stored on a hard disk.
>
> Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.
>
> estama
>
> On 2/3/2014 9:34 μμ, Max Vlasov wrote:
>> On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis <est...@gmail.com> wrote:
>>> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.
>>
>> Ok, not stating that the performance improvement is impossible, I will explain why I'm a little sceptical about it.
>>
>> For every bulk insert we have a theoretical maximum we'd all be glad to see sqlite perform with - the speed of simple file copying. Sqlite can't be faster than that, but to be on par is a good goal. This is not possible when an insert also means modification of other parts of the file, for example when there's an index involved. But let's forget about it. Finally, when new data is added, sqlite should write a number of database pages; the cost of this part is absolutely in the hands of the media (driver) and OS (driver). But for every database page write there's also a price to pay in CPU units, for the many actions sqlite should do before the actual value is translated from what the developer provided to what actually appears on disk.
>>
>> The illustration of the CPU price is the following example:
>>
>>   CREATE TABLE t(Value)
>>
>> On my ssd drive, multiple inserts (thousands) of
>>
>>   insert into t (Value) values ('123456689...')   // this string contains many symbols, for example 1024
>>
>> performed with a speed of 30 MB/Sec, but the query
>>
>>   insert into t (Value) values (10)   // this is a small integer value
>>
>> only 3 MB/Sec.
>>
>> Both show almost full cpu load. Why such a difference? Because with the latter query the system can do more than 30 MB of writes in 1 second, but it has to wait for sqlite spending 10 seconds in preparations. The former is better because the CPU cost of passing a large text value to sqlite is comparatively low compared to the time spent in I/O in writing it to disk.
>>
>> So the CPU price to pay isn't avoidable, and notice that in this example this is not the virtual table API, this is the bind API. I suppose that the price we pay for CPU spent in the virtual table API is on par with the average price paid in sqlite as a whole. This means that if I transform the above queries into inserts from virtual tables, the final speed difference will be similar. And this also means that for your comparison tests (when you get a 3x difference), the CPU price sqlite pays inside the bind api and in its code wrapping the xColumn call is probably similar. The rest is the share your code pays.
>>
>> Well, I know that there are differences in CPU architec [...]
Re: [sqlite] Virtual table API performance
Could you explain some more your solution?

Does it work in the following case:

  select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

  select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,
l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference to the "current record" and the entrypoint of a wrapper around the xColumn function. That way only two fields get passed upwards through the virtual table stack, and the top level virtual table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and re-preparation of statements on schema changes.

-Ursprüngliche Nachricht-
Von: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Gesendet: Sonntag, 02. März 2014 20:39
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another machine. This virtual table is called "OUTPUT". And another virtual table that takes as input data from another port and forwards it into SQLite. Let's call it "INPUT". A query that uses these two virtual tables would look like this in madIS:

  OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like the above (actually we don't do it directly to ports but to buffered named pipes that are then forwarded via netcat) to run distributed queries on clusters, connecting all the local SQLite/madIS instances on the different machines together.

The main point that i want to make with the above explanation is that we don't view SQLite only as a traditional database. We also view it as a data stream processing machine that doesn't have the requirement for the data to be stored on a hard disk.

Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.

estama

On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis <est...@gmail.com> wrote:

Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.

Ok, not stating that the performance improvement is impossible, I will explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to see sqlite perform with - the speed of simple file copying. Sqlite can't be faster than that, but to be on par is a good goal. This is not possible when an insert also means modification of other parts of the file, for example when there's an index involved. But let's forget about it. Finally, when new data is added, sqlite should write a number of database pages; the cost of this part is absolutely in the hands of the media (driver) and OS (driver). But for every database page write there's also a price to pay in CPU units, for the many actions sqlite should do before the actual value is translated from what the developer provided to what actually appears on disk.

The illustration of the CPU price is the following example:

  CREATE TABLE t(Value)

On my ssd drive, multiple inserts (thousands) of

  insert into t (Value) values ('123456689...')   // this string contains many symbols, for example 1024

performed with a speed of 30 MB/Sec, but the query

  insert into t (Value) values (10)   // this is a small integer value

only 3 MB/Sec.

Both show almost full cpu load. Why such a difference? Because with the latter query the system can do more than 30 MB of writes in 1 second, but it has to wait for sqlite spending 10 seconds in preparations. The former is better because the CPU cost of passing a large text value to sqlite is comparatively low compared to the time spent in I/O in writing it to disk.

So the CPU price to pay isn't avoidable, and notice that in this example this is not the virtual table API, this is the bind API. I suppose that the price we pay for CPU spent in the virtual table API is on par with the average price paid in sqlite as a whole. This means that if I transform the above queries into inserts from virtual tables, the final speed difference will be similar. And this also means that for your comparison tests (when you get a 3x difference), the CPU price sqlite pays inside the bind api and in its code wrapping the xColumn call is probably similar. The rest is the share your code pays.

Well [...]
Re: [sqlite] Virtual table API performance
On 03/03/14 03:01, Alek Paunov wrote:

On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:

Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.

Max's tests in C show 2x the CPU work, but he explains that the test is not very sound, so let's say somewhere between 1x-2x. Your tests: 3x the time.

As you have already identified, the real reason is probably the million-scale callback quantity across the VM barrier. I do not follow PyPy, but see these notes [1] by Mike Pall, the LuaJIT author (LuaJIT is the leading project in the trace compilers field):

[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't free
either. Constantly entering and leaving the VM via a callback from C
*to* Lua has a high overhead. For short callbacks, the switching
overhead between C and Lua may completely dominate the total CPU time.
Calling an iterator written in C via the FFI *from* a Lua program is
much cheaper -- this compiles down to a simple call instruction.
```

I remember that i had seen the above quote from Mike Pall, but i couldn't find it. Thank you for unearthing it.

Unfortunately, for your "insert into t select * from vt" case the callback/iterator transformation is not possible (we do not have a repetitive _step call to invert the control somehow). What to do?

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

SQLite doesn't need any special format for the records to be passed over to it. It already has the "bind" API, which would be very suitable for the xNextRow function too.

For a paging API (which IMHO is too complex ATM), the bind API could be extended with a row number parameter.

Regards,
estama

Kind regards,
Alek
Re: [sqlite] Virtual table API performance
We have gotten around this problem by defining "virtual" fields that contain a reference to the "current record" and the entrypoint of a wrapper around the xColumn function. That way only two fields get passed upwards through the virtual table stack, and the top level virtual table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and re-preparation of statements on schema changes.

-Ursprüngliche Nachricht-
Von: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Gesendet: Sonntag, 02. März 2014 20:39
An: sqlite-users@sqlite.org
Betreff: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another machine. This virtual table is called "OUTPUT". And another virtual table that takes as input data from another port and forwards it into SQLite. Let's call it "INPUT". A query that uses these two virtual tables would look like this in madIS:

  OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like the above (actually we don't do it directly to ports but to buffered named pipes that are then forwarded via netcat) to run distributed queries on clusters, connecting all the local SQLite/madIS instances on the different machines together.

The main point that i want to make with the above explanation is that we don't view SQLite only as a traditional database. We also view it as a data stream processing machine that doesn't have the requirement for the data to be stored on a hard disk.

Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.

estama

On 2/3/2014 9:34 μμ, Max Vlasov wrote:
> On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis <est...@gmail.com> wrote:
>> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them.
>
> Ok, not stating that the performance improvement is impossible, I will explain why I'm a little sceptical about it.
>
> For every bulk insert we have a theoretical maximum we'd all be glad to see sqlite perform with - the speed of simple file copying. Sqlite can't be faster than that, but to be on par is a good goal. This is not possible when an insert also means modification of other parts of the file, for example when there's an index involved. But let's forget about it. Finally, when new data is added, sqlite should write a number of database pages; the cost of this part is absolutely in the hands of the media (driver) and OS (driver). But for every database page write there's also a price to pay in CPU units, for the many actions sqlite should do before the actual value is translated from what the developer provided to what actually appears on disk.
>
> The illustration of the CPU price is the following example:
>
>   CREATE TABLE t(Value)
>
> On my ssd drive, multiple inserts (thousands) of
>
>   insert into t (Value) values ('123456689...')   // this string contains many symbols, for example 1024
>
> performed with a speed of 30 MB/Sec, but the query
>
>   insert into t (Value) values (10)   // this is a small integer value
>
> only 3 MB/Sec.
>
> Both show almost full cpu load. Why such a difference? Because with the latter query the system can do more than 30 MB of writes in 1 second, but it has to wait for sqlite spending 10 seconds in preparations. The former is better because the CPU cost of passing a large text value to sqlite is comparatively low compared to the time spent in I/O in writing it to disk.
>
> So the CPU price to pay isn't avoidable, and notice that in this example this is not the virtual table API, this is the bind API. I suppose that the price we pay for CPU spent in the virtual table API is on par with the average price paid in sqlite as a whole. This means that if I transform the above queries into inserts from virtual tables, the final speed difference will be similar. And this also means that for your comparison tests (when you get a 3x difference), the CPU price sqlite pays inside the bind api and in its code wrapping the xColumn call is probably similar. The rest is the share your code pays.
>
> Well, I know that there are differences in CPU architec [...]
Re: [sqlite] Virtual table API performance
On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:

Under this view, the efficiency of the virtual table api is very important. The above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that.

Max's tests in C show 2x the CPU work, but he explains that the test is not very sound, so let's say somewhere between 1x-2x. Your tests: 3x the time.

As you have already identified, the real reason is probably the million-scale callback quantity across the VM barrier. I do not follow PyPy, but see these notes [1] by Mike Pall, the LuaJIT author (LuaJIT is the leading project in the trace compilers field):

[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't free
either. Constantly entering and leaving the VM via a callback from C
*to* Lua has a high overhead. For short callbacks, the switching
overhead between C and Lua may completely dominate the total CPU time.
Calling an iterator written in C via the FFI *from* a Lua program is
much cheaper -- this compiles down to a simple call instruction.
```

Unfortunately, for your "insert into t select * from vt" case the callback/iterator transformation is not possible (we do not have a repetitive _step call to invert the control somehow). What to do?

It seems that the easiest optimization for this (very frequent) VT use case (bulk streaming) is for an SQLite add-on in _C_ to be written, implementing a vtable interface specialization containing an xNextPage "buffering", let's say, 4K rows or even better 16KB of data (in addition to your initial proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said initially that you use some compressed format. But for such an extension, to gain more traction in the future, it would probably be better for a more standard format to be chosen:

a) Rows represented in the native SQLite3 format [3]
b) ... the native SQLite4 format
c) Some widely used encoding close to the SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record streaming could be discussed and adopted across the SQLite binding and add-on developers. The possible applications are not limited only to vtables ;-).

Kind regards,
Alek
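For illustration, such an xNextPage specialization might be declared along these lines. The sqlite3_vtab_page structure and the xNextPage method are hypothetical (nothing like them exists in SQLite), and the encoding of pData is exactly the open question (a)-(d) above:

```c
#include <sqlite3.h>

/* HYPOTHETICAL page of encoded records handed to SQLite in one call. */
typedef struct sqlite3_vtab_page {
  void *pData;   /* nData bytes of encoded records, e.g. in the SQLite3
                    record format [3] or msgpack [4] */
  int nData;     /* bytes used, target around 16KB */
  int nRow;      /* rows encoded in this page, target around 4K rows */
} sqlite3_vtab_page;

/* Hypothetical cursor method: fill one page, set *pEof at end of stream. */
int xNextPage(sqlite3_vtab_cursor *pCur, sqlite3_vtab_page *pPage, int *pEof);
```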
Re: [sqlite] Virtual table API performance
We have both input and output virtual tables that avoid hitting the hard disk and are also able to compress the incoming and outgoing data. We have a virtual table that takes as input a query and sends the data to a port on another machine. This virtual table is called "OUTPUT". And another virtual table that takes as input data from another port and forwards it into SQLite. Lets call it "INPUT". A query that uses these two virtual tables would look like this in madIS: OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081'); We actually use queries like above (actually we don't do it directly to ports but to buffered named pipes that are then forwarded via netcat) to run distributed queries on clusters, connecting all the local SQLite/madIS instances on the different machines together. The main point that i want to make with above explanation is that we don't view SQLite only as a traditional database. We also view it as a data stream processing machine, that doesn't have the requirement for the data to be stored on a hard disk. Under this view, the efficiency of the virtual table api is very important. Above query only uses 2 VTs in it, but we have other queries that use a lot more VTs than that. estama On 2/3/2014 9:34 μμ, Max Vlasov wrote: On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakiswrote: Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks, to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them. Ok, not stating that the performance improvment is impossible, I will explain why I'm a little sceptical about it. For every bulk insert we have a theoretical maxiumum we'd all glad to see sqlite would perform with - the speed of simple file copying. Sqlite can't be faster than that, but to be on par is a good goal. This is not possible when an insert means also modification of other parts of the file, for example when there's an index involved. But let's forget about it. Finally when new data is added, sqlite should write a number of database pages, the cost of this part is absolutely in the hands of the media (driver) and OS (driver). But for every database page write there's also price to pay in CPU units, for many actions sqlite should do before actual value is translated from what the developer provided to what actually appears on disk. The illustration of the CPU price is the following example CREATE TABLE t(Value) on my ssd drive mulitply inserts (thousands) insert into t (Value) values ('123456689 // this string contains many symbols, for example 1024) performed with the speed 30 MB/Sec but the query insert into t (Value) values (10) // this is a small integer value only 3 Mb/Sec Both shows almost full cpu load. Why such difference? Because with latter query the system can do more than 30 MB of writes in 1 second, but it should wait for sqlite spending 10 seconds in preparations. The former is better because CPU cost of passing a large text value to sqlite is comparatively low comparing to the time spent in I/O in writing this on disk. So CPU price to pay isn't avoidable and notice that in example this is not virtual table API, this is bind API. I suppose that the price we pay for CPU spent in virtual table API is on par with an average price payed in sqlite as a whole. 
___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
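To make the INPUT side concrete, here is a minimal sketch in C of the stream-consuming loop such a virtual table's cursor could sit on. The length-prefixed framing and the use of stdin (standing in for the buffered named pipe mentioned above) are assumptions for illustration; madIS's actual wire format is not shown in the thread.

```
/* Sketch of the reading side of an INPUT-style stream: consume
 * length-prefixed rows from stdin (e.g. fed from a named pipe that
 * netcat writes into). The framing is an assumption. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    uint32_t len;
    uint64_t rows = 0;
    while (fread(&len, sizeof len, 1, stdin) == 1) {
        char *row = malloc(len);
        if (!row || fread(row, 1, len, stdin) != len) { free(row); break; }
        /* A real INPUT vtable would hand `row` to SQLite here:
         * one cursor step per row instead of one read per value. */
        rows++;
        free(row);
    }
    fprintf(stderr, "consumed %llu rows\n", (unsigned long long)rows);
    return 0;
}
```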
Re: [sqlite] Virtual table API performance
On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis wrote:
>
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of
> TPCH contains 16 columns, which for 10M rows would require 160M xColumn
> callbacks, to pass it through the virtual table API. These callbacks are
> very expensive, especially when at the other end sits a VM (CPython or PyPy)
> handling them.
>

Ok, while not stating that the performance improvement is impossible, I will explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to see SQLite perform at: the speed of simple file copying. SQLite can't be faster than that, but to be on par is a good goal. This is not possible when an insert also means modification of other parts of the file, for example when there's an index involved. But let's forget about that. Finally, when new data is added, SQLite should write a number of database pages; the cost of this part is entirely in the hands of the media (driver) and the OS (driver). But for every database page write there's also a price to pay in CPU units, for the many actions SQLite must perform before the value the developer provided is translated into what actually appears on disk.

An illustration of the CPU price is the following example:

CREATE TABLE t(Value)

On my SSD drive, multiple inserts (thousands of them) of the form

insert into t (Value) values ('123456689...')  -- a long string, for example 1024 characters

performed at a speed of 30 MB/sec, but the query

insert into t (Value) values (10)  -- a small integer value

at only 3 MB/sec. Both show almost full CPU load. Why such a difference? Because with the latter query the system could do more than 30 MB of writes in 1 second, but it has to wait while SQLite spends 10 seconds in preparations. The former is better because the CPU cost of passing a large text value to SQLite is comparatively low next to the time spent in I/O writing it to disk.

So the CPU price isn't avoidable, and notice that this example involves not the virtual table API but the bind API. I suppose that the price we pay for CPU spent in the virtual table API is on par with the average price paid in SQLite as a whole. This means that if I transform the above queries into inserts from virtual tables, the final speed difference will be similar. And this also means that in your comparison tests (where you get the 3x difference), the CPU price SQLite pays inside the bind API and in the code wrapping the xColumn calls is probably similar; the rest is the share your code pays.

Well, I know that there are differences in CPU architectures, and there are probably platforms where compiled code for the bind API and the virtual table API behaves a little differently, making the costs diverge more. But imagine the hard task of fine tuning and refactoring just to get a noticeable difference on a particular platform.

Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
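Max's 30 MB/sec vs 3 MB/sec observation can be reproduced with a small harness against the real bind API. A sketch under stated assumptions: the file name and row count are arbitrary, and error checking is omitted for brevity. Throughput in MB/sec is bytes bound divided by elapsed time, so a roughly constant per-row CPU cost shows up as wildly different MB/sec for 1KB strings vs small integers.

```
/* Micro-benchmark sketch: per-row CPU cost in the bind API dominates
 * when values are small. All SQLite calls below are the real C API. */
#include <sqlite3.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

static double bench(sqlite3 *db, const char *sql, int big, int nrows)
{
    sqlite3_stmt *st;
    char text[1024];                       /* ~1KB string, as above */
    memset(text, '1', sizeof text - 1);
    text[sizeof text - 1] = 0;

    sqlite3_prepare_v2(db, sql, -1, &st, 0);
    sqlite3_exec(db, "BEGIN", 0, 0, 0);    /* one tx: keep I/O out  */
    clock_t t0 = clock();
    for (int i = 0; i < nrows; i++) {
        if (big) sqlite3_bind_text(st, 1, text, -1, SQLITE_STATIC);
        else     sqlite3_bind_int(st, 1, 10);
        sqlite3_step(st);
        sqlite3_reset(st);
    }
    double sec = (double)(clock() - t0) / CLOCKS_PER_SEC;
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
    sqlite3_finalize(st);
    return sec;
}

int main(void)
{
    sqlite3 *db;
    sqlite3_open("bench.db", &db);
    sqlite3_exec(db, "CREATE TABLE t(Value)", 0, 0, 0);
    const char *ins = "insert into t (Value) values (?)";
    printf("1KB text rows:  %.2f s\n", bench(db, ins, 1, 100000));
    printf("small int rows: %.2f s\n", bench(db, ins, 0, 100000));
    sqlite3_close(db);
    return 0;
}
```

Both runs insert the same number of rows; the text run pushes ~300x more bytes through for a similar per-row cost, which is exactly the MB/sec asymmetry described above.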
Re: [sqlite] Virtual table API performance
In our performance tests we try to work with data and queries that are representative of what we would find in a typical DB. This means a lot of "small" values (ints, floats, small strings) and 5-20 columns.

Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of TPCH contains 16 columns, which for 10M rows would require 160M xColumn callbacks to pass it through the virtual table API. These callbacks are very expensive, especially when at the other end sits a VM (CPython or PyPy) handling them. For PyPy in particular, which is able to produce JIT-compiled and optimized UDF code, adapted to the "shape" (type and value distribution) of the data flows as they pass through it, every time it faces the virtual table API it slows down to a crawl, becoming more than 2x slower than interpreted Python. This happens because it cannot see the data/row "shape" through the many small, single-value xColumn callbacks, which are unrelated to each other.

Changing the subject: I've seen some requests in previous emails from people asking for windowing functions to be added to SQLite. I want to propose an alternative that we have been using for years, and that is a lot more generic than adding specific functions for very "narrow" use cases to SQLite.

We have added the "EXPAND" VT function in madIS, which "emulates" nested tables in SQLite, enabling row and aggregate functions to return (in a streaming fashion) multiple values on multiple columns. The "EXPAND" function takes as input a table containing as values (in our case Python) generators, and then it calls the generators, "expanding" the input table to its final form. "EXPAND" is automatically inserted wherever it is required, so it isn't visible. An example follows:

> select strsplit('one and other');
one|and|other        <-- 3 columns

or

> select strsplitV('one and other');
one
and                  <-- 3 individual rows
other

So by adding a single VT function and some syntactic sugar (auto-inserting the EXPAND VT), we were able to have functionality that is not case specific, allowing us to run all kinds of analytics inside SQLite.

The performance of the above functionality is already very good. But it could be a lot better with a more efficient VT API.

Regards, estama

On 2/3/2014 9:15 πμ, Max Vlasov wrote: [...]
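The callback arithmetic above (10M rows x 16 columns = 160M crossings) can be felt even in plain C, with no VM in the loop. A sketch under stated assumptions: the non-inlinable call through a volatile function pointer stands in for a per-value API crossing, the direct batched fill for a row-at-a-time interface; a real C-to-VM crossing costs orders of magnitude more than either, so the gap here is a lower bound.

```
/* Same data consumed twice: once via 160M indirect calls (one per
 * column value, like xColumn) and once via 10M batched row fills.
 * The per-value work is identical; only the call boundary differs. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NROWS 10000000   /* 10M rows, as in the TPCH example   */
#define NCOLS 16         /* 16 columns, as in lineitem         */

static int64_t col_value(int64_t row, int col) { return row + col; }
typedef int64_t (*xcolumn_cb)(int64_t, int);

int main(void)
{
    volatile int64_t sink = 0;            /* crossing per value */
    xcolumn_cb volatile cb = col_value;   /* defeats inlining   */

    clock_t t0 = clock();
    for (int64_t r = 0; r < NROWS; r++)
        for (int c = 0; c < NCOLS; c++)
            sink += cb(r, c);             /* one callback per value */
    double per_value = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    int64_t rowbuf[NCOLS];
    for (int64_t r = 0; r < NROWS; r++) {
        int64_t acc = 0;
        for (int c = 0; c < NCOLS; c++) rowbuf[c] = col_value(r, c);
        for (int c = 0; c < NCOLS; c++) acc += rowbuf[c];
        sink += acc;                      /* one crossing per row */
    }
    double per_row = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("per-value: %.2fs  per-row: %.2fs  (sink=%lld)\n",
           per_value, per_row, (long long)sink);
    return 0;
}
```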
Re: [sqlite] Virtual table API performance
Hi, thanks for explaining your syntax in another post. Now about virtual tables, if you don't mind. On Fri, Feb 28, 2014 at 8:24 PM, Eleytherios Stamatogiannakis wrote:
>
> If we load into SQLite,
>
> create table newtable as select * from READCOMPRESSEDFILE('ctable.rc');
>
> it takes: 55 sec
>
>
> If we create an external program
>
> it takes: 19 sec (~3x faster than using the virtual table API)
>

Looking at your numbers, as a user (and fan :) of virtual tables, I decided to do some tests.

I have a virtual table, "AllValues", designed to enumerate all values of all tables into one single virtual table, so the result is a long list of TableName, TableRowId, FieldName, Value - you get the idea. As an example of what it can do: you may open places.sqlite of the Mozilla browser, do

Select * from AllValues where Value Like "%sqlite.org%"

and see actual results, even without knowing how they planned their schema. Internally this virtual table simply uses general selects over all the other tables found in sqlite_master. This is a good (but probably not the best) test for measuring virtual table performance, because SELECT * FROM AllValues is equivalent to reading all the conventional tables of the database.

Besides:
- the tool I use has a tweak, implemented with a VFS, that allows measuring the speed and other characteristics of a query while the query is in effect.
- I have an option that forces resetting the Windows cache for the database file when it is reopened. With it we exclude the Windows cache from consideration, so pure I/O reading is measured. Btw, when you do your comparison, it's very important to reset the system cache before every measurement that involves I/O.

So I took a comparatively large (500 MB) database consisting of several small tables and one big table (Posts) and compared two queries.

(Query1) Select sum(length(Body) + length(Title)) from Posts

This one effectively reads the table data and uses
- length() to force SQLite to read texts that don't fit into a single db page
- sum() to exclude accumulating results on my side from the comparison, so we have a single-row, single-column result from work completely done by SQLite.

(Query2) Select Sum(Length(Value)) from AllValues

This one performs basically the same reads, but through the SQLite virtual table API. It also touches the other tables, but since they're small, we can forget about that.

Query1 (General): Read: 540 MB, Time: 24.2 sec, CPU Time: 6 sec (25%), Speed: 22.31 MB/sec
Query2 (Virtual): Read: 540 MB, Time: 27.3 sec, CPU Time: 13 sec (51%), Speed: 20 MB/sec

In my particular test, the noticeable difference is in the CPU part: more of it is spent with the virtual table. I assume this can be related to my own implementation of this virtual table, since I have to retrieve values, store them temporarily somewhere, and talk to SQLite. But this may also shed light on your performance drop: if your virtual implementation spends much time processing each value, you may end up with a big drop.

You may say that this test is not fair because it does not involve creating a table from the values of a virtual table. Unfortunately I can't construct a good enough test comparing the Posts and AllValues tables as sources, because the destination geometry of the tables is different (Posts has more columns and fewer rows; AllValues fewer columns and more rows). The closest approximation became possible when I created an intermediate physical table containing the results from AllValues, and compared table creation from this physical table and from the virtual table.
The virtual one took longer, but the values - 56 seconds vs 43 seconds - are not different enough to conclude anything. I'm not sure my tests defend SQLite's virtual tables sufficiently, but currently I don't have evidence of significant inefficiency either.

Max ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
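For anyone wanting to repeat this kind of wall-vs-CPU comparison, a minimal harness in C; the database name and query are placeholders taken from the message above, and (as Max notes) the OS file cache must be dropped between runs by external means - this sketch does not do that.

```
/* Run one query and report wall time vs CPU time; their ratio hints
 * where the cost lives (I/O-bound vs CPU-bound), as in the figures
 * quoted above. Error checking omitted for brevity. */
#include <sqlite3.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *st;
    sqlite3_open("test.db", &db);
    sqlite3_prepare_v2(db,
        "Select sum(length(Body) + length(Title)) from Posts", -1, &st, 0);

    struct timespec w0, w1;
    clock_gettime(CLOCK_MONOTONIC, &w0);
    clock_t c0 = clock();
    while (sqlite3_step(st) == SQLITE_ROW)
        printf("result: %lld\n", sqlite3_column_int64(st, 0));
    clock_t c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &w1);

    double wall = (w1.tv_sec - w0.tv_sec) + (w1.tv_nsec - w0.tv_nsec) / 1e9;
    double cpu  = (double)(c1 - c0) / CLOCKS_PER_SEC;
    printf("wall %.1f s, cpu %.1f s (%.0f%%)\n", wall, cpu, 100 * cpu / wall);
    sqlite3_finalize(st);
    sqlite3_close(db);
    return 0;
}
```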
[sqlite] Virtual table API performance
Hello,

First of all, I would like to say that, IMHO, SQLite's overall design and code quality is top notch. Having said that, SQLite's virtual table API, while beautifully designed, is not very efficient.

We have been banging our heads on the virtual table API for the last few years, trying to make it work efficiently. During that time, we thought the problem was on our side, in the code that feeds the virtual table API. It turns out that the virtual table API itself isn't very efficient. The main cause of inefficiency is that it is extremely "chatty". For an external stream that contains many columns, "xColumn" can be called back hundreds of millions of times for the stream to be consumed by SQLite. These callbacks have a very big cost.

Let me describe a test that we did. For our work, we use compressed streams that are fed into SQLite through the virtual table API. If we load the external compressed stream (containing 3M rows) into SQLite through the virtual table API:

create table newtable as select * from READCOMPRESSEDFILE('ctable.rc');

it takes: 55 sec.

If we create an external program that inserts the rows in the compressed stream into SQLite one by one, using "insert into newtable values ..." and the SQLite bind API, it takes: 19 sec (~3x faster than using the virtual table API).

Another problem with the virtual table API is that it wreaks havoc on VM JIT engines. Especially for tracing JIT engines, the many "small" per-column callbacks do not permit them to specialize at all, compared to the specialization that a tracing JIT could achieve with one "big" get_an_entire_row callback.

A suggestion for improving the efficiency of the virtual table API arises naturally when we look at all the virtual table functions that we have already created. We have ~15 VT functions dealing with importing "all" from external sources (files, http streams, xml, clipboard, etc.), and only one filtering VT function (a multidimensional index) "picking" columns to return. So most of our queries that use VTs look like this:

create table cliptab as select * from clipboard();

That is, these queries most of the time select all columns from an external stream.

Based on the above, an addition that would improve the efficiency of the VT API is an "xRow" function that SQLite could call to get an entire row back (like the bind API). Even better, and to reduce the callback count even more, would be an "xNextRow" function that returns the contents of the next row, or EOF.

Regards, estama. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
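A sketch of what the proposed xNextRow could look like as a module extension. The sqlite3_module_v3 struct, the xNextRowFn signature, and the iteration protocol are assumptions drawn from the description above, not an existing SQLite interface; only the base types (sqlite3_module, sqlite3_vtab_cursor, sqlite3_value) and the SQLITE_ROW/SQLITE_DONE codes are real.

```
/* Hypothetical shape of the proposed xNextRow extension. */
#include <sqlite3.h>

/* One callback per row: fill `vals` with the row's column values and
 * return SQLITE_ROW, or return SQLITE_DONE at EOF. This replaces one
 * xNext/xEof step plus one xColumn call per column. (Assumption.) */
typedef int (*xNextRowFn)(sqlite3_vtab_cursor *cur,
                          sqlite3_value **vals, int ncol);

typedef struct sqlite3_module_v3 {   /* assumption: extended module  */
    struct sqlite3_module base;      /* existing methods unchanged   */
    xNextRowFn xNextRow;             /* NULL = fall back to xColumn  */
} sqlite3_module_v3;

/* Consumer side (conceptually, inside the VDBE loop).
 * Assumes ncol <= 32 to keep the sketch short. */
static int consume(sqlite3_vtab_cursor *cur, sqlite3_module_v3 *m, int ncol)
{
    sqlite3_value *vals[32];
    int rc, rows = 0;
    while ((rc = m->xNextRow(cur, vals, ncol)) == SQLITE_ROW) {
        rows++;  /* bind vals straight into the insert: one call per row */
    }
    return rc == SQLITE_DONE ? rows : -1;
}
```

For the TPCH lineitem case mentioned earlier in the thread, this would turn 160M xColumn crossings into 10M xNextRow crossings; a page-oriented variant (see the xNextPage discussion at the top of the thread) would reduce the count further still.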