Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions
Thanks! this helped me a lot in understanding the red flags. I think I have read the 'type affinity' documentation pages quite a few times (including this morning). Il giorno mer 31 lug 2019 alle ore 12:05 Olivier Mascia ha scritto: > > Le 31 juil. 2019 à 10:03, Barone Ashura a > écrit : > > > > CREATE TABLE "VALUES" ( > > id INTEGER PRIMARY KEY ASC, > > value_int INTEGER, > > value_double REAL, > > value_text TEXT > > ); > > > > and that I want to execute the following simple query: > > > > SELECT * FROM "VALUES"; > > > > For this very specific query, are there circumstances where the call > > to sqlite3_column_type, for column 'value_int', returns a result > different > > from > > SQLITE_INTEGER, > > Yes. > Nothing stops any piece of your own programming or anyone using the CLI to > do: > > INSERT INTO "VALUES"(value_int) VALUES('something'); > > > The sqlite3_column_type for the column value_int will tell you SQLITE_TEXT > because: > > See: https://www.sqlite.org/c3ref/column_blob.html > > "The sqlite3_column_type() routine returns the datatype code for the > initial data type of the result column." > > The "initial data type" is NOT the one declared in the schema. It is the > type of the actual stored value in that particular column of that > particular row. SQLite is mostly type agnostic regarding storage. The > declared type in the schema only serves as a guideline for some automatic > conversion where applicable. SQLite calls this: 'affinity'. > > See: https://www.sqlite.org/datatype3.html#type_affinity > > INSERT INTO "VALUES"(value_int) VALUES(123); > and > INSERT INTO "VALUES"(value_int) VALUES('123'); > > will both actually stores a SQLITE_INTEGER because the column has a type > whose name starts with INT and the string '123' can safely be converted > from ascii to integer and reciprocal. > > But INSERT INTO "VALUES"(value_int) VALUES('something'); > will actually store the string 'something' because it can't be converted > back and forth between text and integer. > > — > Best Regards, Meilleures salutations, Met vriendelijke groeten, Mit besten > Grüßen, > Olivier Mascia > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] sqlite3_exec without ubiqitous text conversions
Suggestion Acknowledged :) Il giorno mar 30 lug 2019 alle ore 18:56 ha scritto: > Hi! > > I use a simple trick: > > A clone of the sqlite3_exec that passes the sqlite3_stmt as an argument > to the callback function. > > So no conversion is made and we can use the sqlite3_column... functions > directly on the retrieved row. > > Happy coding! > > > On 2019-07-30 05:00, sqlite-users-requ...@mailinglists.sqlite.org wrote: > > Send sqlite-users mailing list submissions to > > sqlite-users@mailinglists.sqlite.org > > > > 6. sqlite3_exec without ubiqitous text conversions (Barone Ashura) > > > > > > ------ > > > > Message: 6 > > Date: Tue, 30 Jul 2019 13:13:52 +0200 > > From: Barone Ashura > > To: sqlite-users@mailinglists.sqlite.org > > Subject: [sqlite] sqlite3_exec without ubiqitous text conversions > > Message-ID: > > < > cad3lth+nfostqjf50dgsdzuts0exvk7n89amtsjpygx_6b+...@mail.gmail.com> > > Content-Type: text/plain; charset="UTF-8" > > > > Hello, > > > > I have been working on an application based on SQLite for 2 years now, > > and > > recently we started running some performance profiling to check if > > there > > are areas where we can squeeze some extra performance. > > > > SQlite query execution is used almost exclusively through sqlite3_exec, > > and > > the implementation of callbacks. > > > > One of the areas of interest we identified is that every column is > > always > > passed to the callback as a c-char-string. Then we need to convert that > > string back to the intended data type. Apparently a consistent amount > > of > > time is spent doing this. > > > > I started looking into SQLite code and realized that int and real > > values > > retrieved from database, is actually fecthed as binary value, then > > converted to string, passed to the callback, and in the callback we > > convert > > it back to its original data type (as we keep track of 'true' datatype > > for > > each column). > > > > I decided to try skipping these conversions, by creating a custom > > implementation of sqlite3_exec, which does the following in the inside > > 'step' loop: > > > > if (rc == SQLITE_ROW) { > > azVals = azCols; > > for (i = 0; i < nCol; i++) { > > azVals[i] = (char *)sqlite3_column_text(pStmt, > > i); > > int col_type = sqlite3_column_type(pStmt, i); > > switch (col_type) > > { > > case SQLITE_INTEGER: > > case SQLITE_FLOAT: > > { > > azVals[i] = (char*)columnMem(pStmt, i); > > columnMallocFailure(pStmt); > > break; > > } > > default: > > { > > azVals[i] = (char > > *)sqlite3_column_text(pStmt, i); > > break; > > } > > } > > if (!azVals[i] && sqlite3_column_type(pStmt, i) > > != > > SQLITE_NULL) { > > sqlite3OomFault(db); > > goto exec_out; > > } > > } > > } > > > > Instead of ALWAYS converting to text, in case of INTEGER or FLOAT > > columns > > types, I fetch the address of the data through columnMem, put it in > > the > > azVals array, and go on. The callback knows which is the expected data > > type > > for the column, reads and copies data in the destination variable > > through a > > simple assignment. > > > > I am aware that forcing a 'typed' pointer into a generic char* pointer > > (azVals[x]), rings a lot of alarm bells; I am, as well, aware that > > I > > could avoid sqlite3_exec, and call sqlite3_step myself. > > As of now the application is running smoothly and faster as far as > > satabase > > access is concerned. > > > > I am writing here to get opinions about other potential pitfalls or > > oversights in this approach, as well as reason why I should not proceed > > on > > this path. > > > > Thanks in advance > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions
> Why do it this way? Why not write your own custom_sqlite3_exec(...) that uses the standard, > stable, documented interfaces? Because sometimes I take wrong turns despite my best intentions :) But I usually smell it, when I am taking wrong turns, and that is why I can up here asking for advice, insights and enlightment, so I can u-turn, go back at the crossroads, and take another direction :) > custom_sqlite3_exec(...) could call prepare / step / finalize, and use the > standard sqlite3_column_* interfaces to fill a result array. This would be > very little work and could be a drop in replacement of sqlite3_exec. As a > bonus, you can have a proper signature of void* for your callback function > (since it's no longer receiving an array of strings, char** is a lie which > will confuse anyone working on the project in future). Another > maintainability bonus: The name will alert any consumers that this isn't > part of the standard sqlite3 interface. > defining a new callback was one of the option for further development. The different typed callback would certainly be void* for the reason you mentioned. > An assumption you've made may be incorrect: You say that callers know what > data type to expect so that they know how to cast the results (I assume > this is based on the declared column affinity?), but SQLite is not a > strictly typed database. So the actual data might not match what they're > expecting; by going the route you've chosen the exact way you pick up these > errors might be quite far from the source of the error. > This is the thing that I need to get a perfect hold of. When could it happen that 'data might not match what they're expecting'. A small description of the application context and where and how SQLite kicks in. The usage of SQLite is completely wrapped into functions specific for each query being designed and executed, and data from query is returned in typed structures to the callers. Callers are ALWAYS local to the application domain, and there is no possibility (except in the case of poor programming and security, which can always be) for the users (either application users or 'library' users) to try and execute a custom written SQL statement. The make a long story short, I am developing the full stack of datatypes (C structs and/or c++ classes), functions, queries and their middle function calls that define which services are available to access data contained in our SQLITE database. I will try to place here a simple example, supposing we have a simple table as the following: CREATE TABLE VALUES ( id INTEGER PRIMARY KEY ASC, value_int INTEGER, value_double REAL, value_text TEXT, ); and that I want to execute the following simple query: SELECT * FROM VALUES; For this very specific query, are there circumstances where the call to sqlite3_column_type, for column 'value_int', returns a result different from SQLITE_INTEGER, and/or the union located at the base address of Mem returned by columnMem, is being filled as double or even worse, the whole Mem structure is filled as 'TEXT'? (We can rotate the question for the 3 value_XXX fields). Usage of SQLITE functions or casts in the queried results could very likely tamper with the 'expected' datatype. Is this one case case where the assumption is incorrect? > A sane improvement to the interface would be for the consumer of > custom_sqlite3_exec(...) to pass in an array of types that they are > expecting; checking that the types match inside of your custom function > would be fairly trivial (and performant) and you could have properly > defined behaviour for what happens if the types from the database don't > match the exepcted types. Some ideas are: Use SQLite type coercion (just > call sqlite3_int even if it's a string); fail; or skip the record. > Suggestion Acknowledged :) ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions
I really wish to keep the argument polite and constructive, so please dont get me wrong if I reply, I just want to understand what you are referring to, realize and evaluate which is the best way to go for me, not for the sqlite library, that's why I writing to the sqlite library. Il giorno mar 30 lug 2019 alle ore 15:50 Hick Gunter ha scritto: > f) There are exactly 2 documented functions in your code. Did you not read > their documentation??? > > See https://sqlite.org/c3ref/column_blob.html > > " After a type conversion, the result of calling sqlite3_column_type() is > undefined, though harmless. Future versions of SQLite may change the > behavior of sqlite3_column_type() following a type conversion." > which type conversion are you referring to? the second one erroneously called in the example code? I do understand the 'type conversion' to be one of the six functions listed in the page you linked (which was read). The statement before the one you quoted specifically says: "The return value of sqlite3_column_type() can be used to decide which of the first six interface should be used to extract the column value". Isnt this exactly what I want to be doing? Read the column type from the statement, according to the return value call the relevant extraction function. The documentation describes 'automatic conversions' being performed if I am trying to extract a datatype that is different from the internal datatype. But this is exactly what I want to stay away from, unless, of course I am missing something, which I would very like understand. > b) Breaking encapsulation includes referencing internal .h files in your > own code. You should be using sqlite3.h *only* > I am not referencing ANY internal.h file in my own code. I am just using the amalgamation distribution, and I am writing code in sqlite.c, not in my own source files, so nobody outside of sqlite.c calls any function or uses any datatype that is not declared in sqlite.h. > You do realise that declared types are not enforced in SQLite? If somehow > a real with a non-integer value got inserted into an integer field, the > type of the returned value would still be real, but you would be expecting > integer. And silently reading the bits of the double value as an integer, > which would probalby cause a practically untetectable error. > True, agreed. This is the reason why the types of each value being inserted into, updated and read from the database is strictly enforced outside of sqlite. To state in a cear way: SQLite might not enforce declared types, by the application around sqlite has been specifically designed to enforce different types. > > c) There is no guarantee that the internal fields referring to a previous > state of a Mem structure are preserved across conversions. > why do you refer to a "previous state of a mem structure"? the code is being executed inside a custom sqlite_exec function, which mirrors the behaviour of the provided sqlite_exec function, except for the conversions to/from text (which return the pointer to the internal data type, without any conversion in case of Integer and float value), WAY before calling the statement finalization. The callback itself that must be provided to sql3_exec, is called before finalizing and freeing the memory allocated by the statement execution. > I do hope you are statically linking your modified SQLite code to your > application and not installing it as a shared image. In the latter case, > some unsuspecting application might inadvertently stumble across your > version of the interface, which no longer conforms to the published > interface, and cause failures there. That would probably cause some really > rave reviews. "I just installed XXX on my phone and it died" is not a > viable recommendation. > Again good point, but we are 'lucky' here as well... sqlite amalgamation is compiled into the application and not dinamically linked. I know that what I am doing is hazardous, rings alarm bells, 'dirty' (if you want to call it dirty), I know that it could be done in several different ways, it's just that sqlite3_exec does already an excellent job in wrapping prapared statemets, stepping and finalizing the statement itself; it does return an array of stringified column values, and that's what I would like to avoid (in the case of integers and floats). Looking forward to reading more insights, hints, tips, suggestions, arguments, advice, whatever. Thanks ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions
Thanks for the response you gave me the kind of answer I was looking for. a) you are circumventing the intended interface > I know, I am aware, and that is exactly what I wanted to do; I Wrote to the mailing list to decide if I will really do it or not > b) you are breaking encapsulation, because columnMem returns a pointer to > an internal type, which is useless to you, unless you have made public all > the SQLite internals > The change is not indended for distribution, it is going to be a change in our codebase. The Internal type goes down to a struct having a union at its base address, such union contains either a double or a int64 value, for REAL and INTEGER values respectively (unless I wrongly assume that the internal representation of Mem pointer, is, under specific circumstances (which I havent yet stumpled upon in testing), inconstistent with the type reported by sqlite3_column_type. This means that I have no need to make all SQLite internals public. As I said I already keep track of the base type of each column in my sqlite database; in all the testing performed, I never failed to identify the correct datatype for the pointer. I cannot exclude that there are situation where I will fail, and that is why I posted here. c) you are assuming that type conversion of a Mem type will conserve the > original fields > This is the real 'unknown' to me issue, could you please elaborate a little more? Which are the original fields you are referring to? Which type conversions of a mem type are you referring to? > d) you are duplicating work by calling sqlite3_column_text() twice, unless > the type is numerical > e) you are duplicating work by calling sqlite3_column_type() twice, > instead of using the value returned from the first call > True, will be fixed... I focused too much on trying out the consistency of the change and the lack of regressions. > f) you are using the result of a function call that is documented to be > undefined in the exact context you are using it in > Which is the function you are referring to? which context are you referring to? > IMHO, you would be much better off attempting to master the official > SQLite API. > Opinion acknowledged and taken in serious consideration, Nothing (except eventual development time constraints) prevents me from embracing your advice. Thanks again ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] sqlite3_exec without ubiqitous text conversions
Hello, I have been working on an application based on SQLite for 2 years now, and recently we started running some performance profiling to check if there are areas where we can squeeze some extra performance. SQlite query execution is used almost exclusively through sqlite3_exec, and the implementation of callbacks. One of the areas of interest we identified is that every column is always passed to the callback as a c-char-string. Then we need to convert that string back to the intended data type. Apparently a consistent amount of time is spent doing this. I started looking into SQLite code and realized that int and real values retrieved from database, is actually fecthed as binary value, then converted to string, passed to the callback, and in the callback we convert it back to its original data type (as we keep track of 'true' datatype for each column). I decided to try skipping these conversions, by creating a custom implementation of sqlite3_exec, which does the following in the inside 'step' loop: if (rc == SQLITE_ROW) { azVals = azCols; for (i = 0; i < nCol; i++) { azVals[i] = (char *)sqlite3_column_text(pStmt, i); int col_type = sqlite3_column_type(pStmt, i); switch (col_type) { case SQLITE_INTEGER: case SQLITE_FLOAT: { azVals[i] = (char*)columnMem(pStmt, i); columnMallocFailure(pStmt); break; } default: { azVals[i] = (char *)sqlite3_column_text(pStmt, i); break; } } if (!azVals[i] && sqlite3_column_type(pStmt, i) != SQLITE_NULL) { sqlite3OomFault(db); goto exec_out; } } } Instead of ALWAYS converting to text, in case of INTEGER or FLOAT columns types, I fetch the address of the data through columnMem, put it in the azVals array, and go on. The callback knows which is the expected data type for the column, reads and copies data in the destination variable through a simple assignment. I am aware that forcing a 'typed' pointer into a generic char* pointer (azVals[x]), rings a lot of alarm bells; I am, as well, aware that I could avoid sqlite3_exec, and call sqlite3_step myself. As of now the application is running smoothly and faster as far as satabase access is concerned. I am writing here to get opinions about other potential pitfalls or oversights in this approach, as well as reason why I should not proceed on this path. Thanks in advance ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users