Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions

2019-07-31 Thread Barone Ashura
Thanks!

This helped me a lot in understanding the red flags.
I think I have read the 'type affinity' documentation pages quite a few
times (including this morning).

On Wed, 31 Jul 2019 at 12:05, Olivier Mascia wrote:

> > On 31 Jul 2019 at 10:03, Barone Ashura wrote:
> >
> > CREATE TABLE "VALUES" (
> > id INTEGER PRIMARY KEY ASC,
> > value_int INTEGER,
> > value_double REAL,
> > value_text TEXT
> > );
> >
> > and that I want to execute the following simple query:
> >
> > SELECT * FROM "VALUES";
> >
> > For this very specific query, are there circumstances where the call
> > to sqlite3_column_type, for column 'value_int', returns a result
> > different from SQLITE_INTEGER,
>
> Yes.
> Nothing stops any piece of your own code, or anyone using the CLI, from
> doing:
>
> INSERT INTO "VALUES"(value_int) VALUES('something');
>
>
> The sqlite3_column_type for the column value_int will tell you SQLITE_TEXT
> because:
>
> See: https://www.sqlite.org/c3ref/column_blob.html
>
> "The sqlite3_column_type() routine returns the datatype code for the
> initial data type of the result column."
>
> The "initial data type" is NOT the one declared in the schema. It is the
> type of the actual stored value in that particular column of that
> particular row.  SQLite is mostly type agnostic regarding storage.  The
> declared type in the schema only serves as a guideline for some automatic
> conversion where applicable.  SQLite calls this: 'affinity'.
>
> See: https://www.sqlite.org/datatype3.html#type_affinity
>
> INSERT INTO "VALUES"(value_int) VALUES(123);
> and
> INSERT INTO "VALUES"(value_int) VALUES('123');
>
> will both actually store an SQLITE_INTEGER, because the column has a
> declared type whose name contains INT and the string '123' can safely be
> converted from text to integer and back.
>
> But INSERT INTO "VALUES"(value_int) VALUES('something');
> will actually store the string 'something' because it can't be converted
> back and forth between text and integer.
>
> —
> Best Regards, Meilleures salutations, Met vriendelijke groeten, Mit besten
> Grüßen,
> Olivier Mascia
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


Re: [sqlite] sqlite3_exec without ubiqitous text conversions

2019-07-31 Thread Barone Ashura
Suggestion Acknowledged :)

On Tue, 30 Jul 2019 at 18:56, [sender omitted] wrote:

> Hi!
>
> I use a simple trick:
>
> A clone of the sqlite3_exec that passes the sqlite3_stmt as an argument
> to the callback function.
>
> So no conversion is made and we can use the sqlite3_column... functions
> directly on the retrieved row.
>
> Happy coding!
>
>
> On 2019-07-30 05:00, sqlite-users-requ...@mailinglists.sqlite.org wrote:
> >
> > Message: 6
> > Date: Tue, 30 Jul 2019 13:13:52 +0200
> > From: Barone Ashura 
> > To: sqlite-users@mailinglists.sqlite.org
> > Subject: [sqlite] sqlite3_exec without ubiqitous text conversions
> > Message-ID:
> >   <cad3lth+nfostqjf50dgsdzuts0exvk7n89amtsjpygx_6b+...@mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > Hello,
> >
> > I have been working on an application based on SQLite for 2 years now,
> > and recently we started running some performance profiling to check if
> > there are areas where we can squeeze some extra performance.
> >
> > SQLite query execution is done almost exclusively through sqlite3_exec
> > and the implementation of callbacks.
> >
> > One of the areas of interest we identified is that every column is
> > always passed to the callback as a C string. Then we need to convert
> > that string back to the intended data type. Apparently a considerable
> > amount of time is spent doing this.
> >
> > I started looking into the SQLite code and realized that int and real
> > values retrieved from the database are actually fetched as binary
> > values, then converted to strings and passed to the callback, and in
> > the callback we convert them back to their original data type (as we
> > keep track of the 'true' datatype for each column).
> >
> > I decided to try skipping these conversions, by creating a custom
> > implementation of sqlite3_exec, which does the following in the inner
> > 'step' loop:
> >
> > if (rc == SQLITE_ROW) {
> >     azVals = azCols;
> >     for (i = 0; i < nCol; i++) {
> >         azVals[i] = (char *)sqlite3_column_text(pStmt, i);
> >         int col_type = sqlite3_column_type(pStmt, i);
> >         switch (col_type)
> >         {
> >         case SQLITE_INTEGER:
> >         case SQLITE_FLOAT:
> >         {
> >             azVals[i] = (char*)columnMem(pStmt, i);
> >             columnMallocFailure(pStmt);
> >             break;
> >         }
> >         default:
> >         {
> >             azVals[i] = (char *)sqlite3_column_text(pStmt, i);
> >             break;
> >         }
> >         }
> >         if (!azVals[i] && sqlite3_column_type(pStmt, i) != SQLITE_NULL) {
> >             sqlite3OomFault(db);
> >             goto exec_out;
> >         }
> >     }
> > }
> >
> > Instead of ALWAYS converting to text, in the case of INTEGER or FLOAT
> > column types I fetch the address of the data through columnMem, put it
> > in the azVals array, and go on. The callback knows the expected data
> > type for the column, and reads and copies the data into the destination
> > variable through a simple assignment.
> >
> > I am aware that forcing a 'typed' pointer into a generic char* pointer
> > (azVals[x]) rings a lot of alarm bells; I am also aware that I could
> > avoid sqlite3_exec and call sqlite3_step myself.
> > As of now the application is running smoothly, and faster as far as
> > database access is concerned.
> >
> > I am writing here to get opinions about other potential pitfalls or
> > oversights in this approach, as well as reasons why I should not
> > proceed on this path.
> >
> > Thanks in advance
> >


Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions

2019-07-31 Thread Barone Ashura
> Why do it this way?
>
> Why not write your own custom_sqlite3_exec(...) that uses the standard,
> stable, documented interfaces?


Because sometimes I take wrong turns despite my best intentions :)
But I can usually smell it when I am taking a wrong turn, and that is why I
came here asking for advice, insights and enlightenment, so I can make a
U-turn, go back to the crossroads, and take another direction :)



> custom_sqlite3_exec(...) could call prepare / step / finalize, and use the
> standard sqlite3_column_* interfaces to fill a result array. This would be
> very little work and could be a drop in replacement of sqlite3_exec. As a
> bonus, you can have a proper signature of void* for your callback function
> (since it's no longer receiving an array of strings, char** is a lie which
> will confuse anyone working on the project in future). Another
> maintainability bonus: The name will alert any consumers that this isn't
> part of the standard sqlite3 interface.
>

Defining a new callback was one of the options for further development.
The differently typed callback would certainly take void*, for the reason
you mentioned.


> An assumption you've made may be incorrect: You say that callers know what
> data type to expect so that they know how to cast the results (I assume
> this is based on the declared column affinity?), but SQLite is not a
> strictly typed database. So the actual data might not match what they're
> expecting; by going the route you've chosen the exact way you pick up these
> errors might be quite far from the source of the error.
>

This is the thing that I need to get a firm grasp of: when could it happen
that 'data might not match what they're expecting'? Here is a short
description of the application context and of where and how SQLite comes in.
The usage of SQLite is completely wrapped in functions specific to each
query being designed and executed, and the data from each query is returned
to the callers in typed structures. Callers are ALWAYS local to the
application domain, and there is no possibility (except in the case of poor
programming or security, which can always happen) for users (either
application users or 'library' users) to execute a custom-written SQL
statement. To make a long story short, I am developing the full stack of
datatypes (C structs and/or C++ classes), functions, queries and the
intermediate function calls that define which services are available to
access the data contained in our SQLite database.

I will try to give a simple example here. Suppose we have a simple table
like the following:

CREATE TABLE "VALUES" (
id INTEGER PRIMARY KEY ASC,
value_int INTEGER,
value_double REAL,
value_text TEXT
);

and that I want to execute the following simple query:

SELECT * FROM "VALUES";

For this very specific query, are there circumstances where the call to
sqlite3_column_type, for column 'value_int', returns a result different from
SQLITE_INTEGER, and/or the union located at the base address of the Mem
returned by columnMem is filled as a double or, even worse, the whole Mem
structure is filled as 'TEXT'? (The same question applies to each of the 3
value_XXX fields.)

Using SQLite functions or casts in the queried results could very likely
change the 'expected' datatype. Is this one case where the assumption is
incorrect?


> A sane improvement to the interface would be for the consumer of
> custom_sqlite3_exec(...) to pass in an array of types that they are
> expecting; checking that the types match inside of your custom function
> would be fairly trivial (and performant) and you could have properly
> defined behaviour for what happens if the types from the database don't
> match the expected types. Some ideas are: Use SQLite type coercion (just
> call sqlite3_column_int even if it's a string); fail; or skip the record.
>

Suggestion Acknowledged :)


Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions

2019-07-30 Thread Barone Ashura
I really wish to keep the discussion polite and constructive, so please
don't get me wrong if I reply. I just want to understand what you are
referring to, and to work out which is the best way to go for me, not for
the SQLite library; that's why I am writing to the SQLite mailing list.

On Tue, 30 Jul 2019 at 15:50, Hick Gunter wrote:

> f) There are exactly 2 documented functions in your code. Did you not read
> their documentation???
>
> See https://sqlite.org/c3ref/column_blob.html
>
> "After a type conversion, the result of calling sqlite3_column_type() is
> undefined, though harmless. Future versions of SQLite may change the
> behavior of sqlite3_column_type() following a type conversion."
>

Which type conversion are you referring to? The second one, erroneously
called in the example code? I understand a 'type conversion' to be a call to
one of the six functions listed in the page you linked (which I did read).
The statement before the one you quoted specifically says: "The return
value of sqlite3_column_type() can be used to decide which of the first six
interface should be used to extract the column value". Isn't this exactly
what I want to be doing? Read the column type from the statement and,
according to the return value, call the relevant extraction function. The
documentation describes 'automatic conversions' being performed if I try to
extract a datatype that is different from the internal datatype.
But this is exactly what I want to stay away from, unless, of course, I am
missing something, which I would very much like to understand.


> b) Breaking encapsulation includes referencing internal .h files in your
> own code. You should be using sqlite3.h *only*
>

I am not referencing ANY internal .h file in my own code. I am just using
the amalgamation distribution, and I am writing code in sqlite3.c, not in my
own source files, so nobody outside of sqlite3.c calls any function or uses
any datatype that is not declared in sqlite3.h.



> You do realise that declared types are not enforced in SQLite? If somehow
> a real with a non-integer value got inserted into an integer field, the
> type of the returned value would still be real, but you would be expecting
> integer. You would then be silently reading the bits of the double value as
> an integer, which would probably cause a practically undetectable error.
>

True, agreed. This is the reason why the type of each value being inserted
into, updated in and read from the database is strictly enforced outside of
SQLite. To state it clearly: SQLite might not enforce declared types, but
the application around SQLite has been specifically designed to enforce
them.
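To make the quoted warning concrete, the snippet below shows what
"reading the bits of the double value as an integer" amounts to. It is a
standalone sketch that does not touch SQLite at all: bits_as_int64 is a
hypothetical helper, and it assumes a platform with 64-bit IEEE 754
doubles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterprets the bytes of a double as a 64-bit integer. This is the
 * effect of reading the integer member of a union whose double member
 * was the one actually written. */
static int64_t bits_as_int64(double d)
{
    int64_t i;
    assert(sizeof i == sizeof d);   /* assumption: 64-bit doubles */
    memcpy(&i, &d, sizeof i);       /* well-defined way to copy the bits */
    return i;
}
```

On such a platform bits_as_int64(1.5) yields 4609434218613702656 (hex
0x3FF8000000000000) rather than 1 or 2. The value is a perfectly valid
integer, which is why the error would be so hard to notice downstream.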


>
> c) There is no guarantee that the internal fields referring to a previous
> state of a Mem structure are preserved across conversions.
>

Why do you refer to a "previous state of a Mem structure"? The code is
executed inside a custom sqlite3_exec function, which mirrors the
behaviour of the provided sqlite3_exec function, except for the conversions
to/from text (it returns the pointer to the internal data type, without
any conversion, in the case of integer and float values), WAY before the
statement is finalized. The callback that must be provided to
sqlite3_exec is itself called before the statement is finalized and the
memory allocated by the statement execution is freed.


> I do hope you are statically linking your modified SQLite code to your
> application and not installing it as a shared image. In the latter case,
> some unsuspecting application might inadvertently stumble across your
> version of the interface, which no longer conforms to the published
> interface, and cause failures there. That would probably cause some really
> rave reviews. "I just installed XXX on my phone and it died" is not a
> viable recommendation.
>

Again a good point, but we are 'lucky' here as well... the SQLite
amalgamation is compiled into the application and not dynamically linked.

I know that what I am doing is hazardous, rings alarm bells and is 'dirty'
(if you want to call it that), and I know that it could be done in several
different ways. It's just that sqlite3_exec already does an excellent job
of wrapping prepared statements, stepping and finalizing the statement
itself; it does return an array of stringified column values, though, and
that's what I would like to avoid (in the case of integers and floats).

Looking forward to reading more insights, hints, tips, suggestions,
arguments, advice, whatever.

Thanks


Re: [sqlite] [EXTERNAL] sqlite3_exec without ubiqitous text conversions

2019-07-30 Thread Barone Ashura
Thanks for the response
you gave me the kind of answer I was looking for.


> a) you are circumventing the intended interface
>

I know, I am aware, and that is exactly what I wanted to do; I wrote to the
mailing list to decide whether I will really do it or not.


> b) you are breaking encapsulation, because columnMem returns a pointer to
> an internal type, which is useless to you, unless you have made public all
> the SQLite internals
>

The change is not intended for distribution; it is going to be a change in
our codebase. The internal type boils down to a struct having a union at its
base address; that union contains either a double or an int64 value, for
REAL and INTEGER values respectively (unless I wrongly assume this, and the
internal representation behind the Mem pointer is, under specific
circumstances which I haven't yet stumbled upon in testing, inconsistent
with the type reported by sqlite3_column_type). This means that I have no
need to make all SQLite internals public. As I said, I already keep track of
the base type of each column in my SQLite database; in all the testing
performed, I never failed to identify the correct datatype for the pointer.
I cannot exclude that there are situations where I will fail, and that is
why I posted here.

> c) you are assuming that type conversion of a Mem type will conserve the
> original fields
>

This is the real 'unknown' issue to me; could you please elaborate a little
more? Which are the original fields you are referring to? Which type
conversions of a Mem type are you referring to?


> d) you are duplicating work by calling sqlite3_column_text() twice, unless
> the type is numerical
> e) you are duplicating work by calling sqlite3_column_type() twice,
> instead of using the value returned from the first call
>

True, will be fixed... I focused too much on trying out the consistency of
the change and the lack of regressions.


> f) you are using the result of a function call that is documented to be
> undefined in the exact context you are using it in
>

Which function are you referring to? Which context are you referring to?


> IMHO, you would be much better off attempting to master the official
> SQLite API.
>

Opinion acknowledged and taken into serious consideration. Nothing (except
possible development time constraints) prevents me from embracing your
advice.

Thanks again


[sqlite] sqlite3_exec without ubiqitous text conversions

2019-07-30 Thread Barone Ashura
Hello,

I have been working on an application based on SQLite for 2 years now, and
recently we started running some performance profiling to check if there
are areas where we can squeeze some extra performance.

SQLite query execution is done almost exclusively through sqlite3_exec and
the implementation of callbacks.

One of the areas of interest we identified is that every column is always
passed to the callback as a C string. Then we need to convert that
string back to the intended data type. Apparently a considerable amount of
time is spent doing this.

I started looking into the SQLite code and realized that int and real values
retrieved from the database are actually fetched as binary values, then
converted to strings and passed to the callback, and in the callback we
convert them back to their original data type (as we keep track of the
'true' datatype for each column).

I decided to try skipping these conversions, by creating a custom
implementation of sqlite3_exec, which does the following in the inner
'step' loop:

if (rc == SQLITE_ROW) {
    azVals = azCols;
    for (i = 0; i < nCol; i++) {
        azVals[i] = (char *)sqlite3_column_text(pStmt, i);
        int col_type = sqlite3_column_type(pStmt, i);
        switch (col_type)
        {
        case SQLITE_INTEGER:
        case SQLITE_FLOAT:
        {
            azVals[i] = (char*)columnMem(pStmt, i);
            columnMallocFailure(pStmt);
            break;
        }
        default:
        {
            azVals[i] = (char *)sqlite3_column_text(pStmt, i);
            break;
        }
        }
        if (!azVals[i] && sqlite3_column_type(pStmt, i) != SQLITE_NULL) {
            sqlite3OomFault(db);
            goto exec_out;
        }
    }
}

Instead of ALWAYS converting to text, in the case of INTEGER or FLOAT column
types I fetch the address of the data through columnMem, put it in the
azVals array, and go on. The callback knows the expected data type
for the column, and reads and copies the data into the destination variable
through a simple assignment.

I am aware that forcing a 'typed' pointer into a generic char* pointer
(azVals[x]) rings a lot of alarm bells; I am also aware that I
could avoid sqlite3_exec and call sqlite3_step myself.
As of now the application is running smoothly, and faster as far as database
access is concerned.

I am writing here to get opinions about other potential pitfalls or
oversights in this approach, as well as reasons why I should not proceed on
this path.

Thanks in advance