I don't think that patch to apply affinities is a good idea since it will 
usually be mostly useless and will negatively impact performance since one can 
and should assume that the actual author of the VTable knows what they are 
doing, currently SQLite3 does not enforce declared VTable column affinities and 
it should stay that way.  That said, although the definitions of the column 
affinities are stored in the internal schema, it is not trivial to access them 
from a non-builtin extension.

Instead I have once again changed the vsv.c VTable plugin so that it does some 
more useful things.

The below parameters work exactly as they did in the csv.c extension with 
regards to setting the functionality with the single exception that now if you 
specify all of header, schema and columns, the header row will still be 
skipped.  In csv.c if you specified both schema and columns then the header row 
was not skipped even if you specified it existed.

filename=FILENAME is the csv filename in quotes.
data=DATA         provides the data inline.
header=BOOL       will "parse" the first row to get the column names unless a 
SCHEMA is specified.
schema=STRING     allows the specification of the schema to use
columns=N         number of columns in VTable


New parameters:


skip=N 

will skip the specified number of rows at the start of the CSV file, exclusive 
of the header row (ie, skip data rows). If the header, via the header option, 
is specified to exist then it will always be skipped. 


fsep=SEP and rsep=SEP

allow to specify the particular separators to be used.  The defaults are 
fsep=',' and rsep='\n' so the default is the same as the original.  The 
separator can be any single character, or it can be a supported escape code (\t 
\v \n \f for horizontal-tab, vertical-tab, newline and form-feed), or it can be 
a hexadecimal coded byte as in '\x1f'.  Multibyte separators cannot be 
specified.


affinity=AFF 

where the affinity can be NONE, BLOB, TEXT, INTEGER, REAL, NUMERIC with the 
default being NONE.  NONE does nothing different from the existing csv.c VTable 
(unless you also specify validatetext).  However, if you specify an affinity it 
applies to *every* returned value and makes the following changes:
 - BLOB returns all data as BLOB
 - TEXT returns all data as TEXT
 - INTEGER returns anything that looks like an integer as an INTEGER otherwise 
as TEXT
 - REAL returns anything that looks like a number as a REAL (double) otherwise 
as TEXT
 - NUMERIC returns anything that looks like a number as either an INTEGER or 
REAL otherwise as TEXT
   - if the field "looks like an integer" then an INTEGER is returned
   - if the field "looks like a number" then a REAL is returned unless it can 
be converted to an integer
"looks like an integer" effectively means passing the following regex "^ 
*[+-]?\d+ *$"
"looks like a number" effectively means passing the following regex "^ 
*[+-]?(\d+\.?\d*|\d*\.?\d+)([eE][+-]?\d+)? *$"

Processing is limited by the runtime C library so the detection is not quite so 
good as the affinities built into the SQLite3 core.  This means that something 
that looks like an integer may fail to load as an integer because it is too big 
(918273745985736365575984857636253857564363 looks like an integer and the 
SQLite3 core will recognize this and load it as a double.  The C library is not 
so smart and will probably simply return MAX_INT).

Processing is also limited by the compiler.  Some compilers (wave at Microsoft) 
do not treat long double as being more precise than a regular double (ie, 10 or 
16 bytes rather than 8) and treat that simply as a plain old double (some 
compilers, like Intel, need a compilation option to enable long doubles being 
actually longer than a regular double).  It is detected if this is the case 
(sizeof(long dounble)==sizeof(double)) so that only "integers" that can be 
entirely contained in the mantissa are converted to integer, and others will 
remain as double.


validatetext=BOOL

will cause TEXT fields to be validated as containing a valid UTF-8 coding 
sequence (no content check is performed, only the validity of the encoding is 
checked).  If this is turned on together with any type of affinity (ie, other 
than none) then improperly encoded UTF8 text will be returned as a BLOB.  If no 
affinity is in effect (ie, affinity=none) then an error will be thrown rather 
then letting a text field contain invalid data.  This means that fields 
containing embedded nulls (rather than encoded 0 bytes) will not be able to be 
stored as TEXT and will either be BLOB type or throw an invalid encoding error.


nulls=BOOL

when enabled empty fields will be returned as NULL rather than empty strings.  
An empty field means a field with no content (separators are side-by-each).  
Specifically empty strings are not changed.  That is, the middle column in this 
a,,b is a null column but this is not a,"",b

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On
>Behalf Of Hick Gunter
>Sent: Thursday, 6 February, 2020 23:47
>To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
>Subject: Re: [sqlite] [EXTERNAL] Re: Patch: VTable Column Affinity
>Question and Change Request
>
>>Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>Im Auftrag von Dominique Devienne
>>
>>On Thu, Feb 6, 2020 at 9:32 AM Hick Gunter <h...@scigames.at> wrote:
>>> >Of course, it may be that the writer of the VTable should know what
>they are doing and generate a VTable definition that is consistent with
>how their cursor methods return data, however ... this will omit the
>OP_Affinity if no >>column type was specified when the VTable was defined
>and most of the VTable declarations in the existing code that I looked at
>do not specify column affinities in the declarations.
>>>
>>> Very nice. but detrimental for our use case. Please refrain from
>adding this tot he distribution by default.
>>> We are almost exclusively using virtual tables to allow queries
>against our internal data sources, which are C language structs and thus
>strictly typed. The column affinities provided by the VTab
>implementations are used for documentation purposes and the xColumn
>methods always return the same type (calling the "wrong" sqlite3_result
>function is considered a programming error). Coercing the returned value
>to the same type would be just a waste of memory and CPU cycles.
>>
>>+1. I fear what it would do to our app, also making extensive use of
>vtables.
>>Like Gunter mentions, the type is there more for documentation, I'm
>unsure "what havoc this could wreak".
>>
>>Note that our vtables are all read-only, if that matters here. It's
>unclear to me if the above applies to writes only, or also applies to
>reads. If to writes only, then I don't care much at the moment, although
>I might in the future, and would likely prefer seeing the raw value in my
>code, than the result of affinity-coercion. --DD
>
>The patch affects the value returned from the VTable implementation if
>affinities are provided by the create table statement from the xCreate
>method
>
>
>___________________________________________
> Gunter Hick | Software Engineer | Scientific Games International GmbH |
>Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 |
>(O) +43 1 80100 - 0
>
>May be privileged. May be confidential. Please delete if not the
>addressee.
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to