[sqlite] Podcast with Dr Hipp: SQLite history, success and funding

2016-05-16 Thread Eric Rubin-Smith
On Sat, May 14, 2016 at 4:17 PM, Simon Slavin  wrote:

> Those interested in SQLite might like to listen to
>
> 
>

Having spent many years working directly in Richard's code and that of many
hundreds of other programmers, I consider Richard to be by far the best
programmer I have ever come across.  So it is always a pleasure to listen
to him talk about the craft.

I think the world would benefit from Richard spending more time on the
airwaves discussing particular design decisions he has made through the
years, lessons learned, etc.  Some qualified interviewer should get into
the weeds with him (perhaps on SQLite or perhaps some other thing Richard
thinks would be instructive) and try to download as much of Richard's
substantive thought on programming as possible.

Richard: when your contract is up in 2050 maybe you can write us a book.
Sort of a "War As I Knew It" but for programming.  :-)

And if any of you know of other programmer role models of Richard's
caliber, please do share pointers!

Eric


[sqlite] Encrypt the SQL query

2016-02-27 Thread Eric Rubin-Smith

>  at some
> point the encrypted SQL will have to be decrypted before SQLite
> interprets it.  

Perhaps we could achieve some level of obfuscation by "pre-preparing" at 
compile time the set of all statements that the program uses, and storing the 
SQLite prepared statement objects in the program image. Then you'd just do 
argument binding at run time.

That is, you are more-or-less storing the query plan instead of the SQL. 

I guess you'd have to make a bunch of assumptions about the db schema, the 
boundedness of your universe of queries and the like, and perhaps have to hack 
a few values into some relevant structs -- I'm not enough of an expert to know 
the extent of it. 

Would something like that be workable?

Again, even if it is workable, it won't stop a sufficiently motivated and 
skilled reverse engineer.  There are people employed at e.g. Symantec whose 
entire job is to figure out what a program does, despite the efforts of 
world-class adversaries to obfuscate it.  They would probably think my above 
suggestion is adorable, and crack it during a donut break.

In order to evaluate suggestions coherently, we'd really need to know what 
threat you are trying to mitigate. 

Eric


[sqlite] determining is-leap-year in sqlite

2016-02-18 Thread Eric Rubin-Smith
>   select 2, 28 + (CAST(strftime("%j", c.year||"-12-31") AS INTEGER) % 365)

Here you assume that all years have either 365 or 366 days.  Would that it
were so!

Look at the year 1752 -- you may notice something odd happened that
September. :-)
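
For what it's worth, the check itself is a one-liner -- a sketch, valid
only within the proleptic Gregorian calendar that SQLite's date
functions actually implement:

  -- 1 if the year is a leap year, 0 otherwise.  Note that strftime()
  -- knows nothing of 1752; it extends the Gregorian rules backward:
  SELECT CAST(strftime('%j', '2000-12-31') AS INTEGER) = 366;  -- 1
  SELECT CAST(strftime('%j', '1900-12-31') AS INTEGER) = 366;  -- 0
  SELECT CAST(strftime('%j', '1752-12-31') AS INTEGER);        -- 366, not 355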

Eric


[sqlite] 10 minute Avg

2015-11-19 Thread Eric Rubin-Smith
On Thu, Nov 19, 2015 at 11:26 AM, Simon Slavin  wrote:

>
> On 19 Nov 2015, at 4:06pm, Andrew Stewart 
> wrote:
>
> >Would storing the dateTime as Integer make the database and
> indexes smaller?
>
> Yes.  An integer will take a maximum of 8 bytes to store.  A string's
> storage needs at least one byte per character in the string.  And the
> format you're using for date/time stamps is more than 8 characters long.
>

To be pedantic, an integer can take up to 9 bytes.
https://www.sqlite.org/fileformat.html#varint
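
Since the thread is about 10-minute averages, here is a sketch of the
integer-timestamp approach (the table and column names are invented
for illustration):

  -- Hypothetical table storing Unix-epoch seconds as INTEGER:
  CREATE TABLE samples(ts INTEGER NOT NULL, value REAL NOT NULL);

  -- Integer division by 600 collapses each timestamp into its
  -- 10-minute bucket; multiply back up to label the bucket:
  SELECT datetime((ts / 600) * 600, 'unixepoch') AS bucket,
         avg(value) AS avgValue
    FROM samples
   GROUP BY ts / 600
   ORDER BY ts / 600;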

Eric


[sqlite] Schema-less JSON SQLite DB?

2015-07-15 Thread Eric Rubin-Smith
> Given the number of the threads in the list, it seems a
> lot of people want hierarchical data in SQLite :-)

Throwing in another $.02 here, my company had a need to ingest JSON of
various formats into SQLite databases in low-power embedded devices.  We
wrote the utility program described here:
https://www.aterlo.com/wp-content/uploads/2015/04/json2sqlite.pdf

The program works well as long as your JSON's depth is bounded. I would not
claim that it supports schemaless data models, but we would love to hear
suggestions for design improvements.

We would be happy to clean this up for open source publication if there is
enough interest.

Eric


[sqlite] json_* functions in sqlite

2015-04-22 Thread Eric Rubin-Smith
On Wed, Apr 22, 2015 at 2:17 PM, Eric Rubin-Smith  wrote:

>
> On Tue, Apr 21, 2015 at 9:09 PM, Ashish Sharma 
> wrote:
>
>> Many times I store JSON data in sqlite. It will be useful if sqlite came
>> with functions which understand JSON. Presto has a nice set
>> https://prestodb.io/docs/current/functions/json.html
>>
>
> I wrote a little tool that more or less allows the user to ingest JSON
> into an SQLite database, for use in tiny platforms (think a VM running
> inside a low-power home router).  I was considering open-sourcing it.
>
> Below is the man page for the tool.  If there is enough interest, I'll do
> the requisite work to remove dependencies on other bits of my
> infrastructure and publish it somewhere.  Let me know.
>

Apologies for self-replying.  The formatting of my man page copy-paste
looks pretty bad in a lot of places.  Here's a PDF-ized version for easier
reading: https://www.aterlo.com/wp-content/uploads/2015/04/json2sqlite.pdf

Eric


[sqlite] json_* functions in sqlite

2015-04-22 Thread Eric Rubin-Smith
On Tue, Apr 21, 2015 at 9:09 PM, Ashish Sharma 
wrote:

> Hi
>
> Many times I store JSON data in sqlite. It will be useful if sqlite came
> with functions which understand JSON. Presto has a nice set
> https://prestodb.io/docs/current/functions/json.html
>

I wrote a little tool that more or less allows the user to ingest JSON into
an SQLite database, for use in tiny platforms (think a VM running inside a
low-power home router).  I was considering open-sourcing it.

Below is the man page for the tool.  If there is enough interest, I'll do
the requisite work to remove dependencies on other bits of my
infrastructure and publish it somewhere.  Let me know.

Eric

json2sqlite(1)              Aterlo Networks, Inc.              json2sqlite(1)

NAME
   json2sqlite - convert JSON to SQLite database rows

SYNOPSIS
   json2sqlite [OPTION]...

DESCRIPTION
   Read arbitrary JSON from a file (or standard input) and emit some
   SQLite rows.  The form the rows take is dictated by the options.

   The entire run of the program is wrapped in a single SQLite
   transaction that is only committed if we do not encounter any
   critical errors.

   This program's original intent was to be run on small, low-power
   devices.  It can be provided a strict memory cap with the -M flag.

   Input selection
   -i, --input-file=FILE
          Read JSON from FILE.  If FILE is '-', then read from standard
          input.  If no value is passed, then '-' is assumed.

          If you want to read from a file whose literal name is '-',
          you can qualify the path to the file, e.g. './-'.

   -P, --input-path=JSON_PATH
          Specify the path of the object you wish to import.

          The root-level JSON element has the path '.'.  If the
          root-level element is an object, then JSON_PATH may contain a
          hierarchy of key names at which the input array can be found.
          For example, if we have JSON {"a": [...]} then the array
          containing those elements can be imported by saying
          '--input-path .a'.  The leading dot may be omitted.  (This
          leading-dot syntax is just provided to make addressing the
          root-level object look a little nicer on the command-line).

          If left unspecified, then '.' (i.e. the whole document) is
          assumed.

   -T, --input-type=[array|object]
          If 'array' is given, then we import the JSON array that can
          be found at the JSON_PATH specified by --input-path.

          When importing an array, it is assumed that each of the
          elements of the array is itself a JSON object.  These objects
          may either be a simple name->scalar mapping, or they may have
          some more complex nested form.  The objects are usually
          either imported as one row per object, or one row per
          name/value pair, depending on the other arguments to this
          program.

          Otherwise, if 'object' is given, then import a single JSON
          object whose path is the JSON_PATH specified by
          --input-path.  The object is imported just as if it were the
          unique element in a containing array and the array had been
          specified for import via --input-type=array.

          If left unspecified, then 'array' is assumed.

   Output Selection
   -o, --output-database=DATABASE_FILENAME
          Emit rows to the database file DATABASE_FILENAME.  This
          argument is mandatory.

   -t, --output-table=TABLE_NAME
          Emit rows to the given TABLE_NAME.  This argument is
          mandatory.

   Initialization
   -s, --schema-file=SCHEMA_FILE
          Run the SQL statements in SCHEMA_FILE before doing anything
          else.

   -D, --delete-first
          Run 'DELETE FROM <TABLE_NAME>;' before ingesting any records
          (but still within the process's global transaction, so that
          subsequent failures will not lead to data loss).

   Transforming input to output
   Note that the input selection mechanism you chose above with
   --input-type and --input-path has given us a list of objects (where
   the list has length 0, 1, or more).  This is called the 'initial
   result set' in the below.  The following options instruct
   json2sqlite how to convert each of those objects to some list of
   table rows.  There are two overall structures that each object can
   have (nested or flat), and two overall approaches for converting
   those structures to SQLite rows (either one row per name-value pair
   or one row per object).

   Input structure specification

   You must indicate to json2sqlite the structure of the objects in the
   initial result set.  If you don't specify one, then --flat is
   assumed.

Re: [sqlite] write internal blob vs external file

2014-09-15 Thread Eric Rubin-Smith
Dominique Devienne wrote: 

> > Looking at the sqlite web site and mailing lists shows that the SQLite 
> team 
> > has taken a stab at answering the question, "is it faster to read a blob 
> > out of sqlite or out of a file?".  See the links below.  
> > 
> > Does the team have analogous guidance regarding write speeds?  
> > 
> 
> Depends if you care only about the initial write, or also about 
> incremental writes.  

Thanks to those of you who responded already!  I'll give a little 
more flavor for those who are interested.  

My particular application is a cache of large, immutable files.  The 
UPDATE case doesn't apply to me for this particular application.  

I already have a system on top of SQLite's blobs in which I store the 
file data in chunks, one chunk per row.  (I'm downloading the files from
the network, and at the moment am using a version of the Python SQLite 
API that does not have hooks to the incremental blob I/O.)  So I 
download a few megabytes, start the next hunk download in the 
background (using the "requests-futures" Python library, which is 
really sexy BTW), write out the current hunk, commit, continue.  The 
row also has the byte range that it is storing: 

  CREATE TABLE IF NOT EXISTS file( 
  fileID TEXT NOT NULL, 
  start  INTEGER NOT NULL,  -- offset of first byte of chunk 
  end    INTEGER NOT NULL,  -- offset of last byte of chunk 
  bytes  BLOB NOT NULL, 

  CONSTRAINT BlobSize CHECK(length(bytes) == (end-start+1))
  ); 

All requests are for retrieval of a particular range of bytes for a 
particular file.  So I have an index: 

  CREATE INDEX IF NOT EXISTS FileIdxNameStart ON file(
  fileID, 
  start
  ); 

and I create my own little "query plan" in application logic for 
servicing a particular request on the read side.  The read side is 
fast enough for my purposes.  
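
(For the curious: the heart of that little query plan is just an
overlap query against that index.  A sketch, with :lo and :hi the
first and last requested byte offsets:

  SELECT start, end, bytes
    FROM file
   WHERE fileID = :fid
     AND start <= :hi   -- chunk begins at or before the range end
     AND end >= :lo     -- chunk ends at or after the range start
   ORDER BY start;

The (fileID, start) index seeks to the first candidate chunk and the
rest stream out in order.)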

Inserts are slow and deletes are slow.  The disk I am using is capable 
of writing about 6.5MB/sec, but my app + SQLite only hit around 
600KB-1MB/sec under this scheme, depending on some variables.  This is 
approximately half of the overall speed that I expect (because if I use 
'wget' to just write out a downloaded file to disk, I get about 
2MB/sec sustained over hundreds of MB).  I've played with the chunk 
size, the synchronous pragma, the commit interval, the page size pragma 
and the cache size pragma.  

I have not cleared my application of all blame yet, since I have not
taken enough measurements.  I'm considering prototyping a C
implementation that uses incremental blob I/O to write out the file
and is cleverer about the download state machine, but was hoping to 
know from this group if that is a fool's errand (e.g. if you all
have some particular reason for knowing that huge blobs in SQLite
are a priori a bad idea for performance reasons).

Deletes by fileID are really awful compared to a simple filesystem 
delete -- like 2 orders of magnitude worse by my eyeballing, at least
with the default 1024-byte page size.  Again, I think you'd expect it 
to be -- an unlink(2) only has to touch a few pages of disk, whereas 
an SQLite delete of 1GB worth of blob rows I guess goes and touches 
something every page_size bytes (right?).
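
If latency rather than total work is the concern, one mitigation
sketch is to delete in bounded batches, repeating until no rows
remain, so that each transaction stays short:

  DELETE FROM file
   WHERE rowid IN (SELECT rowid FROM file
                    WHERE fileID = :fid
                    LIMIT 1000);

The total page traffic is the same, but other work can interleave.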

So I'm thinking of just storing the data out of the database, perhaps
retaining an SQLite index.  But I'm worried about the safety issues that
arise from that.  E.g. I finish writing my file, I write an index 
entry and commit it, and immediately afterward I get a power failure.
Am I sure that the file bytes are correct?  And so on.  All of those
considerations are right in SQLite's wheelhouse -- I'd hate not to be
able to take advantage of it.


-- 
Eric A. Rubin-Smith



[sqlite] write internal blob vs external file

2014-09-12 Thread Eric Rubin-Smith
Looking at the sqlite web site and mailing lists shows that the SQLite team
has taken a stab at answering the question, "is it faster to read a blob
out of sqlite or out of a file?".  See the links below.

Does the team have analogous guidance regarding write speeds?

I'm also interested in the delete path.  It seems like SQLite doesn't have
much hope of competing with a native filesystem unlink(2) call to delete a
file that is many gigabytes long, for example.  Is that right?

Eric

References:

http://sqlite.1065341.n5.nabble.com/Internal-v-External-BLOBs-td15515.html
http://www.sqlite.org/intern-v-extern-blob.html


Re: [sqlite] Hexadecimal integer literals

2014-07-23 Thread Eric Rubin-Smith
So far no one has raised the idea of using a "big int" layer to implement
proper integer arithmetic past 64 bits.  The fact that it hasn't been
mentioned makes me worry that it's a blatantly silly idea for SQLite for
some reason -- but I'm tossing it out there on the off chance that it's
useful.

Personally, I would have loved to be able to use 128-bit integer math a few
weeks ago when I was investigating the idea of implementing a
longest-prefix search capability for IPv6 networks using SQLite's R*Tree.
 I had to implement all the shifting and such in C, which is fine, but it
would have been pretty sexy to have gotten it all done in SQL.

Eric


Re: [sqlite] v3.8.4.3 download

2014-07-10 Thread Eric Rubin-Smith
Eric Rubin-Smith wrote:

> The docs don't get versioned like that AFAICT.  There's just one central
> set of docs.  If behavior changes from one version to the next, that is
> called out with text inside the documentation and in the change logs.
> Some of the more major changes have their own special pages in the docs.

Follow-up.  Looks like I may have been wrong about that?

Look at this link:

http://www.sqlite.org/docsrc/taglist

Poke around there and see if you get what you want.  Don't know if the
maintainers recommend using this or what, but in any case you can see 
the gory change history of the docs there if you want, including what
they did along the path to releasing version 3.8.4.3.

--
Eric A. Rubin-Smith


Re: [sqlite] v3.8.4.3 download

2014-07-10 Thread Eric Rubin-Smith
Dave Wellman wrote:

> Where can I download older versions of sqlite3, specifically v3.8.4.3 (or at
> least v3.8.4.x). I don't need the source code, just the windows binaries.

It looks like we can hack the download URIs to get the version you want.  
Noting from the download page that the windows x86 binary is at

http://www.sqlite.org/2014/sqlite-shell-win32-x86-3080500.zip

we can try just altering the last few characters of that and get:

http://www.sqlite.org/2014/sqlite-shell-win32-x86-3080403.zip

which seems to work.

An alternative is to download the source for your version right out of
the version control system.  Here is the version change history page:

http://www.sqlite.org/changes.html

You can use the "SQLITE_SOURCE_ID" from that history page and stick it 
into a URL like this:

http://www.sqlite.org/src/info/<SQLITE_SOURCE_ID>

e.g. in your case 

http://www.sqlite.org/src/info/a611fa96c4a848614efe899130359c9f6fb889c3

From there, you can click the "ZIP Archive" link to get the source code.

> I'll also need the docs. 

The docs don't get versioned like that AFAICT.  There's just one central
set of docs.  If behavior changes from one version to the next, that is
called out with text inside the documentation and in the change logs.
Some of the more major changes have their own special pages in the docs.

--
Eric A. Rubin-Smith



Re: [sqlite] think I need better error-handling guidance in the C API

2014-07-06 Thread Eric Rubin-Smith
Klaas V wrote: 

> BTW (someone else wrote this): to call a program 'crappy' sounds a wee 
> bit megalomanic...  

:-) Are you denying that crappy programs exist in the world, or are you 
saying that they do exist but that stating something true about them 
is megalomaniacal?  Keeping in mind that I include most of the code 
I write among that which is crappy.  And that the very reason for my 
post was to avoid the very real danger that I am about to undertake the 
creation of more such crappy code (the result of which will no doubt be 
that this unfortunate forum will be subjected to more of my ridiculous 
questions).  

Also, I was trying to be funny.  Which I thought I was making pretty 
clear.  Apparently I need to add some of my jokes, and my writing 
ability in general, to the list of crappy things in the world.  

Even if I *were* a megalomaniac, and even if all those zillion programs 
*were* of perfect quality, that would *still* not be a refutation of my 
very narrow point, which is that the docs are lacking in the specific 
ways that I enumerated.  

If they are not lacking in those ways because I misread the docs or 
missed one of the pages outright, then I'm asking with apologies to be 
corrected and directed to what I missed or misread.  

But people familiar with the library mostly seem to agree with my 
material point.  

So!  Just looking for a ticket, so the relevant material can be added at 
the pleasure and convenience of the docs maintainers.  

-- 
Eric A. Rubin-Smith



Re: [sqlite] think I need better error-handling guidance in the C API

2014-07-03 Thread Eric Rubin-Smith
RSmith wrote:

> I do hope your request is escalated to a ticket.

Yet a third person who completely agrees with me in every way! :-)

How 'bout it, SQLite devs?

> You seem well-versed in these matters, I believe there is a need for a
> documentation specialist in the sqlite dev team, apply online.  :)

Hehe well I'd be more than happy to do that, if I understood the API...

--
Eric A. Rubin-Smith



Re: [sqlite] think I need better error-handling guidance in the C API

2014-07-03 Thread Eric Rubin-Smith
Stephan Beal wrote: 

> While i do fundamentally agree with your points, in practice it's not 
> as difficult as you seem to be making it out to be.  The rule is really 
> simple: any non-0 code is an error unless the APIs specify otherwise, 
> and the step() API does indeed document 2 non-error, non-0 result codes
> (SQLITE_ROW and SQLITE_DONE).  Anywhere else in the API, non-0 means Bad 
> News.  There are relatively few cases (locking comes to mind) where an 
> error returned due to something other than a misuse of the API should be 
> considered recoverable (IMO).  

I don't think that answer is satisfactory.  Some errors are recoverable by 
restarting the transaction.  Others may be recoverable by creating a new 
connection.  Others may be recoverable by blowing away the database file 
and starting over (because the database file is corrupt).

SQLite makes an attempt to detect a great variety of such cases and 
return errors when they do happen.  I would like to know in which cases 
they will return such errors, and to which API calls, so that I can know 
how to handle those cases.  

I am not asking more of SQLite than I would ask of a Linux syscall man 
page.  In fact, I'm asking less -- I just want a ticket opened.  Beggars
can't be choosers :-) and I don't have any claim on the devs' time.

E.g. from open(2): 

ERRORS
   EACCES The requested access to the file is not allowed, or search  per-
          mission  is denied for one of the directories in the path prefix
          of pathname, or the file did not exist yet and write  access  to
          the  parent  directory  is  not allowed.  (See also path_resolu-
          tion(7).)

   EEXIST pathname already exists and O_CREAT and O_EXCL were used.

   EFAULT pathname points outside your accessible address space.

   EFBIG  See EOVERFLOW.

   EINTR  While blocked waiting to complete  an  open  of  a  slow  device
          (e.g.,  a FIFO; see fifo(7)), the call was interrupted by a sig-
          nal handler; see signal(7).
   {etc}

Analogous listings in the SQLite function API pages would cure the issue 
completely.  

> The fact that (literally) millions of applications get written despite 
> any perceived shortcomings in the documentation suggests that the docs are 
> at least "adequate," if not "perfect."  

Your argument is analogous to an argument that the Earth is the center 
of the universe because lots of people think so.  Nevermind what we 
actually observe with our own eyes.  

The docs are *not* adequate afaict for the purpose of determining in which 
situations certain error codes will be raised, or what the SQLite authors 
believe is appropriate subsequent client behavior.

A vague citation to a million anonymous programs of unknown quality is 
not a convincing reason to think otherwise.  And you and I both know 
that a random sample of 1 million programs will contain roughly 999000 
crappy ones.  :-) 

-- 
Eric A. Rubin-Smith



Re: [sqlite] think I need better error-handling guidance in the C API

2014-07-03 Thread Eric Rubin-Smith
I respectfully re-raise my issue.  The "TLDR" is that the C API docs are
critically lacking in specificity in a relatively wide range of areas, as
exemplified below, making it difficult to write correct clients against the
library.

Do the SQLite authors disagree with my below points about the API docs for
sqlite3_step(), sqlite3_exec(), and sqlite3_prepare_v2()?  I think the
criticisms can be extended to other API calls.  If the authors do not
disagree then can we create a ticket against the docs?


On Mon, Jun 30, 2014 at 2:47 PM, Eric Rubin-Smith <eas@gmail.com> wrote:

> I've been using the sqlite Tcl API for about 3 million years, with much
> delight.  I'm venturing now into the C API -- hoping to put sqlite into a
> large monolithic "always on" C++ daemon.  I would like to use the API in a
> way perfectly in line with how the authors intended.
>
> But it's sort of hard to figure out exactly what the authors intended in
> some cases.
>
> Consider the man page for sqlite3_step() (
> http://sqlite.org/c3ref/step.html):
>
> "With the "v2" interface, any of the other result codes
> <http://sqlite.org/c3ref/c_abort.html> or extended result codes
> <http://sqlite.org/c3ref/c_abort_rollback.html> might be returned as
> well.  SQLITE_ERROR <http://sqlite.org/c3ref/c_abort.html> means that a
> run-time error (such as a constraint violation) has occurred.
> sqlite3_step() should not be called again on the VM. More information may
> be found by calling sqlite3_errmsg()
> <http://sqlite.org/c3ref/errcode.html>. With the legacy interface, a more
> specific error code (for example, SQLITE_INTERRUPT
> <http://sqlite.org/c3ref/c_abort.html>, SQLITE_SCHEMA
> <http://sqlite.org/c3ref/c_abort.html>, SQLITE_CORRUPT
> <http://sqlite.org/c3ref/c_abort.html>, and so forth) can be obtained by
> calling sqlite3_reset() <http://sqlite.org/c3ref/reset.html> on the prepared
> statement <http://sqlite.org/c3ref/stmt.html>. In the "v2" interface, the
> more specific error code is returned directly by sqlite3_step()."
>
> I am a big fan generally of the clarity of the sqlite docs.  But this page
> could use another section detailing all the possible result codes' specific
> semantics under sqlite3_step(), and in particular what the client should do
> in those circumstances.  Similar to how your normal linux system call man
> page details such things.
>
> The page bites a bit of that off, as you see above.  But consider e.g.
> (from http://sqlite.org/c3ref/c_abort.html):
>
> #define SQLITE_NOTICE  27   /* Notifications from sqlite3_log() */
>
> Am I supposed to look for that?  What am I supposed to do with it if I get
> it?  How about SQLITE_CANTOPEN?  Seems like that shouldn't be a possibility
> by the time we hit sqlite3_step(), but the docs are silent on the matter.
> So what am I to do?
>
> If I get SQLITE_IOERR, is that always permanent, or do I need to look for
> things like SQLITE_IOERR_LOCK and if that bit is set then rollback and
> retry?
>
>
> sqlite3_exec() has the same issue: it's hard to say from reading the docs
> exactly which of the error codes might be returned, and what the caller
> should do in those cases.
>
> sqlite3_prepare_v2() again has the same issue:
>
> "On success, the sqlite3_prepare() family of routines return SQLITE_OK
> <http://sqlite.org/c3ref/c_abort.html>; otherwise an error code
> <http://sqlite.org/c3ref/c_abort.html> is returned."
>
> Again, can *any* of those error codes be returned?  Which of them is
> permanent and which are temporary in the case of prepare_v2()?
>
> And so on.
>
> Apologies if I have missed some of the docs.  Any pointers, or especially
> example C code that the SQLite devs consider perfectly robust and complete
> API usage, are much appreciated.
>
> Eric
>


Re: [sqlite] think I need better error-handling guidance in the C API

2014-07-01 Thread Eric Rubin-Smith
Keith Medcalf wrote: 

> IF ResultCode == A_OK YipeeKiAiii else OhShitItDidntWork; 
> 
> Seems pretty straightforward to me.  You handle the result codes you 
> know what to do with, 

There is a difference between things I don't personally know how to 
handle and things that the SQLite authors would consider "permanent" 
errors (where "permanent" is appropriately scoped e.g.  to the level of 
the function call, the sqlite3_stmt*, the sqlite3*, or so on).  

My ignorance of the API details is not a justification for writing a 
crappy client.  I would like to cure my ignorance, and the docs are 
insufficient in that regard AFAICT.  Hence my question to this forum.  

> and everything that remains means your program should explode 
> immediately and very very loudly dumping ALL relevant information (ie, 
> error code, statement, inout data, output data, etc.)  

SQLite is not the center of my universe.  The layer using it is not at 
the center of the program in which it resides.  The program I am writing 
is a high-availability, mission-critical system.  Barfing because of an 
SQLite error is unacceptable in my case, especially if the only reason 
the program is barfing is because the programmer who wrote it was a 
dummy (or at least more of a dummy than he should have been:-).  

And whether it barfs is sort of beside the point.  If I restart my 
program and give SQLite the same series of commands, should I expect 
it to fail in the exact same way, or should I expect its behavior 
to change?  Depends of course on the state of the system and on the 
particular series of calls.  If the documentation were sufficient, then 
it would allow me to reach the appropriate conclusion.  

I would first like to know exactly what the *authors* expect to be 
returned in various cases, so that I can code to that expectation.  

> In other words, there is no point checking for error conditions 
> which you cannot handle.  Instead you just die.  Quickly, noisily, 
> and immediately.  There is no point checking for the error message 
> SQLITE3_CPU_REGISTER_HIT_BY_COSMIC_RAY if there is nothing that you can 
> do about it.  Every error that you cannot handle is a fatal error, and you 
> should die accordingly.  

I can handle every kind of error.  It's just a question of how.  I 
don't know how in many cases, because I don't understand the semantics 
of the error codes, in turn because (I think) the documentation is 
insufficient.  It is not healthy to bake such ignorance into the 
application if I can help it -- again, I would rather just cure my 
ignorance (preferably not by digging through the code and reaching
my own flawed conclusions about it).  

-- 
Eric A. Rubin-Smith



[sqlite] think I need better error-handling guidance in the C API

2014-06-30 Thread Eric Rubin-Smith
I've been using the sqlite Tcl API for about 3 million years, with much
delight.  I'm venturing now into the C API -- hoping to put sqlite into a
large monolithic "always on" C++ daemon.  I would like to use the API in a
way perfectly in line with how the authors intended.

But it's sort of hard to figure out exactly what the authors intended in
some cases.

Consider the man page for sqlite3_step() (http://sqlite.org/c3ref/step.html
):

"With the "v2" interface, any of the other result codes
 or extended result codes
 might be returned as
well SQLITE_ERROR  means that a
run-time error (such as a constraint violation) has occurred.
sqlite3_step() should not be called again on the VM. More information may
be found by calling sqlite3_errmsg() .
With the legacy interface, a more specific error code (for example,
SQLITE_INTERRUPT , SQLITE_SCHEMA
, SQLITE_CORRUPT
, and so forth) can be obtained by
calling sqlite3_reset()  on the prepared
statement . In the "v2" interface, the
more specific error code is returned directly by sqlite3_step()."

I am a big fan generally of the clarity of the sqlite docs.  But this page
could use another section detailing all the possible result codes' specific
semantics under sqlite3_step(), and in particular what the client should do
in those circumstances.  Similar to how your normal linux system call man
page details such things.

The page bites a bit of that off, as you see above.  But consider e.g.
(from http://sqlite.org/c3ref/c_abort.html):

#define SQLITE_NOTICE  27   /* Notifications from sqlite3_log() */

Am I supposed to look for that?  What am I supposed to do with it if I get
it?  How about SQLITE_CANTOPEN?  Seems like that shouldn't be a possibility
by the time we hit sqlite3_step(), but the docs are silent on the matter.
So what am I to do?

If I get SQLITE_IOERR, is that always permanent, or do I need to look for
things like SQLITE_IOERR_LOCK and if that bit is set then rollback and
retry?


sqlite3_exec() has the same issue: it's hard to say from reading the docs
exactly which of the error codes might be returned, and what the caller
should do in those cases.

sqlite3_prepare_v2() again has the same issue:

"On success, the sqlite3_prepare() family of routines return SQLITE_OK
; otherwise an error code
 is returned."

Again, can *any* of those error codes be returned?  Which of them is
permanent and which are temporary in the case of prepare_v2()?

And so on.

Apologies if I have missed some of the docs.  Any pointers, or especially
example C code that the SQLite devs consider perfectly robust and complete
API usage, are much appreciated.

Eric


Re: [sqlite] slowish R*Tree performance

2014-06-18 Thread Eric Rubin-Smith
Cory Nelson wrote: 

> Expand the prefix into the full feed:beef::etc 
> 
> Insert into a table (start binary(16), mask_length int) 
> 
> select top 1 binary,length from table where start <= @input order by 
> binary desc 
> 
> Check if the row is inside the range returned.  This will take a single 
> index seek.  

Looking at this again, I do not think the solution is correct.  E.g.  
assume you have populated 10m prefixes, one of which is ::/0.  Assume 
the search key is ::::::: and ::/0 
is the only covering prefix.  Then your scheme will not find ::/0.  And 
simple extensions of your scheme involve searching through the whole set 
to find ::/0.  

-- 
Eric A. Rubin-Smith

Money is the root of all wealth.


Re: [sqlite] slowish R*Tree performance

2014-06-18 Thread Eric Rubin-Smith
Cory Nelson wrote: 

> The phrase you're looking for here is "CIDR block".  

Well, I was avoiding the phrase on purpose :-).  I was worried that 
using another bit of jargon -- one that is even more opaque than 
"prefix" to someone unfamiliar with the space -- did not seem likely to 
help get the idea across.  But since you and this forum probably do not 
have a burning interest in the minutiae of my flawed writing process, I 
press on.  

> The way I'd handle this is something like this: 
> 
> Expand the prefix into the full feed:beef::etc 
> 
> Insert into a table (start binary(16), mask_length int) 
> 
> select top 1 binary,length from table where start <= @input order by 
> binary desc 
> 
> Check if the row is inside the range returned.  This will take a single 
> index seek.  

Um.  This looks, wow, much simpler and better than the R*Tree trick.

I guess the only question is whether the binary search into the 
(traditional) index will cost more than the R*Tree traversal.  In a set 
of 10m records we expect to bounce 23 times in a traditional index, if 
my math is right.  Not sure how that compares to the R*Tree.  
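
For reference, here is the suggestion transcribed into SQLite syntax
(a sketch -- SQLite has no TOP, so LIMIT; :addr is the expanded
16-byte address blob; names are invented):

  CREATE TABLE prefixes(
      start BLOB NOT NULL,    -- expanded 16-byte network address
      len   INTEGER NOT NULL  -- prefix length in bits
  );
  CREATE INDEX prefixIdxStart ON prefixes(start);

  -- Single index seek: the candidate is the greatest start <= :addr.
  SELECT start, len
    FROM prefixes
   WHERE start <= :addr
   ORDER BY start DESC
   LIMIT 1;

The application must then verify that :addr really falls inside the
returned prefix -- and, per my other note in this thread, cope with
covering prefixes that are not the immediate predecessor.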

I'll see if I can get an apples-to-apples performance comparison 
going (and will reply back with the results, in case folks are still 
interested).  

Thank you!

-- 
Eric A. Rubin-Smith

I'm just glad it'll be Clark Gable who's falling on his face and 
not Gary Cooper.
-- Gary Cooper on his decision not to take the leading role in 
   "Gone With The Wind."


Re: [sqlite] slowish R*Tree performance

2014-06-18 Thread Eric Rubin-Smith
Carlos Ferreira wrote: 

> Regarding the R.Tree performance problem, 
> 
> What is the original problem that is causing slow performance in the 
> SQlite R-Tree implementation?  

I was populating my DB with bad data.  In particular, I was first 
choosing a random prefix length, then filling up that number of bits to 
create a random prefix.  I was then just blindly sticking that into the 
database.  

For very short prefix lengths, the chance of an exact collision was 
therefore very high.  E.g.  if prefix length 0 is chosen, then the 
chances of collision are 100%.  And prefix length 0 is chosen 1% of the 
time.  

Collisions on the large bounding boxes are exactly what you don't want, 
because of the way an R*Tree works.  (Collisions in small bounding boxes 
only matter for the rare searches that happen to fall within them.)  

Fixed by checking for the existence of an identical bounding box, and 
removing it if it does exist.  (This population prototype code might 
eventually feed production code, which will find it more useful to have 
INSERT OR REPLACE semantics than INSERT OR IGNORE semantics.)  
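
A sketch of that dedup step, bound against the ipIndex schema quoted
in my original post further down the thread -- in effect a hand-rolled
INSERT OR REPLACE keyed on the bounding box rather than the id:

  DELETE FROM ipIndex
   WHERE minD1 = :minD1 AND maxD1 = :maxD1
     AND minD2 = :minD2 AND maxD2 = :maxD2
     AND minD3 = :minD3 AND maxD3 = :maxD3
     AND minD4 = :minD4 AND maxD4 = :maxD4
     AND minD5 = :minD5 AND maxD5 = :maxD5;

  INSERT INTO ipIndex VALUES(:id, :minD1, :maxD1, :minD2, :maxD2,
                             :minD3, :maxD3, :minD4, :maxD4,
                             :minD5, :maxD5);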

The above, however, raises an interesting point.  Even when exact 
collisions are detected and avoided, the above randomized scheme does 
create a significant "matrioshka" structure that may not be present 
in real-world datasets.  That is, again, there is a 1/128 chance that 
length 0 is chosen.  There is an 8/128 or 6% chance that a prefix of 
length <= 8 is chosen.  I.e.  we are creating a lot of large bounding 
boxes that likely cover smaller ones, and that may or may not reflect
reality.

-- 
Eric A. Rubin-Smith

Computer programs don't like being anthropomorphized.


Re: [sqlite] slowish R*Tree performance

2014-06-18 Thread Eric Rubin-Smith
Simon Slavin wrote: 

> Strongly suspect that although R*Trees produce an elegant solution to 
> your problem, the fact that they're a general case tool will make them too 
> slow to use for something like this.  
> 
> I propose an alternative solution, though I have not tried it and do not 
> have time to try it (sorry).  
> 
> 1) A function which turns a numeric IP address or a block into some 
> standard easy-to-process representation in string form.  Possibly a 
> pair of strings with the first string being an address the second being 
> something indicating the extent of the block, perhaps something like 
> '2001:0db8:8500:::::v::ff00:::::'.
>   
> You could make it shorter by leaving out the colons but my experience is 
> that although this leads to less data stored on disk it doesn't speed up 
> processing by much.  But if you have a great deal of data you might want 
> to do it anyway.  
> 
> 2) A comparator function (maybe a SQLite function, maybe not) which 
> takes two such addresses or blocks and returns a value indicating whether 
> they're identical, whether block A contains block or address B, or 
> neither.  
> 
> The closest I got to the above was when I needed a program which 
> intensively searched and sorted individual IPv4 addresses.  I got best 
> results by defining a SQLite function which converted IP addresses of all 
> formats into 'standard format' where each byte was two hex digits.  All 
> values stored in the database were stored in my 'standard' format.  This 
> allowed easy collation using standard text sorting.  

Thanks for the suggestion, Simon.  

My use case is intensive on the search side, but will incur occasional 
updates to the structure.  No sorting necessary from an application 
perspective.  Perhaps I am being dense, but don't see how your 
representation eases the burden of longest-prefix matching from within 
SQL queries.  

> Everything else turned out faster to implement in my own programming 
> language than it was to build as SQLite functions.  

Yeah, I agree that the performance of a dedicated data structure will be 
far better.  Again, just wondering if I can stretch SQLite to solve this 
problem, because it would be oh-so-nice to leverage all the other stuff 
SQLite gives us.  

-- 
Eric A. Rubin-Smith

:wq


Re: [sqlite] slowish R*Tree performance

2014-06-18 Thread Eric Rubin-Smith
Carlos Ferreira wrote: 

> 1 - There is a data type named IPV6 Address.  2 - There is a table that 
> this data type must be stored in.  (It can be a set of fields, one blob, 
> one string ...)  
> 
> You want to: 
> 
> Given a certain IPV6, find in the database the existent IPV6 record with 
> the longest equal prefix to the given IPV6 value.  

Not quite.  Perhaps you were confused by my (probably unclear) use of 
the word "prefix".  

The data structure contains a set of IPv6 *network prefixes*.  A prefix 
is the first N bits of an IPv6 address and is denoted as an IP address 
with a suffix of (128-N) bits of zeros, along with the length of the 
prefix: 

feed:beef::/32 

(here N==32).  

An IPv6 address is inside this prefix iff its first 32 bits are equal to 
feed:beef.  

Note that one prefix can be contained within another: 

feed:beef:f00d::/48 is fully contained within feed:beef::/32.  We say 
that feed:beef:f00d::/48 is "more specific" than feed:beef::/32.  

The problem (one that comes up quite often, e.g.  in routers) is this: 
given an IP address, find the longest-length prefix containing the 
address.  I.e.  find the most specific prefix containing the address.  

This is different from the problem you stated.  One key thing to note is 
that the two prefixes feed:beef::/32 and feed:beef::/48 are different, 
even though the bits in the address parts are identical.  

> Again, if the problem is as I understood, the simplest solution is ( 
> again I am discussing it as if it could be implemented or available in 
> SQLite..I do not know..)  
> 
> 1 - encode the IPV6 field as a unique blob 2 - Implement an index to 
> this field so that this particular field can be ordered 3 - Make sure that 
> the ordering compare function is a binary incremental compare counting 
> from the left ( in terms of the data...not sure how you will represent 
> it in the field ) 4 - Each time you want to find the closed and longest 
> prefix, you just need to simulate an insert command.  Try to insert the 
> given value.  Instead of inserting, just return the ordered position where 
> it would be inserted.  Check what is the record actually standing in that 
> ordered position...That would be your result!!  

The problem has many solutions outside of SQL.  The most common data 
structure is a "trie" (pronounced "tree" and short for "reTRIEval 
tree"), though it turns out that in many subsets of this problem space 
other data structures turn out to be more efficient.  

The present motivation, however, is to see if we can leverage all 
the ancillary sexiness of SQLite while still getting reasonably fast 
searches within this problem space.  Turns out we can do pretty darn 
well with SQLite in this regard.  

The key part is coming up with an isomorphism between overlapping 
IPv6 prefixes and the overlapping geometric boxes represented in a 
5-dimensional R*Tree.  Not just any isomorphism, but one with the 
property that for prefixes P1 and P2, P1 contains P2 if and only if 
bbox(P1) fully contains bbox(P2).  (It follows that the volume of 
bbox(P1) is larger than the volume of bbox(P2), so you can sort by the 
volumes of all the covering bboxes to find the most specific prefix.  
Though my SQL had a bug in that regard, so treat it with care if you use 
it.:-) 

Again, the 5 dimensions are only needed because the R*Tree's integers 
are only 32 bits wide.  If they were 128 bits wide, you could just 
represent an IPv6 prefix as an interval on the line segment [0, 
2^128-1], and store those intervals in a 1-dimensional R*Tree (which 
works great for IPv4, btw).
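
For concreteness, the IPv4 analogue really is that simple.  A sketch
(the table name is invented; rtree_i32 coordinates are signed, so the
unsigned 32-bit addresses are shifted down by 2^31):

  CREATE VIRTUAL TABLE ip4Index USING rtree_i32(id, minA, maxA);

  -- 10.0.0.0/8 covers the interval [167772160, 184549375]:
  INSERT INTO ip4Index
  VALUES(1, 167772160 - 2147483648, 184549375 - 2147483648);

  -- Longest-prefix match for a similarly shifted address :a --
  -- among the covering intervals, the narrowest is the most specific:
  SELECT id
    FROM ip4Index
   WHERE minA <= :a AND :a <= maxA
   ORDER BY (maxA - minA) ASC
   LIMIT 1;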

-- 
Eric A. Rubin-Smith

I still maintain the point that designing a monolithic kernel in 
1991 is a fundamental error.  Be thankful you are not my student.  
You would not get a high grade for such a design.
-- Andrew Tanenbaum, to Linus Torvalds


Re: [sqlite] slowish R*Tree performance

2014-06-15 Thread Eric Rubin-Smith
On Sun, Jun 15, 2014 at 9:47 AM, Eric Rubin-Smith <eas@gmail.com> wrote:

> Richard Hipp wrote:
>
> > What does this query return?
> >
> >  SELECT count(*) FROM ipIndex
> >   WHERE minD1 <= 1220818432 and 1220818432 <= maxD1
> > AND minD2 <= 2120561472 and 2120561472 <= maxD2
> > AND minD3 <= 1685398080 and 1685398080 <= maxD3
> > AND minD4 <= 1685755328 and 1685755328 <= maxD4
> > AND minD5 <= 538331072 and 538331072 <= maxD5;
>
> Hm, it returns 1645.  This indicates a bug (the max expected value is
> 128).  I'm now highly suspicious of my mathematical reasoning or my
> code.  I'll take a look.  Thanks, Richard!
>


Follow-up for those who are curious.  My program for randomly populating
the database was creating a bunch of identical bounding boxes for
short-length prefixes (i.e. prefixes corresponding to large bounding
boxes).  This made the R*Tree do a bunch of redundant work.  Eliminating
this issue led to a ~30x throughput improvement to ~6k searches per second
on a database with 100k prefixes in it.

After populating the database with 5.7 million such prefixes, we are at a
throughput of about 2.2kTPS.  Not horrible, and not sure I can expect much
more out of SQLite -- but still not good enough for my use case
(unfortunately).  Any further optimization tips are highly welcome.  In the
mean time, I'm going to keep digging.

Thanks again to Richard for pointing me in the right direction.

Eric


Re: [sqlite] slowish R*Tree performance

2014-06-15 Thread Eric Rubin-Smith
Richard Hipp wrote:

> What does this query return?
> 
>  SELECT count(*) FROM ipIndex
>   WHERE minD1 <= 1220818432 and 1220818432 <= maxD1
> AND minD2 <= 2120561472 and 2120561472 <= maxD2
> AND minD3 <= 1685398080 and 1685398080 <= maxD3
> AND minD4 <= 1685755328 and 1685755328 <= maxD4
> AND minD5 <= 538331072 and 538331072 <= maxD5;

Hm, it returns 1645.  This indicates a bug (the max expected value is
128).  I'm now highly suspicious of my mathematical reasoning or my 
code.  I'll take a look.  Thanks, Richard!

Eric



[sqlite] slowish R*Tree performance

2014-06-14 Thread Eric Rubin-Smith
I am exploring a mechanism for finding the longest covering IPv6 prefix for
a given IPv6 address by leveraging SQLite 3.8.5's R*Tree feature.  I'm
getting pretty bad performance and would like to know if I'm doing
something obviously wrong.

A 1-dimensional R*Tree of integers of width 128 bits would be ideal.  But
SQLite R*Trees are only 32 bits wide per dimension.

So I am modeling IPv6 prefixes as a pair of coordinates in 5-dimensional
integer space:

CREATE VIRTUAL TABLE ipIndex USING rtree_i32(
id, -- Integer primary key
minD1, maxD1,
minD2, maxD2,
minD3, maxD3,
minD4, maxD4,
minD5, maxD5
);

CREATE TABLE routeTarget(
id INTEGER PRIMARY KEY,
prefix TEXT NOT NULL,
target TEXT NOT NULL
);

To do this, I have come up with a mapping from an IPv6 prefix to a pair of
coordinates that has the geometric property that we would like: bounding
box 1 contains bounding box 2 if and only if prefix 1 contains prefix 2.
So if a search for bounding boxes containing a particular address's
coordinate returns rows, then each of those rows corresponds to a covering
prefix -- from there, we must simply find the smallest bounding box to find
the longest-matching prefix.

Functionally, this appears to work like a charm.

Performance, unfortunately, leaves much to be desired.  I have seeded this
database with 100k randomly-generated prefixes, and am only able to push
through 250 searches per second.  I am hoping to increase the database size
to ~50m records and would like to hit 50k searches per second.

This is not an unreasonable request based on my hardware, and SQLite has
easily hit this throughput (at least, this order of magnitude in
throughput) in other applications.  For example, the analogue in IPv4
merely requires a 1-dimensional R*Tree, and with that I was able to hit
10kTPS throughput without breaking much of a sweat.

Here is my search query plan:

sqlite> explain query plan SELECT prefix, target FROM routeTarget WHERE id
= (
   ...>SELECT id FROM ipIndex
   ...> WHERE minD1 <= 1220818432 and 1220818432 <= maxD1
   ...>   AND minD2 <= 2120561472 and 2120561472 <= maxD2
   ...>   AND minD3 <= 1685398080 and 1685398080 <= maxD3
   ...>   AND minD4 <= 1685755328 and 1685755328 <= maxD4
   ...>   AND minD5 <= 538331072 and 538331072 <= maxD5
   ...> ORDER BY ((maxD5-minD5)*(maxD4-minD4)*(maxD3-minD3)*
   ...>   (maxD2-minD2)*(maxD1-minD1)) ASC
   ...>LIMIT 1);
0|0|0|SEARCH TABLE routeTarget USING INTEGER PRIMARY KEY (rowid=?)
0|0|0|EXECUTE SCALAR SUBQUERY 1
1|0|0|SCAN TABLE ipIndex VIRTUAL TABLE INDEX 2:B0D1B2D3B4D5B6D7B8D9
1|0|0|USE TEMP B-TREE FOR ORDER BY

I created a profiled SQLite build, and here are the top offenders for a run
on 1000 searches:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 40.00  1.58 1.58   300179 0.01 0.01  sqlite3VdbeExec
  6.33  1.83 0.25 36628944 0.00 0.00  sqlite3VdbeMemMove
  6.08  2.07 0.24 18314472 0.00 0.00  rtreeColumn
  4.05  2.23 0.16  1665952 0.00 0.00  rtreeStepToLeaf
  2.41  2.33 0.10 55549722 0.00 0.00  sqlite3VdbeMemRelease
  2.28  2.42 0.09  1965104 0.00 0.00  sqlite3BtreeMovetoUnpacked
  2.03  2.50 0.08 20187085 0.00 0.00  rtreeNodeOfFirstSearchPoint
  1.90  2.57 0.08 19981424 0.00 0.00  sqlite3VtabImportErrmsg
  1.77  2.64 0.07  1663952 0.00 0.00  sqlite3BtreeDelete
  1.52  2.70 0.06  5297026 0.00 0.00  sqlite3VdbeSerialGet
  1.52  2.76 0.06  1665952 0.00 0.00  btreeMoveto
  1.27  2.81 0.05 29969136 0.00 0.00  numericType
  1.27  2.86 0.05 22092688 0.00 0.00  sqlite3DbFree
  1.27  2.91 0.05 16649521 0.00 0.00  sqlite3_result_int
  1.27  2.96 0.05  5295009 0.00 0.00  moveToRoot
  1.27  3.01 0.05  1844945 0.00 0.00  nodeAcquire
  1.27  3.06 0.05  1665952 0.00 0.00  insertCell
  1.27  3.11 0.05  1663952 0.00 0.00  dropCell
  1.01  3.15 0.04 21214814 0.00 0.00  sqlite3_free
  0.89  3.19 0.04  3335904 0.00 0.00  pager_write
  0.76  3.22 0.03  9991712 0.00 0.00  sqlite3VdbeSerialType
  0.76  3.25 0.03  3335904 0.00 0.00  sqlite3PagerWrite
  0.76  3.28 0.03   903468 0.00 0.00  pcache1Fetch


I believe I have avoided the common pitfalls: everything is within a single
transaction, my cache_size pragma is large, etc.

I note from SQLite trace callbacks that a particular explicit search
results in a great many implicit SQL calls into the underlying R*Tree
tables.  Three hundred or so per search.  I assume this is expected?  It
seems pretty large.  They all look like this:

-- SELECT data FROM 'main'.'ipIndex_node' WHERE nodeno = :1
-- 

[sqlite] lifetime of buffer referred to with SQLITE_STATIC

2014-06-13 Thread Eric Rubin-Smith
If I say sqlite_bind_text(...SQLITE_STATIC), I am promising that the buffer
is going to stick around for a while.  How long am I promising that it will
stick around?  Til the next statement reset()?  Til the statement
finalize()?  Til the database close()?


[sqlite] ALWAYS/NEVER and assert()

2013-06-12 Thread Eric Rubin-Smith
I note that there are some assert() statements spread through the SQLite
code base and not isolated just to the ALWAYS and NEVER macros.  Why did
the authors choose to write, for example,

  assert( pName==0 );

in sqlite3CreateIndex, rather than

  ALWAYS( pName==0 );

?

For the purposes of testing coverage, do the developers compile SQLite with
NDEBUG?


[sqlite] minor typos in lang_corefunc.html's description of abs(X). domain of abs(X)

2013-02-25 Thread Eric Rubin-Smith
"The abs(X) function returns the absolute value of the numeric argument X.
Abs(X) returns NULL if X is NULL. Abs(X) return 0.0 if X is a string or
blob that cannot be converted to a numeric value. If X is the integer
-9223372036854775807 then abs(X) throws an integer overflow error since
there is no equivalent positive 64-bit two complement value."

* The int should be -9223372036854775808.

* Also, "return 0.0" should be "returns 0.0".

* Also, "two complement" should be "two's complement".

Minor typos aside, it's interesting that SQLite is nice enough to convert
integers strictly less than -1 * 2^63 to floats and then take the floating
absolute value.

This means that the only gap in the function's domain is at the sole
integer value -1 * 2^63.  One greater and SQLite gives you a value; one
less and SQLite gives you a value:

sqlite> select abs(-9223372036854775807);
9223372036854775807
sqlite> select abs(-9223372036854775808);
Error: integer overflow
sqlite> select abs(-9223372036854775809);
9.22337203685478e+18

Given that, and given SQLite's general philosophy of being flexible about
its inputs, would it make sense for SQLite to convert -1 * 2^63 to a float
and return the floating absolute value, as it does for (-1 * 2^63) - 1?

I can also see the arguments against that choice: users who were careful to
store a 64-bit integer into a field may be surprised when it gets converted
to a float when passed through abs().  You also have backward compatibility
considerations.
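
In the meantime, a caller-side workaround is easy if you can live with
the float result: route the value through REAL yourself.  A sketch, for
a hypothetical table t with integer column x:

  SELECT abs(CAST(x AS REAL)) FROM t;  -- never throws, even at -2^63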

Thanks as always to drh and team for the awesome tool.

Eric