Re: [sqlite] Implementing a statement cache

2019-12-21 Thread Roger Binns
On 16/12/2019 13:38, carsten.muencheberg wrote:
> I am working on a generic cache for prepared statements 

I really wish SQLite had a statement cache behind the scenes, so that
devs don't have to keep re-implementing a statement cache.  I would be
delighted to delete the statement cache code in APSW.

> The cache is a simple map from an SQL string to a statement pointer.

Note that you will need SQLite to parse the input string to get the SQL
string.  For example your API could be called with "select 3; select 4;"
and would need to be broken in the middle into two statements.  You need
to get all this stuff right.

I used the same map approach which requires you keeping an in-use flag
for the statement pointer.  This is because you may be supplied the same
SQL twice without the first being released yet.  My implementation only
has one statement per SQL text meaning additional executions of the same
SQL do not use the cache.

There are multiple copies of the SQL text too.  One copy will be in
whatever calls you, you need a copy to use for the key in the map, and
then SQLite internally keeps a third copy.  You can avoid that third
copy by using the v1 prepare method and handling SQLITE_SCHEMA yourself.
 It would be so much better if SQLite had the cache internally.

> 1. When to call sqlite3_reset()? It looks like the safest and easiest
> approach is to call sqlite3_reset() immediately after retrieving a
> statement from the cache.

Do so immediately when you are finished with the statement (eg about to
put it back in the cache).  That will release all the locks etc, as well
as free memory - eg if a binding is a long string or blob.

> Is there any disadvantage in regards to
> concurrency or performance in keeping dozens or hundreds of statements
> alive in a non reset state e.g. SELECT statements which have not stepped
> over all rows?

You will have considerably more memory allocated, in addition to held
locks etc.  Note that cleanup is going to happen.  You could do it all
at the very end, or I prefer to do it as soon as possible to keep the
footprint more compact.  The cache is supposed to be transparent.

> 2. When to call sqlite3_clear_bindings()?

Same thing - the sooner the better.

> but calling
> sqlite3_clear_bindings() can be a safeguard against accidentally
> executing a statement with old values?

If the cache is transparent then you must do so to avoid very hard to
diagnose bugs.

> 3. When to clear the cache?

My implementation has the developer specify the number of entries in the
cache (default 100).  In addition to the mapping between SQL text and a
statement, there is a linked list between the statements tracking least
recently used.  This is a fairly complex combined data structure, and
another reason SQLite should do it (one place to get right).

> I read that in some cases statements are automatically recompiled 

This is not relevant to a statement cache, and if you use the currently
documented APIs it is something you do not need to know or care about ever.

Behind the scenes each statement is transformed into byte code which is
what SQLite runs to perform a query.  This is necessary because you get
a result row at a time, so SQLite has to be able to suspend and resume
execution. https://sqlite.org/vdbe.html

For example the vdbe will mention column 3, and if the table schema has
changed, the same named column could now be column 4.  In the olden
days, if you tried to execute the vdbe SQLite would detect it was out of
date, return an error code (SQLITE_SCHEMA) and the developer would have
to reprepare the statement.  Now SQLite keeps a copy of the SQL text and
does the reprepare internally and transparently.

> 4. Other ideas, comments?

Keep asking the SQLite team to make an internal SQLite statement cache.
 I'd be happy to call different APIs even.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite3_complete and comments

2018-10-14 Thread Roger Binns
I use sqlite3_complete in my shell in order to determine when a complete
statement has been input and can be run.  (Otherwise a continuation
"sqlite> " prompt is shown.)

If the line entered is:

   -- hello

Then the sqlite shell does not issue a continuation and "executes" the
text.  However sqlite3_complete does not say that line is complete so
the command line shell has extra logic to figure this out.  What is the
right way of considering line comment complete as the SQLite shell?

These lines get True from sqlite3_complete:

  select 3; --
  select 3 /* */ ;

And these get False:

  select 3
  --
  -- ;



Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best practices for forward conversion of database formats?

2018-08-30 Thread Roger Binns
On 30/08/18 09:51, Randall Smith wrote:
> is how to convert existing data from one DB format to another, given some 
> arbitrary set of changes in the database schema in the interim.

I use SQLite's user pragma.  It starts at zero.

https://sqlite.org/pragma.html#pragma_user_version

My code ends up looking like this:

if user_version==0:
CREATE TABLE IF NOT EXISTS events(key, time, message);
PRAGMA user_version=1;

if user_version==1:
CREATE INDEX IF NOT EXISTS [events:message] ON events(message);
PRAGMA user_version=2;

if user_version==2:
ALTER TABLE events ADD COLUMN severity;
PRAGMA user_version=3;

This ensures that the currently running code will upgrade the schema as
needed.  Ensure the commands are wrapped in a transaction so they either
completely happen or not.

I am helped by having low complexity schemas.  If yours are large you
could probably generate something like the above.  Some ORM style
engines also have schema and data upgrade functionality.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Size of the SQLite library

2018-06-06 Thread Roger Binns
On 06/06/18 09:24, Bob Friesenhahn wrote:
> A local tool which makes it easy to configure sqlite from local files
> sounds useful ...

It already exists.  It is what the SQLite team uses to produce the
amalgamations etc, and is part of the SQLite code base.

> but depending on a "web site" (baby-bird model) ...

Note that behind the scenes the existing tools would be used with the
relevant results zipped up and downloadable.  No one is advocating
getting rid of the command line tools, just a web front end.

> There is already far too much dependence on what what
> happens to get served up at the time and too much dependence on a live
> connection to the "Internet" ...

You and Warren comprehensively describe best practises and why.  You are
both right.  It is what developers *should* do for repeatable reliable
builds.

But it is a lot of friction.  And not every developer follows best
practise.  And developers start out investigating and playing around
with potential solutions, and then adopt the appropriate ones.

If you are trying out a "hello world" quick test, then the best
practises are a lot of friction, and a few web page tickboxes are the least.

The more friction there is, the fewer people will try non-default
configurations.  But that also locks SQLite into a pessimistic legacy
configuration going forward.  For example default enabling STAT4 or
disabling deprecated API could not be done, ever.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Size of the SQLite library

2018-06-06 Thread Roger Binns
On 05/06/18 15:07, Warren Young wrote:
> All right, so include [multi-component source control and build process] ...

I'm still not sure what point you are trying to make.  Yes *you* can do
that.  Should *every* SQLite user who wants non-default options *have*
to go through a similar amount of friction?

SQLite currently only has one distribution.  This distribution has to
fit most user needs regarding backwards and forwards compatibility
(including query plans), functionality, size etc.

*If* SQLite wants to step away from one size/configuration fits most,
then there needs to be way less friction in getting the alternate
configurations.  One solution is a small number of alternate downloads
("presets"), although it is hard to know what configurations they should
have.

That is why I advocate a web site where the user (un)ticks what they
want, and the web site provides a correctly configured download.  This
will also tell the SQLite developers what features are configured.  (eg
if everyone turns off virtual tables that is useful feedback, as would
the opposite.)

> Thus the need for curated collections of build options, since a jQuery UI 
> like tool that assumes the options are all orthogonal would frequently 
> produce unbuildable output.

Huh?  No one is advocating a SQLite web tool that produces unbuildable
output, or offers every possible combination of options.  It would need
to be useful, and can start simple.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Size of the SQLite library

2018-06-05 Thread Roger Binns
On 01/06/18 13:46, Warren Young wrote:
> Your jQuery example later on doesn’t much apply here, for several reasons:

Note that I was showing how the site let you choose whatever features
you want, and then gave you a download matching exactly that.

> 1. JavaScript is a dynamic language, while C is a statically-compiled 
> language.  

Your comments while correct don't actually apply to SQLite.  SQLite is
not a C file.  It is many C source files, many headers, a grammar (not
in C), and various tools (typically TCL).

For example to exclude virtual tables from SQLite, you can't just add a
compile time option and be done.  You have to regenerate from  the
grammar (so it is no longer valid SQL syntax and no longer has calls to
virtual table relevant functions).  And then you almost certainly want
to use the tool to make the amalgamation from the updated grammar.  And
then you need to make sure your Makefile or equivalent passes in the
omit flag too.

The web site doing all that work for you, and getting it right every
time does have value IMHO.  It also makes it easier for SQLite to have
bigger or smaller presets to address the varying developer needs.  And
the team will have some idea of what OMITs are used, where testing
should check etc.

> That means that all of the symbols needed to link the program ...

That was nothing to do with the issue.  To be very clear:

* SQLite has a way of omiting functionality

* Other than a few special cases, you must use the SQLite source (not
the amalgamation) to regenerate what you finally use

* Doing this is difficult and error prone

> Contrast a language like JavaScript, where you can ship a program that has 
> calls to functions that don’t exist, and as long as you continue to not call 
> those functions, the JS VM won’t balk.

You can do lazy runtime linking in some operating systems so functions
to calls that don't exist are ok (until you call them).

But in any event JS code is not distributed how you think.  Minified
source is usually used, and works best if run through dead code
elimination first (called "tree shaking" in the JS world).  ie the
distribution isn't that different to SQLite (amalgamation).

> 2. There are ways around this with C,

My point is that it isn't.  You cannot add / remove defines to the
amalgamation to omit most features.  Heck if you try it just won't
compile.  More work has to be done.  The mailing list archives have many
messages where people tried a few compile flags and it didn't work.

> One could write a variant of cpp that would run on the sqlite3.c amalgamation 
> ...

It won't work for anything grammar related.  And the project has tools
like you describe (eg it is how the amalgamation is produced).  Those
tools are in TCL and know about the structure and coding patterns of
SQLite.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Size of the SQLite library

2018-05-31 Thread Roger Binns
On 31/05/18 10:15, Richard Hipp wrote:
> Size is still important.  But having useful features is important too.
> I'm continuing to work to find the right balance between these
> competing goals.

A pattern used in other projects is to have standard downloads, as well
as custom ones.  With the latter you can include or exclude additional
components and features.  You can already do this with SQLite, but it
requires several more command line tools, programming languages, and
comprehensive reading of the documentation.

Perhaps a custom download web page that gives you some presets
(smallest, default, everything) or lets you choose your own settings.
It would then produce known good source files, and users would be happy.

Here is an example page for a Javascript project:

  https://jqueryui.com/download/

On the balance side, STAT4 is a good example.  I think it would benefit
the majority of SQLite users if it was enabled by default.  But making
only that change could change query plans for existing users.  (Many
users also don't compile SQLite itself - they get binaries from the
platform, or language bindings.)

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite between forks

2018-03-26 Thread Roger Binns
On 26/03/18 13:30, Simone Mosciatti wrote:
> However I fail to see how this can be a problem for purely in-memory
> database.

When a process forks, only the thread that called fork is kept in the
new child process.  Also note that semaphores (and locks in general) are
left in the same state as at the time of the fork call, and are not
reset in the child (forked) process.

A very simple example of a problem would be if a thread in the parent
process is inside a SQLite call and another thread calls fork().  The
currently active locks will remain in the child so SQLite operations
will just block forever in the child.

That can be avoided providing you use no threads, the libraries you use
have no threads, and the libraries those libraries use have no threads.
That will be increasingly unlikely over time.  If you do use threads,
then you would have to make things safe by controlling exactly when
forks can happen (assuming no libraries thwart you).

This is very tedious work that is very difficult to statically check,
let alone verify everything is correct in every possible code path.
Hence the recommendation to not use SQLite across forks.  I address that
by making sure only leaf processes do SQLite work.

For my Python SQLite wrapper, I added a fork checker to help with
testing and make sure you don't use SQLite across forks.  The way it is
implemented is replacing the SQLite mutex operations (there is a table
of them) with ones that check the process id, and then call the original
mutex operation.  In my test of a benchmark where 100% of the code was
doing SQLite operations, there was a slowdown of 1%.

https://rogerbinns.github.io/apsw/apsw.html#apsw.fork_checker

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Linux top command and sqlite

2017-02-21 Thread Roger Binns
On 21/02/17 10:22, Kevin O'Gorman wrote:
> Some of my stuff takes a while to run, and I like to keep tabs on it.
> Right now, I'm running one of those, and the Linux top command shows
> extremely small CPU usage, and a status ("S" column) of "D" which the man
> page defines as "uninterruptable sleep".  Huh?

In this case you can read D as waiting on disk.

Unless you are using multithreading, each query that needs to get data
not already in cache will result in disk activity.  And the code will
not continue until it gets the data from the disk.  The waiting is why
CPU usage is low.

You can use multithreading to allow more concurrent disk I/O although
Python's GIL complicates matters.  You can also tune the caches used by
SQLite.  If your database is smaller than your RAM then you can
prepopulate the OS cache with the database file contents.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Beginning of release testing for version 3.17.0

2017-02-09 Thread Roger Binns
On 08/02/17 11:41, Cecil Westerhof wrote:
> ​OK, glad to help. What should I do?

It is nicest if whatever software/tools you already have also has some
sort of testing (ideally automated, but a manual checklist works too).

Then run the testing with the existing version of SQLite, and repeat
with the new (draft) version.  You should see no differences in the
results from SQLite (correctness).  You may see some performance
improvements, and if very unlucky regressions.

Then report those differences.  Correctness is the most important, but
performance is relevant.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [sqlite-dev] Beginning of release testing for version 3.17.0

2017-02-07 Thread Roger Binns
On 07/02/17 08:56, James K. Lowden wrote:
> I must be having a bad day.  Both Google and cscope fail to turn up any
> reference to NULL_TRIM.  What are you referring to, and what does it
> do?  

I started at the changelog posted at the beginning of this thread which
makes everything clear/linkable:

  https://www.sqlite.org/draft/releaselog/3_17_0.html

Then from the changelog my concerns are if have to wrap any new
functions, and how to test relevant changed/updated parts.  (I'm the
author of a Python wrapper for SQLite.)

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why this query plan?

2017-01-12 Thread Roger Binns
On 11/01/17 16:49, Richard Hipp wrote:
> For years I have threatened to make it a feature of SQLite that it
> really does output the rows in some random order if you omit the ORDER
> BY clause - specifically to expose the common bug of omitting the
> ORDER BY clause when the order matters.

And for years that has been one of the things mentioned in the (closed)
lint mode ticket :-)


https://www.sqlite.org/src/tktview/25e09aa2ab20d67a241b0164ac6818a27ea248d0

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speaking of Ruby & Sqlite...a weekend oddity for someone

2016-12-09 Thread Roger Binns
On 09/12/16 14:09, Don V Nielsen wrote:
> However,
> it fails using the sqlite3 gem. The specific exception is "in
> 'initialize': near "with": syntax error

That will be a SQLite version issue - older SQLite's didn't support
with.  While you made some effort around versions, whatever is happening
there is an older library is actually used.

You can use this to find out exactly what library version is used:

  SELECT sqlite_version(), sqlite_source_id();

Also this will give a list of compilation options, to verify:

  PRAGMA compile_options;

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Interactive shell tab completion for queries

2016-12-09 Thread Roger Binns
On 09/12/16 14:18, Dominique Devienne wrote:
> How did you implement completion in APSW?

[Long answer] at bottom.

You get a callback from the readline or equivalent library when
completion is needed, with it expecting you to provide 0 or more
completions.  You can get the input line, the current token within that
line, and the beginning and end position of the token.

If the line starts with dot, then dot command completion code is called
which is fairly straight forward.

Otherwise SQL completion is used.  There is a cache of database, table,
column, collation, function etc names that is discarded before beginning
input of a new line.  This is because the previously executed line could
have caused changes in any of those.

> Given that you typically don't know the context (tables) involved
> when writing the select clause for example, do you just propose any column
> or declared functions?

I have a branch with an unfinished SQL parser in it, intended to handle
incomplete SQL such as when doing completions.  (Most parser solutions
require complete input, or they error.  What you really want to know is
what kind of tokens or grammar constructs are allowed at the cursor
position.)  Anyway that is not used, yet.[SQLite parser]

I have some code that deals with pragmas since they have a known
construct.  For everything else I just return everything from the cache
that matches the token so far.  Trying to be intelligent without a deep
knowledge parser as described in the previous paragraph isn't really
feasible since virtually anything is allowed anywhere.  Heck try to work
out what isn't allowed immediately after SELECT!  (Allowed includes
database, table, column, function names, some operators, strings,
numbers etc)  Also ponder AS/aliases.

I do make sure that the completions match the case of what was provided
so far: eg SEL has SELECT as completion while sel has select.  Functions
also have the opening parentheses as part of the completion - eg count(
- and closing if no arguments like in random().

It does mean that hitting tab after typing SELECT gives a lot of
choices, but even a few characters is enough to have a short list of
candidates and feel natural.

> There even isn't a way to list declared functions, despite wishes for a
> pragma for it, so can't even
> provide completion in a generic way, especially with dynamically loaded
> modules.

That and lint mode are my regular whines.  I do have a table of builtin
functions in the completion code, but there is no possible way for third
party code to get the function list from SQLite.

[Long answer]
https://github.com/rogerbinns/apsw/blob/master/tools/shell.py#L2550

[SQLite parser] The SQLite parser can't be used either because it
expects complete input, and the rules have code attached that aren't
helpful for this.  I'd dearly love to stay in sync with the SQLite
grammar, but using the canonical grammar file isn't practical.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Interactive shell tab completion for queries

2016-12-09 Thread Roger Binns
On 09/12/16 14:18, Dominique Devienne wrote:
> How did you implement completion in APSW?

[Long answer] at bottom.

You get a callback from the readline or equivalent library when
completion is needed, with it expecting you to provide 0 or more
completions.  You can get the input line, the current token within that
line, and the beginning and end position of the token.

If the line starts with dot, then dot command completion code is called
which is fairly straight forward.

Otherwise SQL completion is used.  There is a cache of database, table,
column, collation, function etc names that is discarded before beginning
input of a new line.  This is because the previously executed line could
have caused changes in any of those.

> Given that you typically don't know the context (tables) involved
> when writing the select clause for example, do you just propose any column
> or declared functions?

I have a branch with an unfinished SQL parser in it, intended to handle
incomplete SQL such as when doing completions.  (Most parser solutions
require complete input, or they error.  What you really want to know is
what kind of tokens or grammar constructs are allowed at the cursor
position.)  Anyway that is not used, yet.[SQLite parser]

I have some code that deals with pragmas since they have a known
construct.  For everything else I just return everything from the cache
that matches the token so far.  Trying to be intelligent without a deep
knowledge parser as described in the previous paragraph isn't really
feasible since virtually anything is allowed anywhere.  Heck try to work
out what isn't allowed immediately after SELECT!  (Allowed includes
database, table, column, function names, some operators, strings,
numbers etc)  Also ponder AS/aliases.

I do make sure that the completions match the case of what was provided
so far: eg SEL has SELECT as completion while sel has select.  Functions
also have the opening parentheses as part of the completion - eg count(
- and closing if no arguments like in random().

It does mean that hitting tab after typing SELECT gives a lot of
choices, but even a few characters is enough to have a short list of
candidates and feel natural.

> There even isn't a way to list declared functions, despite wishes for a
> pragma for it, so can't even
> provide completion in a generic way, especially with dynamically loaded
> modules.

That and lint mode are my regular whines.  I do have a table of builtin
functions in the completion code, but there is no possible way for third
party code to get the function list from SQLite.

[Long answer]
https://github.com/rogerbinns/apsw/blob/master/tools/shell.py#L2550

[SQLite parser] The SQLite parser can't be used either because it
expects complete input, and the rules have code attached that aren't
helpful for this.  I'd dearly love to stay in sync with the SQLite
grammar, but using the canonical grammar file isn't practical.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Interactive shell tab completion for queries

2016-12-09 Thread Roger Binns
On 09/12/16 06:29, daveparr...@tutanota.com wrote:
> I'm writing to ask if it is possible to have table completion for SQL queries 
> in the interactive shell? 

The APSW shell (compatible with the SQLite one) has completion and
colouring:

  https://rogerbinns.github.io/apsw/shell.html

Disclaimer: I am the author

You don't need to know anything about python etc to use it.

Implementing completion is also quite fun if anyone wants to discuss it
further.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Scope of sqlite3_update_hook?

2016-12-04 Thread Roger Binns
On 01/12/16 16:51, Jens Alfke wrote:
> If so, then does that include connections in other OS processes? (I'm looking 
> for a way to detect this.)

You can't get a callback when other processes change the database for
many reasons.  However it is possible to detect if the database has changed:

  https://www.sqlite.org/pragma.html#pragma_data_version
  https://www.sqlite.org/fileformat2.html#file_change_counter

In theory some combination of file change notification from the OS (eg
inotify) and inspecting the database should come close to meeting your
needs.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Pragma to flag unknown pragma?

2016-11-25 Thread Roger Binns
On 22/11/16 16:08, Scott Hess wrote:
> Something like "PRAGMA pedantic_pragma = on" would be much slicker.

SQLite lets you do "natrual" joins too.  I'd argue all this falls under
a lint mode that helps conscientious developers make sure everything is
working correctly under the hood.


https://www.sqlite.org/src/tktview/25e09aa2ab20d67a241b0164ac6818a27ea248d0

Sadly the SQLite developers rejected it.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] problem or not problem around 'begin immediate'

2016-11-25 Thread Roger Binns
On 25/11/16 12:02, Richard Hipp wrote:
> Rather, I presume
> that Python has recently started using the sqlite3_stmt_readonly()
> interface in a new way.

The bigger picture may be helpful.  There is a third party module
developed under the name "pysqlite" which has a long and storied
history.  At some point a copy was folded into Python as a module named
"sqlite3".  There are periodic copies of code changes between the two.

Python has a database API specification named DB-API (PEP 249).  This
mandates common behaviour no matter what the underlying database.
Transactions are expected to be started automatically, committed
automatically (under some circumstances I think), and commit / rollback
are methods on a cursor object.  This presumably matches how Postgres,
Oracle etc function.

Since SQLite doesn't work that way, the pysqlite authors did it
manually.  The execution code would manually parse each statement,
determine what kind of statement it was (makes changes means silently
start a transaction) and behave appropriately.  Needless to say, parsing
statements had various bugs.  Eventually they decided to use
sqlite3_stmt_readonly() instead of parsing, which is how the current
situation arose.  ie the API is used to try and simulate the behaviour
of other databases.

pysqlite does have an option (off by default) to avoid all this silent
transaction stuff.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-21 Thread Roger Binns
On 19/11/16 08:08, Kevin O'Gorman wrote:
> System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
[...]
> System without this problem: Running Ubuntu Linux 14.04.5, Python 3.4.3.

You are good on Python versions then.  My remaining recommendation is to
make the process that does SQLite be a child process (ie no making its
own children).  That will eliminate an entire class of potential
problems, although it appears unlikely you are experiencing any of them.

The final option is to run the process under valgrind.  That will
definitively show the cause.  Do note however that you may want to
change some of the default options since you have nice big systems.  For
example I like to set --freelist-vol and related to very big numbers
(several gigabytes) which ensures that freed memory is not reused for a
long time.  You could also set the valgrind option so that only one
thread is allowed - it will catch inadvertent threading you may note be
aware of.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-19 Thread Roger Binns
On 18/11/16 15:19, James K. Lowden wrote:
> Good catch, Roger.  It's a liability, but I slightly disagree with your
> characterization.  
> 
>> - Running any Python code (destructors can be called which then run in
>> the parent and child)
> 
> Yes, if those destructors affect shared resources.  The OP did say the
> processes on the other side of popen were C programs.  

You missed a subtlety.  The fork call is made from Python code.  The
parent and child keep running Python code.  Eventually the child calls
exec and similar.  In the child process between the fork call return and
the exec call, the Python code that runs can include the destructors for
Python wrapped SQLite objects.

>> - Not having file descriptors closed so the child process trashes them
>> (close_fds Popen argument is False in python 2.x but True in python
>> 3.x).  
> 
> The child process can't "trash" the parent's descriptors.

Lets say the SQLite database is on fd 3 in the parent process.  That
makes it fd 3 in the child process too after the fork.  And if not
closed, it will be fd 3 in whatever gets execed.  This scenario is the
very first one listed in how to corrupt:

 https://www.sqlite.org/howtocorrupt.html

>> Also python 2.x subprocess module is broken in many ways.
> 
> My foray into Unicode in Python convinced me once and for all that
> Python 3 is the only way to go.  But would you care to elaborate on the
> problems with 2.x subprocess?  

The Python 2.x subprocess module is implemented in Python.  It has race
conditions and other problems when the parent is multi-threaded.  They
aren't theoretical, and affected the project I work on.  When debugging
you find yourself in "impossible" situations.  My memory is a bit fuzzy
now, but it was things like file descriptors between parent and child
not being as expected.

Python 3 includes a pure C implementation of code that runs from fork
through exec.  Someone at Google backported that to 2.x, so they must
have been affected too.

  https://github.com/google/python-subprocess32

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-18 Thread Roger Binns
On 18/11/16 08:55, Kevin O'Gorman wrote:
>> I am not.  All of the python code is a single thread.  The closest I come
> is a few times where I use subprocess.Popen to create what amounts to a
> pipeline, and one place where I start a number of copies of a C program in
> parallel, but each is a separate process with its own input and output
> files.

Ding ding ding.  You didn't mention which version of Python.

Popen calls fork (it seems like you are doing Unix/Mac, not Windows).
fork() duplicates the process including all open file descriptors.  One
or more of those descriptors belong to open SQLite databases and
ancillary files.

If the child process does virtually anything, it will result in crashes.
 Examples of doing things include:

- Running any Python code (destructors can be called which then run in
the parent and child)

- Not having file descriptors closed so the child process trashes them
(close_fds Popen argument is False in python 2.x but True in python
3.x).  Also python 2.x subprocess module is broken in many ways.

There are three methods for addressing this assuming it is the cause.

0: Python 3 is safer if you aren't already using it.  The subprocess32
module for Python 2 backports Python 3 subprocess if you are stuck on
Python 2

1: Make sure that the process that does SQLite work is a "leaf".  ie it
doesn't call subprocess or make any child processes.  Instead a master
process feeds it data on its stdin/out

2: Switch to APSW which has a fork checker built in.  This will catch
the fork problem no matter how it happens, and whatever is going on in
other libraries you may use:


https://rogerbinns.github.io/apsw/apsw.html?highlight=fork#apsw.fork_checker

Disclaimer:  I am the author of APSW

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-18 Thread Roger Binns
On 17/11/16 19:14, Kevin O'Gorman wrote:
> SO: I need help bifurcating this problem.  For instance, how can I tell if
> the fault lies in SQLite, or in python? Or even in the hardware, given that
> the time to failure is so variable?

Are you using threads, threading related settings etc in any way?  The
python sqlite3 module (aka pysqlite) is not threadsafe.  That doesn't
stop people "working around" it, which can lead to crashes.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database malformed after 6000 Inserts?

2016-10-04 Thread Roger Binns
On 04/10/16 03:11, Werner Kleiner wrote:
> ... after 6000 records.
> 
> Is there a limitation with huge inserts?

While there may be "Lite" in the name, SQLite copes very well with
"huge" stuff.

That means many many gigabytes in database sizes, many many millions of
rows, up to 2GB per row etc.  6,000 records probably fits in caches, and
isn't remotely close to huge as far as SQLite is concerned.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] client app crashes frequently and points sqlite DLL

2016-09-02 Thread Roger Binns
On 02/09/16 03:11, Frantz FISCHER wrote:
> I'm almost out of tracks to follow. Any idea on what I could check next?

Are you using threads?

Note that in this case whatever is happening is making SQLite be the
victim.  The cause is almost certainly some other C code in your process.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] The upcoming "pi" release of SQLite

2016-07-24 Thread Roger Binns
On 23/07/16 08:16, Richard Hipp wrote:
> Draft change log:  https://www.sqlite.org/draft/releaselog/3_14_0.html

Please please don't make the new trace/profile API expand the SQL by
default.

There are two problems with expanding by default:

- The text no longer matches what the programmer had in their code.  You
can't grep for what trace reports if it is always expanded.  Heck you
can't even tell if two queries are the same except the bound parameters
without fairly sophisticated parsing.  It is best practise to use bound
parameters and it should be encouraged, so penalising that is unhelpful.

- It consumes considerably more memory and cpu, which makes it less
likely to be something you normally use.  That significantly decreases
the value of the feature versus using it most of the time.

Essentially it is a one way un-reversible conversion of what the
programmer wrote, and hence significantly less helpful.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] builtin functions and strings with embedded nul characters

2016-07-01 Thread Roger Binns
On 01/07/16 05:04, Simon Slavin wrote:
> On 1 Jul 2016, at 10:18am, Rob Golsteijn  wrote:
> 
>> For the tests below I assumed that the intention is that a string ends at 
>> the first embedded nul character.
> 
> I'm not sure that this is the intent.
> 
> The idea that null is a terminating character is something that comes with C.

Under the hood SQLite represents blobs and strings the same way - a
sequence of bytes and a length.  A null is not special and if you use
the C apis then what you get back out is exactly what you put in, nulls
and all.  My test suite and code all verify this on every release.

As Rob has found, some of the SQLite builtin functions are a little
sloppy, where they stop at the length of the bytes, or a null -
whichever comes first.  You'll also find that they quite happily operate
on blobs too.  The similarity of the underlying representation is what
makes that happen.

While SQLite could be fixed, there is also the possibility that some
code somewhere in the billions of SQLite deployed instances depends on
this behaviour (probably even unknowingly).

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] page checksums (was Re: Bad db feature request)

2016-06-29 Thread Roger Binns
On 29/06/16 19:13, Scott Robison wrote:
> Given the nature of VFS, it is trivial* for anyone to create a module to
> provide this very functionality. So you can write it yourself!
> 
> *Not really trivial, but probably not horribly difficult either.

VFS is one way you can't reasonably do it.  The VFS is handed full size
pages, so the checksums would have to be stored somewhere other than the
page.  That leads to a *very* complex implementation.

The encryption extension does something like defining SQLITE_HAS_CODEC
and then gets to use a small amount of each page to store information
about the encryption of that page.  Checksums would fit very well into a
similar implementation.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Roger Binns
On 29/06/16 09:45, Drago, William @ CSG - NARDA-MITEQ wrote:
> Aren't there things like that [checksums] already built in to the hard disk 
> controllers (CRC, Reed Solomon, etc.)?

They are at a different level and can only detect issues in what they
see.  For example SQLite can create a page of data, and then hand it off
to the C library which then hands it off to the kernel which then hands
it off to various filesystem drivers which then hand it off to various
block devices which then hand it off over a bus of some sort to the
storage.  If corruption happens at any point before getting to the
storage then the corrupted version is going to be considered correct by
the storage.

Having checksums at the SQLite level means that SQLite can itself verify
that what it wrote (and went through any number of other layers) is what
it gets back.  Short of extremely robust C libraries, operating systems,
drivers, and hardware, SQLite is the sensible place to add checksums.
The "Lite" bit guarantees that SQLite is not run on robust everything,
but usually on less reliable components.  That is why I am somewhat
disappointed the SQLite team doesn't see value in implementing the request.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bad db feature request

2016-06-29 Thread Roger Binns
On 29/06/16 07:51, Dominique Devienne wrote:
> I wish for the day SQLite has page checksums to detect any such random
> corruption.

Agreed.  The SQLite team rejected doing so:

  http://www.sqlite.org/src/tktview/72b01a982a84f64d4284

> Yes, I know, it's a format change, and will likely slow things down a
> little, but it's worth it IMHO.

Note that it isn't as big a change as you think, and could be done
today.  SQLite already allows a portion of each page to be used for
other purposes, with the big user being encryption.

Roger



signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL: no schema after close

2016-06-05 Thread Roger Binns
On 03/06/16 18:56, Gelin Yan wrote:
>APSW is great, I have used it for years. I want to know whether it
> support pypy. I have switched to pypy for a while.

APSW at its heart is very much a CPython extension and uses that C API
to bind to SQLite.  I did port APSW to pypy a few years ago which
required disabling some things (eg the shell) due to missing APIs in
pypy.  However IIRC they didn't provide enough of the C API and while I
was willing and trying to push things forwards, no one involved with
pypy project was interested.  Try following the build instructions with
whatever version of pypy you are using and see what happens.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] WAL: no schema after close

2016-06-03 Thread Roger Binns
On 03/06/16 08:28, Максим Дементьев wrote:
> Thank you, I'll try this "yet another python SQLite wrapper" in the near 
> future, it looks interesting.

Disclosure: I am the author of APSW.

I recommend looking at the page showing the differences between APSW and
pysqlite:

  https://rogerbinns.github.io/apsw/pysqlite.html

  Note: I suggest using APSW when you want to directly use SQLite
  and its functionality or are using your own code to deal with
  database independence rather than DBAPI. Use pysqlite and DBAPI
  if your needs are simple, and you don’t want to use SQLite
  features.

Roger




signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Podcast with Dr Hipp: SQLite history, success and funding

2016-05-16 Thread Roger Binns
On 16/05/16 10:36, Richard Hipp wrote:
> I find that when you are thinking long-term, it changes your
> perspective on which patches land on trunk.

In addition to your (plural) fantastic work, saying yes/no is probably
by far the most important piece.  There are constant calls for things to
be added or changed, often with very good reasoning and benefits.  We
see it frequently on this list.  Saying no is hard, and isn't something
people like doing.

Yet striking the right balance is difficult.  If you don't do enough,
the project can end up left behind by the times.  And if you do too much
it becomes large, hard to use, complicated etc, often driving others to
make a simpler alternative, leaving the project behind.

Thanks!

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] 2 different SQLite versions inside the same process space

2016-05-11 Thread Roger Binns
On 10/05/16 23:43, Dominique Devienne wrote:
> That explains how to avoid symbol mixups, to have two or more SQLite 
> "instances"
> in a single process, but doesn't address the concern about POSIX locks
> DRH mentioned.
> if more than one of those same-process instances access the same file. --DD

As in your last line, it is only a concern if the same file is accessed.
 The OP clearly said:

  Both Addins have their own separate databases and do not share
  any database connections or anything like that.

So the approach will work well for them.

Roger


-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] 2 different SQLite versions inside the same process space

2016-05-10 Thread Roger Binns
On 10/05/16 10:42, Andrey Gorbachev wrote:
> I am a bit worried that the initialisation of 2 different versions of SQLite 
> would interfere with one another. Any advice?

There is a way to do it, and I use it for my Python APSW extension as
the recommended build instructions.  It is especially helpful on MacOS
as system components like Core Data use SQLite, and the loaders tend to
force the system SQLite library to always be loaded.  As a bonus, the
approach below also results in faster code.

What you need to do is create a .c file that near the top has these lines:

#define SQLITE_API static
#define SQLITE_EXTERN  static
#include "sqlite3.c"

That causes all the SQLite non-private symbols to only have the scope of
that .c file, and not leak outside of it.

In the rest of the .c file put your C code that uses the SQLite API, and
it will use the static included version (only).

You are done.  The reason this also gives faster code is that the
compiler can inline the heck out of SQLite code since it knows it won't
be used outside of the compilation unit.  It does sometimes make
debugging interesting though.

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] Incremental backup/sync facility?

2016-05-06 Thread Roger Binns
On 06/05/16 05:32, Stephan Buchert wrote:
> Having the copies of the file synced becomes increasingly tedious
> as their sizes increase. Ideal would be some kind of
> incremental backup/sync facility.

Out of curiousity, would an approach of using multiple databases and
using ATTACH to "unify" them work?  The individual databases could
represent shorter amounts of time (eg a week) and as you state smaller
files are easier to deal with.

Roger


-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] SQLite vs GPLed software

2016-05-02 Thread Roger Binns
On 02/05/16 20:00, Richard Hipp wrote:
> I said I won't participate in that debate and I mean it.  The relative
> merits of GPL vs  whatever is *not* the question on the floor.

Apologies for not being clear.  I am in no way debating the merits of
either, nor expressing my opinion on the merits, goals etc.

However I do think that a popularity contest is not fair since one is
very much *not* about popularity (and was clarifying that).  Just like
SQLite doesn't have a goal of being as many lines of code as possible,
or being a networked solution, or trying to make everything else use its
"blessing" (see the top of each source file for reference).

If you are trying to illustrate the effect and pervasiveness of SQLite,
then perhaps screenshots of Android, iOS, Windows, Mac, Linux etc with
arrows pointing to which apps are using SQLite (eg Mail is using it,
Music Player is using it, Browser is using it).

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] SQLite vs GPLed software

2016-05-02 Thread Roger Binns
On 02/05/16 16:42, Richard Hipp wrote:
> But some of the discussion did get me
> thinking about the extent of GPLed software versus SQLite.

Something that needs be made abundantly clear is the GPL is *not* about
popularity.  The GPL is about freedom (think freedom of speech, not
price).  Even then it is about freedom for end users, not developer focused.

The GPL and FSF will do things that are anti-popularity but are
pro-freedom because the freedom is what matters.  Trying to measure GPL
popularity is about as fair as measuring SQLite by lines of code.

Not everyone agrees with their definitions of freedom, and many think
that popularity is important with some concessions on freedom to get
there.  The FSF has done that carefully (eg LGPL - note leading L).

Excerpt from https://www.gnu.org/philosophy/free-sw.html

 8< 

A program is free software if the program's users have the four
essential freedoms:

The freedom to run the program as you wish, for any purpose (freedom 0).

The freedom to study how the program works, and change it so it does
your computing as you wish (freedom 1). Access to the source code is a
precondition for this.

The freedom to redistribute copies so you can help your neighbor
(freedom 2).

The freedom to distribute copies of your modified versions to others
(freedom 3). By doing this you can give the whole community a chance to
benefit from your changes. Access to the source code is a precondition
for this.

 8< 

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] Recent downloads gone? Bad http returns

2016-04-28 Thread Roger Binns
On 28/04/16 12:56, Richard Hipp wrote:
> I intentionally removed 3.12.0 and 3.12.1 because they can (under
> obscure circumstances) generate incompatible databases.  

I have no problems with removing the links to those downloads, but
removing the downloads themselves seems a bit extreme.  Don't many older
releases have bugs and problems too, that are rare?  (This new approach
seems rather arbitrary and inconsistent.)

I'm also not the only one with tools that work with the downloads.  The
tools can't "read" the site, so they won't know and the non-standard
http error handling makes things even worse.  Essentially tools are
suddenly going to start breaking for end users.  If you are going to
remove very recent downloads, please can you at least send a message to
the announce list so those making tools can proactively fix them, rather
than having end users suddenly broken.

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] Recent downloads gone? Bad http returns

2016-04-28 Thread Roger Binns
I can download 3.12.2 as
http://sqlite.org/2016/sqlite-autoconf-3120200.tar.gz but 3.12.1 and
3.12.0 are now gone, giving a 404 error.  URLs were
http://sqlite.org/2016/sqlite-autoconf-3120100.tar.gz and
http://sqlite.org/2016/sqlite-autoconf-312.tar.gz which used to work.

Is this intentional?  I need the downloads available to run my test
suite to ensure compatibility with different SQLite versions.

Separately from that the site is doing something very nasty.  For
example retreiving http://sqlite.org/2016/sqlite-autoconf-31202.tar.gz
(note two fewer zeroes on the end) instead of giving the correct 404
(Not Found) error code gives code 302 (temporary redirect) to a not
found page which gives a code of 200.  That is very much the wrong thing
to do since it looks like a success and that content does exist,
although on closer inspection it is a bunch of html instead of a tar
archive.

Roger

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: OpenPGP digital signature
URL: 



[sqlite] Understanding table and database locking mechanism in shared cache mode and normal mode

2016-03-30 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 30/03/16 16:58, Simon Slavin wrote:
> In both modes (whether you're using 'shared cache' or not) use
> either
> 
> https://www.sqlite.org/c3ref/busy_timeout.html

The last time I dealt with shared cache mode, the busy timeout did not
apply.  You had to manually manage the timeout/retries yourself.  This
is by design:

https://sqlite.org/src/tktview/ebde3f66fc64e21e61ef2854ed1a36dfff884a2f

In the APSW doc I recommend against using shared cache mode, as it
doesn't have any benefits except for very small memory systems.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlb8dyEACgkQmOOfHg372QQhhQCbBoKrBu40ZgroyJOPB8WVy4To
hcsAn0f8rx1h+foMBH0r4YVYo3pmc9Nc
=lNHi
-END PGP SIGNATURE-


[sqlite] Reserved column names

2016-03-21 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 21/03/16 03:32, Dominique Devienne wrote:
> SQLite tries to be compatible with non-standard extensions from
> various popular RDBMS', but when a standard alternative exists, it
> should be preferred IMHO. --DD

That depends on the code and project.  In my case the code is not
database independent nor is it intended to be.  (If I wanted that I'd
use or reinvent something like SQLAlchemy.)

The code also depends on the SQLite "dynamic typing" feature - that
the type belongs to the value, not the column or variable it is being
stored in.  This matches exactly how Python does typing as well as the
real world data I work with.(*)

The SQLite API also has progress hooks, a transaction model
(savepoints), backup API and numerous other unique to it features.
When using SQLite I use it to the full extent appropriate.

(*) Please don't derail this about typing.  Dynamic typing and strong
typing are not the same thing, although Python has both and SQLite
mostly only has the former.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbwK1sACgkQmOOfHg372QQK2wCfdoUaHyORGq00BmWAOF4r3rdQ
SNYAnAnMR0EB7Ny38bnqrMcGL+MwAoJU
=DeUQ
-END PGP SIGNATURE-


[sqlite] Reserved column names

2016-03-19 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 19/03/16 03:30, Paul Sanderson wrote:
> I know that keywords can be quoted but I am interested in just
> those that can be used unquoted (even if not advisable)

Out of curiousity, why?

My rule of thumb is to always quote (using square brackets) when the
query is generated by code, and only quote where reasonable when the
query is written by a human.

For the APSW shell the following are all quoted:

* zero length names (yes SQLite allows them)

* if it starts with a digit

* if any non-alphanumeric/underscore present (ascii only alphas ok)

* if in the SQLite list of keywords

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbt2QIACgkQmOOfHg372QRhAQCfc00p/L15AJmx8Zgrr9isuU5H
B3cAnim38/I6S3gNsHQ7WZtJKok+T+sY
=O8pL
-END PGP SIGNATURE-


[sqlite] Changing the default page_size in 3.12.0

2016-03-04 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 04/03/16 07:48, Richard Hipp wrote:
> The tip of trunk (3.12.0 alpha) changes the default page size for
> new database file from 1024 to 4096 bytes. ... This seems like a
> potentially disruptive change, so I want to give you, the user
> community, plenty of time to consider the consequences and
> potentially talk me out of it.

Can I talk you into it instead :-)  My standard boilerplate for new
databases is to set the page size to 4,096 bytes, and to turn on WAL.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbaCegACgkQmOOfHg372QRaLQCg1jC4d3iCqSTLDqLD4Eqsfh4y
SIEAnizgfhlyyFasZng8QpsSrVo6OpD0
=8zgy
-END PGP SIGNATURE-


[sqlite] applyng schema changes to several databases

2016-02-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 17/02/16 03:38, Luca Ferrari wrote:
> The above syntax> does not look familiar to me,

It is made up to be concise and get the point across.

> and in the meantime I've wrapped the user_version pragma get/set
> into a shell script.

You can't do if statements in SQL (beyond CASE), so a pure SQL
solution isn't reasonable.

While you can (heroically) do everything in shell script, I'd
recommend you use a higher level language for easier testing, clearer
semantics etc.  For example any of TCL, Python, Perl, Ruby, PHP will
meet your needs.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbEuLcACgkQmOOfHg372QTd1gCgh0B5UO/Rb6zEnK3US0+V4oWz
GYIAoJX31S2pg73eTtTsKiBT9whwofF/
=Ximu
-END PGP SIGNATURE-


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 17/02/16 08:20, Dave Baggett wrote:
> One issue, though is that I'd have to run two FTS searches to
> search -- one on the disk-based database, and one on the
> memory-based one

You see issues, I see features :-)

The memory based cache would contain the most recently cached "fresh"
information, and probably what is of most interest to the user.  So
the FTS searches for that will be very fast which is great, while
older information is searched at "normal" speeds.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbEoW4ACgkQmOOfHg372QST2wCdGgvbncjSo4B2FDxAZYQye3E0
TlIAoIKR+X4rhdwSUeOD5TVTchA83GT7
=vMsO
-END PGP SIGNATURE-


[sqlite] SQLite tuning advice for caching scenario

2016-02-17 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 17/02/16 06:37, Dave Baggett wrote:
> I'd welcome any suggestions

How about two databases?  Create an in memory database for the cache.
 Then whenever it hits a certain size (eg 64MB) or time passed (eg 5
minutes), copy/move data from the memory database to the persistent
(disk) one.  This ensures the writes to the disk database are in big
chunks.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbEm9AACgkQmOOfHg372QR9rwCgu+MPM+kJEGYlBTzkKRYhHsOu
U98AoOZ4kBue7MV6Q8P9+vkljyJGobVu
=61/4
-END PGP SIGNATURE-


[sqlite] applyng schema changes to several databases

2016-02-16 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 15/02/16 01:01, Luca Ferrari wrote:
> While I'm pretty sure a simple sheel script that will execute,
> file per file, the alter table (within a transaction) will do the
> job I'm wondering if there's a better approach or a more automated
> one.

The way I (and many others) do it is to use the user_version.  It is
an integer that starts out as zero, and can be read or written with
pragma user_version.  Startup code then looks like this:

if user_version()==0 {
BEGIN;
CREATE TABLE ;
CREATE INDEX ;
pragma user_version=1;
COMMIT;
}
if user_version()==1 {
BEGIN;
CREATE TABLE ;
ALTER TABLE .;
pragma user_version=2;
COMMIT;
}
if user_version()==2 {
BEGIN;
DELETE INDEX ;
CREATE TABLE ;
pragma user_version=3;
COMMIT;
}

This way the schema will always end up as desired, even if the program
terminated while updating a schema.  Even a restored backup will
adjust nicely.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbDi78ACgkQmOOfHg372QSgPACgxIpKdSlzUVznymQThe9aqqJM
pGkAnj99zXJbzO1Tm6/uyuIgXCt5jq42
=Y/zw
-END PGP SIGNATURE-


[sqlite] Why SQLite does not use a web-based forum?

2016-02-14 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 14/02/16 17:19, admin at shuling.net wrote:
> Why SQLite does not utilize a web-based forum for all users
> discuss problems? I think that will be more convenient and can help
> more people.

Here you go in several different formats:

  http://dir.gmane.org/gmane.comp.db.sqlite.general

Note that is still using the mailing list, so all users can still
participate.

BTW web based forums tend to have terrible usability around discussion
of technical topics due to poor handling of threading.  Pick whatever
works for you.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlbBNi4ACgkQmOOfHg372QSYjACfbCloKBmKZVYdHpVdgHVOcZHP
bssAn1rodsnBCUiBsS0sC39g3j3EdaGY
=f5Ig
-END PGP SIGNATURE-


[sqlite] Use of __builtin_expect in SQLite

2016-02-08 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/02/16 22:39, Matthias-Christian Ott wrote:
> Amdahl's law is not applicable here and describes a completely
> different problem. SQLite does not involve concurrency.

Amdahl's law very much applies, and doesn't explicitly only involve
concurrency.  It is about relating speedups of individual pieces and
how that affects the whole.  For example:

Lets say that processing a representative query involves ten different
places in the SQLite code, and that each one of those takes about ten
percent of the total execution time.  And then lets say
likely/unlikely speeds up one of those by ten percent.  The overall
whole improvement will then be 1%.  To get an overall improvement of
10% each of the ten different pieces would need to get about 10%
faster.  That would be a huge amount of work, and the nature of each
of those places would have be that they could be sped up that way.

> SQLite does not involve concurrency.

http://sqlite.org/pragma.html#pragma_threads  :-)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAla4u9gACgkQmOOfHg372QScKACeKaDcRUmtllIaCtLrvQOXYAoy
tPsAoNH+TKDtsWsE9XeJHTVwKQ24MjJu
=sPJZ
-END PGP SIGNATURE-


[sqlite] Use of __builtin_expect in SQLite

2016-02-07 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/02/16 00:56, Dominique Pell? wrote:
> I'm curious about the outcome on SQLite benchmarks.

About a year ago I tried them out on some tight code (non-SQLite) that
absolutely had to use less CPU time.  I couldn't get them to make any
difference outside the bounds of measurement error.  Since SQLite has
lots of "tight code" places, the instrumentation would have to help in
most of them to make a difference (Amdahl's law).

Taking a step back, the reasons why it had no measureable effect are
simple.  The processors are getting better at branch prediction,
better at mitigating mispredicted branches, getting even faster
compared to memory.  The compilers are getting better all the time too
at the same kind of things.

It takes a more pervasive effort (in the general case) to see
performance improvements, which is what PGO does.  Note that while it
provides branch information, it also provides far more useful
information about what gets called and how much so that the entire
binary can be re-arranged, such as putting hot functions in the same
pages reducing cache pressure which will have a more measureable effect.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAla4DB4ACgkQmOOfHg372QRzyQCg05Z2zyOjqJ58q0jx367wQhqo
+SAAn1AeeydFIdwvJRwh1P+MrSEX1Pkd
=4nwe
-END PGP SIGNATURE-


[sqlite] Slight problem with sqlite3_compileoption_get

2016-01-24 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 24/01/16 13:12, Bart Smissaert wrote:
>>> I think that unless the argument is going to be altered I have
>>> to pass
> them always ByVal. That wasn't quite right, eg: int
> sqlite3_close(sqlite3*); sqlite3* is not going to be altered, but I
> can see that this should be passed ByRef

I was very much right!  Both cases apply here - modification and item
size versus pointer (mailbox number) size.

The type sqlite3 is a structure in C.  I don't know the exact
corresponding terminology in VB6 but it looks like "Type".  A C
structure is very much like this VB:

  Type Customer
Dim FirstName As String
Dim LastName As String
Dim Id As Long
  End Type

ie it is a name ("Customer") and has various items of data as members,
each with their own type (String, Long etc).

The sqlite3 structure has 79 members (more with some optional
functionality) and occupies about 300 bytes.  Even if none of them
were going to modified, it would be insane to require 79
parameters/300 bytes be passed, versus one pointer to the structure.

It is possible to mark things as "not going to be modified" in C.  The
tag used is "const".  The usage of it varies by code base, as it
wasn't in the original C and it is easy to bypass if you want, plus
various other issues.  SQLite does use it.  In this case it would be:

  int sqlite3_close(const sqlite3*);

And logically, how can closing the database not modify at least one of
the 79 members making up the sqlite3 structure?  const was
deliberately left out for a reason.

> It doesn't work though with complex arguments like this: void
> (*xFunc)(sqlite3_context*,int,sqlite3_value**)

It will if you correct that signature.  Elsewhere in the SQLite code,
sqlite3_context and sqlite3_value were both defined (as structs).  The
one liner in isolation is not valid because it has no idea what they
are, although the site could be bit more forgiving.  Try that again
but put "struct" before them, like so:

  void (*xFunc)(struct sqlite3_context*,int,struct sqlite3_value**);

This particular case is a callback function.  Those can get tricky
even for regular C programmers.

I strongly recommend doing one of those online courses about C.  You
don't need to become an expert, but at least understanding the
concepts will make it *far* easier to translate between the VB world
and C.  C is not a big language, and has a lot of simplicity, brevity
and elegance.  Understanding pointers is a good test of a programmer.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlalVcwACgkQmOOfHg372QS51wCglDbd22FVdsA7pmV2uDlqkIZb
j7EAoODoOlnObarBE45EMtAUNf0xw6eR
=4+b6
-END PGP SIGNATURE-


[sqlite] Slight problem with sqlite3_compileoption_get

2016-01-24 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 24/01/16 02:19, Bart Smissaert wrote:
> You didn't mention it but how arguments are called from VB6 is also
> very important, that is ByVal or ByRef. I think that unless the
> argument is going to be altered I have to pass them always ByVal.

Sort of.  You need to look at the C interface.  ByVal makes a copy of
the value and passes that to the function.  The function can do whatever
it wants to the copy as it won't affect the caller's version.  Byref
passes a pointer to the value in memory.  The called function has to
dereference the pointer to get the value at that location.  It can also
modify the value at that location, affecting the caller.  That ties in
with your rule of thumb.

But they are not interchangeable.  Randomly specifying one or the
other and seeing if it works is not a good idea.  Sometimes you do the
wrong one but can get lucky, or more likely crash/corrupt memory.

As an analogy, it is the difference between handing you a photocopy of
a document versus giving you a mailbox number that has a document
inside.  But realise that a mailbox is very different than what is
inside, and it is especially the case that they can be very different
sizes (eg it could be a big package inside the mailbox with a small
number).  C programmers will use ByRef if they want the item to be
modified, but can also do so if the item is larger since a mailbox
(pointer) number takes less space than the larger item.

In C syntax an integer is written as 'int' while a pointer to an
integer is written as 'int *' (the star is typically pronounced as
pointer).  Your rule should be ByVal when there are no '*' and ByRef
when there are.  The rules are non-obvious when you get more
complicated combinations of types and pointers.  Fortunately there is
a site that turn them into English for you.

  http://www.cdecl.org/

Try the following:

  int x;

  int *x;

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlalMEQACgkQmOOfHg372QS1NACgkBqpEHb4q/XxAMgrfBDe/EMj
6+QAn2qDOgHITU8lrm68DiyIC62g06bb
=I6gu
-END PGP SIGNATURE-


[sqlite] Slight problem with sqlite3_compileoption_get

2016-01-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 23/01/16 17:36, Bart Smissaert wrote:
> What is different though about sqlite3_compileoption_get, so that
> long in the IDL causes the mentioned problems and int doesn't?

Absolutely nothing is different about that API.  If something was then
one of us using it would have noticed by now!

Unfortunately "but the ActiveX can't be registered with Windows."
isn't a useful enough information for us.  You'll need to dig out why
it is giving that message.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlakLQoACgkQmOOfHg372QRa7ACeINkhadBkTMOaDwONJdGkVHfP
wNcAoMercTvhyRrB4ospR4PTg770a6m+
=zLo+
-END PGP SIGNATURE-


[sqlite] Slight problem with sqlite3_compileoption_get

2016-01-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 23/01/16 13:14, Bart Smissaert wrote:
>> I am somewhat confused about what you wrote.
> 
> This has to do with making a .tlb (type library) to access
> sqlite3.dll from a VB6 ActiveX dll.

That much was clear.

> Sofar I have mapped SQLite int with IDL long  ...

That is where you confused me.  C also has a long type!  The C long
type will either be the same size as C int, or bigger[1].  Windows has
a long legacy, and that is how you ended up in this situation.  On 16
bit Windows, int was 16 bits and long 32 bits.  On 32 (and 64) bit
Windows, int and long are both 32 bits.

Because of the 16 bit legacy of Visual Basic, you'll be getting this
advice about ints and longs and compatibility.  On Win32 where SQLite
says "int" it means a signed 32 bit number.

Note you can get lucky with mismatches.  (I won't bore you with
details about promotion to int, caller vs callee cleanup, how little
endian helps with the luck).

> .. and that is all working fine, except for
> sqlite3_compileoption_get. Instead here int or byte works fine.

File byte under "lucky" above.

sqlite3_compileoption definitely takes a 32 bit integer as its only
parameter.  If "long" in your idl also maps to a 32 bit integer, then
there is something else going on in your diagnosis of "working fine"
:-)  Sadly you'll need to figure that one out.

[1] There are rules about what standards conforming compilers are
allowed to do, there is practise over what they actually do, the
implementors of systems do various things (often for historical or
"compatibility" reasons).  This also spills over into ABIs and calling
conventions (eg exactly how types of parameters and return values are
passed on certain CPU architectures).  The list of considerations go
on and on.  It looks like Keith would be happy to discuss them :-)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlakJLQACgkQmOOfHg372QS7SACfboJV/o1apKA3q5UInT5sOY6/
NUsAn2UbTS1004P5QnpJGRQcCTASMJaI
=LMtI
-END PGP SIGNATURE-


[sqlite] Slight problem with sqlite3_compileoption_get

2016-01-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 23/01/16 11:41, Bart Smissaert wrote:
> My question is how could I have foreseen this problem, so how could
> I have known that using long is no good here?

I am somewhat confused about what you wrote.  SQLite provides a C
level api.  You can ignore the types and put in alternatives.
Depending on byte order, calling conventions, what the C types
actually mapped to, sizes of types etc it may work when wrong anyway,
or you could get seriously messed up problems.  IOW you are getting
lucky using long instead of int.

> Not from the documentation of sqlite3_compileoption_get: 
> https://www.sqlite.org/c3ref/compileoption_get.html

Could you explain all this again?  As written you seem to want to know
the places where you can ignore the types the C SQLite uses, and
substitute others you feel like using instead.

The answer is don't do that, if you do do that you'll have an
experience anywhere from getting lucky to subtly corrupting things to
crashes.

> Could I see it from the sqlite3.c source? Should this be documented
> somehow better?

The function name, argument and return types in C are generally
referred to as its prototype.  These are usually listed in header
files (extension of .h) and if you look in the SQLite distribution
you'll find a sqlite3.h that contains them all.

However that file is over 400kb long (there are a lot of comments and
supporting information).  The SQLite C function doc is more accessible
and includes the same information:

  https://www.sqlite.org/c3ref/funclist.html

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaj6NgACgkQmOOfHg372QTEtgCgxcMpQxsQDEfKqcUkCBS+Mvhw
tuAAnRg10CKwOUmJQgvqpPGTwhK5+KbE
=BZ6P
-END PGP SIGNATURE-


[sqlite] Incompatible change in unix vfs xCurrentTime

2016-01-22 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 20/01/16 20:11, Roger Binns wrote:
> SQLite 3.10 changed the xCurrentTime (note not -64 version) method
> for From the documentation it is hard to tell who is at "fault"
> here. xCurrentTime is documented as optional now, but was it when I
> first wrote the code?  It certainly has been present in the "unix"
> vfs for many years.

I used the wayback machine to check, and certainly SQLite has now
changed in an incompatible way, and will cause null pointer access if
you replaced an earlier DLL with the current version (ie no code
changes etc).  And yes it happened to me.

The concept of "shims" - ie one vfs calling another is documented, and
hence can be reasonably considered an endorsement of the practise:


https://web.archive.org/web/20110626043941/http://www.sqlite.org/vfs.html

The xCurrentTime method is not marked as optional:


https://web.archive.org/web/20100612193126/http://sqlite.org/c3ref/vfs.html

I don't think it is unreasonable that the default sqlite vfs on a
platform has to have the xCurrentTime method.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaiggEACgkQmOOfHg372QRUZgCeIBfFpjM4JWOwtTH2Pr9homwg
bSYAoLsNzswKudUEtiMhc99V0y2PmBKl
=A/pM
-END PGP SIGNATURE-


[sqlite] Incompatible change in unix vfs xCurrentTime

2016-01-20 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

SQLite 3.10 changed the xCurrentTime (note not -64 version) method for
the "unix" VFS from a function to a NULL pointer.  This has broken
things for me, because I have a VFS that calls back into the "unix"
vfs.  A SQLite shared library upgrade will now cause a null pointer
access!

- From the documentation it is hard to tell who is at "fault" here.
xCurrentTime is documented as optional now, but was it when I first
wrote the code?  It certainly has been present in the "unix" vfs for
many years.

NB I actually call back into the default vfs, rather than "unix"
specifically.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlagWmcACgkQmOOfHg372QTAzQCfXQZ4HN17tEri2iDE5xyttEaW
6Q8AnRDvkgYjNZgGmT+StBH8epIz5rbh
=C0S4
-END PGP SIGNATURE-


[sqlite] Using sqlite3.exe as a subprocess

2016-01-16 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 15/01/16 13:16, Matthew Allen wrote:
> Yeah in hindsight it wasn't the best. I was trying to do:
> 
> while still running: p.communicate etc
> 
> Poll is not the right method to call.

It especially is the wrong method in your scenario since you want to
dynamically read and write from the subprocess.  You'll want
stdin/out/err all to be subprocess.PIPE.

You'll possibly get lucky with the SQLite shell, but it is a *very*
good idea to do the reading of their stdout and writing to their stdin
in different threads.  The reason is that many of these tools have an
loop that looks like this:

  repeat:
- write prompt to stdout
- read a command from stdin
- write output to stdout, errors to stderr

If the output is bigger than the pipe buffer size, then the "write
output" step blocks until there is space in the pipe, and will do so
before reading the next command.  Remember that you don't know when
the output is done - in theory you could try to detect the prompt and
hope that something similar is not in the data, but that is brittle.

If you look at the implementation of the communicate method, you'll
see it addresses this issue by using multiple threads (~one per pipe
of interest).

Since you are using Python 2, another issue you need to be aware of is
that the subprocess module is buggy when your python code is
multi-threaded.  This issue affects non-Windows systems (eg Linux and
Mac) and you can end up with the wrong child process being accessed!
It isn't theoretical - happened to me two weeks ago and was a pain to
realise what had happened and debug.  The good news is Google have a fix:

  https://github.com/google/python-subprocess32

But finally I'd recommend you just avoid all this subprocess stuff.
SQLite will quite happily take commands from a script.  Consequently
you can create a script file like the following and have SQLite
execute it (command line parameter):

   .bail on
   .open c:\path\to\your\db
   .output c:\...\tmp\dump.sql
   .dump
   .exit

You can use the tempfile module to get a temporary filename.

The script approach avoids many text vs binary, quoting, encoding,
testing and other issues.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaajaQACgkQmOOfHg372QQ68QCeJuZzafMJMAju3F7twPbyToAM
XOcAnic84HeEtyTIg/u0r7gbRHt/x/+o
=Xyid
-END PGP SIGNATURE-


[sqlite] Using sqlite3.exe as a subprocess

2016-01-15 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 14/01/16 19:53, Matthew Allen wrote:
> It seems that sqlite3.exe (console) doesn't work as a subprocess
> with pipes.

There is a bit of a problem with using apps via pipes.  Generally when
stdout is a terminal, output will be line buffered (ie you get each
line from printf as \n is encountered).  However when output is not a
terminal then other buffering kicks in.  For example it may be in
blocks of 4kb, so you'll only see something every time that much has
been generated.

The Windows standard library is even a bit stranger when not connected
to a terminal.  For the first 512 bytes of output it will send them
immediately, and then switch to block buffers.

There are solutions available to try and "trick" the apps to believing
they are outputting to a terminal, when it is in fact a pipe.  However
you won't need them (but shoutout to Expect - a populariser of TCL).

> while p.poll() == None: resp = p.communicate() print len(resp[0]),
> resp[0]

That code doesn't make sense.  communicate waits until the process
terminates.  The SQLite shell won't terminate unless it gets a quit
command, or EOF on stdin.

> The problem I'm trying to solve is: When my application that uses
> an sqlite3 database gets the "database disk image is malformed" I
> need to be able to give the user a "repair" option which dumps the
> datrabase to an .sql file and reimport it all. I'm assuming the
> best way is to do that via the shell rather than try and copy all
> the dump code into my own application.

Good news - here is a shell in Python I already made for you:

  https://rogerbinns.github.io/apsw/shell.html#shell-class
  https://github.com/rogerbinns/apsw/blob/master/tools/shell.py

It does require APSW for the database access, as the standard sqlite3
module lacks various functionality.

  https://rogerbinns.github.io/apsw/pysqlite.html

You can add your own repair command based on the existing dump
command.  This shell aborts on error. The way the standard SQLite
shell handles errors (IIRC) is to scan a table forwards (rowid order),
and then on encountering the error scans backwards.  This is a best
effort, but doesn't mean you won't lose lots of data!

However I'd recommend you use the backup api and make periodic copies
of the database that way.  You can then offer going back to a previous
snapshot.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaZU68ACgkQmOOfHg372QTvegCgpF/pck6KCjdOqDKhxl5XEyuA
cFYAoMdJwpDo5Pwg2uRr/RbNYmEhtdz1
=AR0i
-END PGP SIGNATURE-


[sqlite] Database Corrupt While Disk Full

2016-01-14 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 14/01/16 00:44, sanhua.zh wrote:
> Recently, my monitoring system showed that the error code
> SQLITE_FULL and SQLITE_CORRUPT increasing in same trend.

Just as another data point, I had SQLite using code in a library that
was used across a bunch of third party apps, with many millions of
installations on Android, and to a lesser degree on iOS.  (I stored
analytics events in a SQLite database and periodically uploaded them
in batches.)

Although there were Android devices (the really cheap and nasty ones)
that kept running out of space, not once was there corruption.  I was
very careful to use transactions, and pay attention to the return code
of API calls.

SQLite is unlikely to have problems:

  https://sqlite.org/testing.html

> This is happened in iOS. The OS will automatically clean the disk
> cache to make some space while disk free space is low

Android does the same.  One thing you have to be *very* careful about
on iOS (and current Android) is exactly where you place the database
files in your app sandbox.  Most of the directory tree is backed up,
but you don't have direct control over when that happens, or the restore.

If you put the database files in a location that is backed up and
restored, then it is possible for the database and its journal to be
inconsistent with each other.

https://developer.apple.com/library/ios/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileSystemOverview/FileSystemOverview.html

Using WAL will make you less sensitive to these issues.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaXxrQACgkQmOOfHg372QRqzwCfT+Mv9QjLVccydsj0c5wuKq9H
bdIAmQF2BFMzJqXXyxSgvdDeVt8/3j3y
=3tnu
-END PGP SIGNATURE-


[sqlite] Incorrect missing download behaviour on SQLite.org website

2016-01-10 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

If you request a download that doesn't exist, the SQLite website
behaves in a very non-standard way.

For example the current download should be from a 2016 directory, but
if you instead request from a 2015 directory, you instead get HTTP 302
(temporary redirect) to /not-found.html on the site, where the latter
gives code 200.

To correct this, the downloads should give 404 directly, or the
/not-found.html page should be returned with code 404.  The current
behaviour turns something that doesn't exist into something that does!

$ wget -S http://sqlite.org/2015/sqlite-autoconf-310.tar.gz
- --2016-01-10 15:52:01--
http://sqlite.org/2015/sqlite-autoconf-310.tar.gz
Resolving sqlite.org (sqlite.org)... 67.18.92.124,
2600:3c00::f03c:91ff:fe96:b959
Connecting to sqlite.org (sqlite.org)|67.18.92.124|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 302 Temporary Redirect
  Connection: keep-alive
  Date: Sun, 10  Jan 2016 23:52:01 +
  Location: http://sqlite.org/not-found.html
  Content-length: 0
Location: http://sqlite.org/not-found.html [following]
- --2016-01-10 15:52:01--  http://sqlite.org/not-found.html
Reusing existing connection to sqlite.org:80.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Connection: keep-alive
  Date: Sun, 10  Jan 2016 23:52:01 +
  Last-Modified: Wed, 06  Jan 2016 19:05:04 +
  Content-type: text/html; charset=utf-8
  Content-length: 5189
Length: 5189 (5.1K) [text/html]

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlaS78oACgkQmOOfHg372QTdCgCeMLN8DMNDTIqw5Leusgjf0OwS
OzkAoMZzESmUE7NJL6sQM2dNaAch+D9/
=lgSE
-END PGP SIGNATURE-


[sqlite] [AGAIN] SQLite on network share

2015-11-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/13/2015 11:55 AM, A. Mannini wrote:
> About VistaDB it support use on network share look at 
> http://www.gibraltarsoftware.com/Support/VistaDB/Documentation/WebFram
e.html#VistaDB_Introduction_SupportedPlatforms.html
>
> 
and confirmed from its support. Unfortunately i have not experiences
> with iti can't say how much this is true...

They don't list any supported network filesystems.  Those various
combinations of Windows they list speak different versions of SMB to
each other.  Some aren't even supported by Microsoft any more.

And they don't actually say what they mean by "support".  They also
don't appear to provide any tool that lets you do the certification
yourself.

You should understand where we come from in the SQLite world.  Data
integrity matters.  This is how things are tested:

  https://www.sqlite.org/testing.html

- From that vantage point, other vendors can look sloppy :-)

How about a tale from the past.  I used to work on WAN optimizers.
They intercept network traffic, compress it to reduce bandwidth, and
cache plus write behind to reduce latency.  One of our competitors had
a shoddy implementation, that for example would paper over errors,
incorrectly cache (or not flush cached) information and various other
things.  These conditions weren't hit too often, while the bandwidth
and latency improvements were very noticeable.  On talking to sites
that had the competitor devices, we'd find they did notice increases
in programs crashing and data file issues, but had written it off as
the kind of thing that happens with Windows.  ie expectations on data
integrity for Windows is already pretty low, even though it wasn't at
all at fault.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlZGU+UACgkQmOOfHg372QQQeQCfVIgWO1n/X7x9A0mUkMzRTvp8
9aUAn1Ma2DLPaGoQ3c9+9mIo02kGfXXR
=arIX
-END PGP SIGNATURE-


[sqlite] [AGAIN] SQLite on network share

2015-11-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/13/2015 10:46 AM, A. Mannini wrote:
> 1) is there a list of FS where SQLite works fine?

I don't know of any.  Network filesystems are very hard to implement
(so many corner cases), and there is a lot of complexity if you also
want them to be performant.

> 2) why there are SERVERLESS database (MS Access or VistaDB) that
> works without FS restrictions?

Vendor snake oil.  Even a poorly implemented one will appear to work.
 As far as I can tell, Access does not have checksums or similar data
integrity measures in its file format.  Consequently, how would you
even know?  Here is a random page (possibly selling a "solution")
describing how Access gets corrupted, especially on a network.  Note
how they say same stuff we say about using SQLite over a network:


http://www.everythingaccess.com/tutorials.asp?ID=Access-Database-Corrupt
ion-Repair-Guide

I don't see any mention on the VistaDB pages that they support
networked filesystems.  Heck they claim to run on anything (eg also
Mono), which by definition means they can't have tested every
possibility.  Unless a vendor can provide a guarantee of some kind, or
at the very least some certified configuration, I wouldn't trust it
either.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlZGNkMACgkQmOOfHg372QQdNQCfVECWQymsAzgikzQkuBjm01R/
rR4AnjyoSAfKBcF8hG3MxC2YXdNO0XWp
=UtWN
-END PGP SIGNATURE-


[sqlite] [AGAIN] SQLite on network share

2015-11-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/13/2015 10:31 AM, Richard Hipp wrote:
> Just be warned that there are many network filesystems that claim
> to implement locks correctly, and do most of the time, but
> sometimes mess up

It is also worth mentioning that SQLite trusts the filesystem 100%.
SQLite does not verify that what it thought was written out, is in
fact the same as what just got read in[1].  Consequently it could be
quite a while after corruption has happened before it is detected or
effects found.  Since SQLite doesn't keep redundant copies of
information[2],
you are unlikely to recover everything or even know what is missing/wron
g.

[1] Some sort of checksumming mechanism would help.  It got rejected:
 http://www.sqlite.org/src/info/72b01a982a

[2] Indexes are the exception, although recovering information from
them isn't particularly practical

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlZGMC0ACgkQmOOfHg372QRb7QCeJOIRGKRWY0lFFYzz8Fn+8l6L
IeUAoKzyOO51ldK6xm2f3XK9PuzUTuRG
=wvUQ
-END PGP SIGNATURE-


[sqlite] Feature Request: Support for MMAP beyond 2GB

2015-10-30 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

https://sqlite.org/mmap.html

SQLite supports memory mapping databases, but only does so for the
first 2GB of the file.  My databases are around 4 to 20GB, and
completely fit within RAM on my 64 bit systems.  The 2GB mmap limit
means that only a portion of the file benefits from the improved
performance, and much effort is wasted copying the data around bits of
memory.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlYzvnsACgkQmOOfHg372QTLyQCg2Hbf8V/4xPt7OA6s0bK6U7Ob
Qp4An1LOw8nl9DAHoK07ykY+DIFaa/jS
=iTb4
-END PGP SIGNATURE-


[sqlite] Question about Style

2015-10-30 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/30/2015 11:08 AM, Ramar Collins wrote:
> I'm working on a static site generator and want to use SQLite to
> store metadata.   I'm using C and a small library to get the
> majority of the work done.   My question is, do you have any
> suggestions or know where to find more lore on how to nicely embed
> SQL in a program like this?

Your C code already has some nasty problems, in particular buffer
issues.  I believe some snprintf implementations don't null terminate
if there would be overflow.  If the values had any single quotes in
them then the query would be invalid. (And on dynamic sites would be
exploitable).  These issues would be greatly solved by using this:

  https://www.sqlite.org/c3ref/mprintf.html

> The example is not nearly complete, but I'm almost certain there's
> a cleaner way to seperate the SQL from the rest of the code.
> Before I go come up with my own thing, I wanted to see if there
> perhaps some better solutions already out there.

I strongly recommend you don't use C for this.  The dynamically typed
scripting languages (eg Ruby, Python, Perl, TCL, PHP) all have decent
ways of handling databases and lots of strings (both C weaknesses).

If you still really want to use C, then write your implementation in a
scripting language (which will be a lot quicker), then write a test
suite, and finally re-implement in C.

If despite all that you still insist on C only, then have a look at
the Fossil SCM project.  It is written in C and is primarily by the
SQLite team.  Consider it some of the best practises for the combination
. 

  http://fossil-scm.org/index.html/dir?ci=tip

However you'll note that it too uses a scripting language internally
in places (TH1).

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlYztk8ACgkQmOOfHg372QTbvwCfZlkteYutSqRjZaT70WffQTUB
b+8An21W3sump5FT1lioNCJjoIwSRzqu
=ejKz
-END PGP SIGNATURE-


[sqlite] SIGSEGV in sqlite3_open

2015-10-28 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/28/2015 01:48 AM, Lohmann, Niels, Dr. (CQTN) wrote:
> #0  0x010385c4 in SignalProcmask_r () from
> C:\QNX650\target\qnx6/armle-v7/lib/libc.so.3

Have you tried turning off all compiler optimisations?  Some
older/embedded system compilers can produce crashing code.

> #8  0x78991330 in ?? () from libsqlite_shared.so Cannot access
> memory at address 0x329457c

One possible way this can happen is if shared libraries get unloaded.
 I experienced this in the past with an Apache module where it got
loaded to parse its configuration, then got unloaded, then reloaded to
actually use it.  Address space randomisation meant it got loaded at a
different address.  Things got very interesting trying to debug what
was going on.

> Meanwhile, we found out that replacing "file::memory:?cache=shared"
> by "file::memory:" may avoid the problem. We have not tested it
> thouroughly. What do you think?

The chances of it being an SQLite issue are *very* slim, but not
non-existent.  SQLite is used all over the place (many billions of
apps), and they all use sqlite3_open, although way fewer use the
shared cache.  Also https://www.sqlite.org/testing.html

This is far more likely an issue with the platform (QNX 6.5 is not the
most recent release), compiler or your app itself.  Stray writes from
elsewhere in the app code can easily cause problems within SQLite
which does a lot of checking.  A strong memory checking tool like
Valgrind makes them relatively easy to find.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlYw/voACgkQmOOfHg372QS2CwCghtnlWi00gtYawiuDX+M4/jMp
dX4AnRbX1vGnFzz45QUwZOO6WIgnRrOw
=C6a8
-END PGP SIGNATURE-


[sqlite] SQLITE_CANTOPEN after days of operation

2015-10-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/11/2015 12:56 PM, Andrew Miles wrote:
>>> 5) I ran lsof on the db, only one process (this one) had the
>>> file open

Have you run lsof on your monitoring process?  You may be running out
of file descriptors that are accessing other files or network connection
s.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlYdV58ACgkQmOOfHg372QSn8gCeMB+0Jcl+XtxDW8SsDHhifAQi
SwwAoMQ09NZ2kDwcCV4QZlKjEPjTzIIP
=mgxT
-END PGP SIGNATURE-


[sqlite] Potential bug: Database on gvfs mount cannot be committed to

2015-09-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/06/2015 11:13 AM, Florian Weimer wrote:
> Surely that's not true, and NFS and SMB are fine as long as there
> is no concurrent access?

And no program crashes, no network glitches, no optimisation in the
protocols to deal with latency, nothing else futzing with the files,
no programs futzing with them (backup agents, virus scanners etc), the
protocols are 100% complete compared to local file access, the
implementation of client and server for the protocol is 100% complete
and bug free, the operating systems don't treat network filesystems
sufficiently different to cause problems, you aren't using WAL, and
the list goes on.

In other words it can superficially appear to work.  But one day
you'll eventually notice corruption, post to this list, and be told
not to use network filesystems.  The only variable is how long it
takes before you make that post.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlXsnwUACgkQmOOfHg372QQczQCffHo7JLJtQw4EcJQAVemsTPEN
/CUAoKpoIz3RudoRWM5qc2ac98dTVa18
=3Yx3
-END PGP SIGNATURE-


[sqlite] Potential bug: Database on gvfs mount cannot be committed to

2015-09-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/06/2015 10:20 AM, Markus Weiland wrote:
> I see. Since this was working under Ubuntu 14.04, I assume this is
> a regression with gvfs. I'll check over there.

Nope.  SQLite can not maintain data integrity when used with *any*
network filesystem.  Sometimes it is able to detect problems, and give
an error message.  And other times you will end up with database
corruption.  Then you'll post to this list, and we'll tell you not to
use network filesystems.  The only variable is how long it takes
before you eventually find out about the corruption.

Sometimes people also come up with various schemes that appear to
work, but they won't always. Even something as simple as opening the
database for reading only can be very complex - such as if the
database wasn't cleanly committed (eg program crashed) - because
journals have to be examined and played back/undone as appropriate.
That involves two or more files, various forms of locking etc - just
the things that are problems with networked filesystems.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlXsncIACgkQmOOfHg372QTweACeORy+Jpo6V4LDY3NXvU0iZ7G4
nW4AnA1ugTztFbRyQOHzhMJsQ8b54F7T
=uuQw
-END PGP SIGNATURE-


[sqlite] Potential bug: Database on gvfs mount cannot be committed to

2015-09-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 09/06/2015 06:16 AM, Markus Weiland wrote:
> I've discovered a potential bug in handling of SQLite database
> files on gvfs mounted network shares.

SQLite doesn't support being stored on the network for several
reasons, including that network file protocols don't implement fully
and correctly various access and locking, and that some journaling
modes like WAL require shared memory and hence can't be networked.
This is covered in the FAQ:

  https://www.sqlite.org/faq.html#q5

Or if you want to be pedantic, it is gvfs and networked filesystems
that have the bug.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlXsTFcACgkQmOOfHg372QQyeQCeJW2PjkZmQQ5jGjAhkI464TTg
zEAAn3mG2H9VjACQHRN8lxQ70itB4FcD
=tl3B
-END PGP SIGNATURE-


[sqlite] ESC_BAD_ACCESS when using a background thread

2015-09-01 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/31/2015 11:28 PM, Jeff M wrote:
> All my bad -- I'm fessing up.

Can you tell us how you found the root causes of the problems?  It
would be nice to know what tools and techniques worked.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iEYEARECAAYFAlXl1FYACgkQmOOfHg372QQSBgCeMDdsTmoipopp2C/gtHX1QRVH
TIQAn3IyjrGYevJYpy10D4UPI/4F4MZ0
=yWfk
-END PGP SIGNATURE-


[sqlite] ESC_BAD_ACCESS when using a background thread

2015-08-25 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/25/2015 02:38 AM, Jeff M wrote:
> I tried to install valgrind (on Mac OS Yosemite), but I can't get
> it to compile.  I don't have the fortitude to work out the install
> issues.

Generally you would be better off using something like homebrew:

  http://brew.sh/

It will take care of dependencies, updates etc and knows how to
compile things.

> Looking at Xcode's memory report (and using Instruments),

Instruments isn't too bad, but IIRC does not do the same thing as
valgrind.  Instruments is a lighter less thorough functionality.
Simon gave lots of links.

> ... this app is pre-ARC

Ouch.

> Among other things, I see many persistent 4K memory blocks
> resulting
from sqlite3MemRealloc

Many of those are likely to be the SQLite cache.  Only closing the
connection will free them.  (SQLite does have a boatload of custom
memory management options, but your underlying issue is very unlikely
in SQLite itself.)

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXcpHcACgkQmOOfHg372QTuEgCgjYogcxmCcINsHDj06EUKE8zj
j9oAn1PHfGnz93q9BBXm8LxIoP8I0o2S
=FIdc
-END PGP SIGNATURE-


[sqlite] ESC_BAD_ACCESS when using a background thread

2015-08-24 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/24/2015 03:08 AM, Jeff M wrote:
> I've checked all of your suggestions and nothing is amiss.

You ran valgrind and it said everything is fine?  That would be shocking.

> I don't understand how the main thread can run before the
> background task has completed -- and specifically while sqlite_step
> is still executing.

If the database is busy/locked then sqlite_step can relinquish the
mutex, call the busy handler and try again.

> Apparently, sqlite_step() can relinquish control to the main
> thread.

SQLite does not contain a thread scheduler (we'll ignore pragma threads).

You've either got 100% correct code in which case the underlying
issues are to do with concurrency (need to check memory barriers,
marking as volatile etc), or there is a bug in the code related to
object/pointer lifetimes or similar.  With the latter changing things
changes where collateral damage happens, but doesn't actually fix the
issue.

It is almost certainly the second case.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXbSzwACgkQmOOfHg372QR/XACfV5Iw0umJ1smYygE/Komcemx0
46gAnjumLvNw1/fj2uFpIEdTwYnmhlOd
=HcCt
-END PGP SIGNATURE-


[sqlite] ATTACH DATABASE statement speed

2015-08-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/23/2015 06:48 PM, Simon Slavin wrote:
> Paolo is using Linux which does not do read-ahead optimization like
> some versions of Windows.  Therefore if he really is using an SSD
> then fragmentation is not an issue.

You are confusing things.  The data structures used in a copy on write
filesystem (as both btrfs and zfs are) are what gets fragmented.  This
is not like a traditional filesystem that will update existing data
structures.  It is more analoguous to garbage collection.

I promise you that it really does become an issue, even with an SSD.
When a file is across ten thousand fragments, latency and throughput
suffer.

> I have seen fragmentation make something take twice as long.

You are talking about traditional fragmentation, not a degenerate
state for copy on write based filesystems.

> My suspicion here is that there's a design fault in ZFS.

Unless zfs does automatic defragging, it will have exactly the same
problems as btrfs.  This is inherent in how the data structures are
laid out, and that nothing is modified in place.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXakBgACgkQmOOfHg372QTepACffpEZ/tozxJKv0bKgZQ0D0wIL
HqUAn3ES+b+xr/c8h7I/lqJs1zhQRVrg
=+S02
-END PGP SIGNATURE-


[sqlite] ATTACH DATABASE statement speed

2015-08-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/19/2015 05:56 PM, Paolo Bolzoni wrote:
> I left running the pragma quick check during the night and finished
> in 2 hours and 46 minutes, so it is about 8 times slower than in
> ext4. Zfs is an advanced filesystem plenty of features, but this
> speed difference is too much I think.

I use btrfs which like zfs is also a copy on write filesystem.  It is
possible for the files to get very fragmented which can result in
dismal performance, even on an SSD even for reads.  Random small
writes especially aggravate this.  btrfs has an autodefrag option that
addresses this in the background, and SQLite is specifically mentioned
as all the browsers use it behind the scenes as do many email clients.

  https://btrfs.wiki.kernel.org/index.php/Gotchas  (2nd last section)

The filefrag command may be helpful if implemented for zfs and will
tell you if fragmentation is a problem.

Virtual machine images are another problematic file type with similar
read/write patterns to SQLite.

Copy on write filesystems don't modify existing (meta)data, but rather
write new versions that point to the existing data for bits that
aren't changed.  Repeat this many times and the chains of pointers get
very long, which is the fragmentation.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXadDsACgkQmOOfHg372QRdpACfRUmHD4hXfAx6+il0q/7lINxd
9lwAoNAyYV1oa5cYcn1O00JxW4nsI/Sp
=VrTq
-END PGP SIGNATURE-


[sqlite] ESC_BAD_ACCESS when using a background thread

2015-08-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/23/2015 03:31 AM, Jeff M wrote:
> sqlite3_step();   // occasionally crashes here (showing
> ESC_BAD_ACCESS on main thread)

That has three very likely causes.  The first is that your internal
state gets messed up, and the statement has actually been
finalized/freed.  ie it is junk memory.  You can add assertions that
check it is still a known statement pointer by checking this returns it:

  http://sqlite.org/c3ref/next_stmt.html

The second is that the memory where you stored the statement pointer
is what is trashed.

The final cause is that some other code has memory bugs, causing
damage to SQLite's data structures.

> It's not a zombie object issue (tested with NSZombieEnabled).

Sadly that only checks Objective C objects, and not all memory.

> Any ideas on how to debug this?

I used valgrind running the app in the simulator.  (I also configure
valgrind to never actually reuse memory.)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXaXpEACgkQmOOfHg372QRKZACfWyT6pEyNQ9sEKPbhFQ4pI/5G
Nh0AniO5ESx9CIbB484/gYqjtfCsGUrM
=Op+8
-END PGP SIGNATURE-


[sqlite] Site unavailable.

2015-08-11 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/11/2015 04:18 PM, ??? ??? wrote:
> In Russia when I'm trying to visit sqlite.org site, I'm getting
> conection timeout.

Use down for everyone or just me to check sites:

  http://www.downforeveryoneorjustme.com/sqlite.org

In this case it is you.

> Also, sqlite.org doesn't pinged.

$ ping sqlite.org
PING sqlite.org (67.18.92.124) 56(84) bytes of data.
64 bytes from sqlite.org (67.18.92.124): icmp_seq=1 ttl=48 time=41.4 ms
64 bytes from sqlite.org (67.18.92.124): icmp_seq=2 ttl=48 time=40.6 ms
64 bytes from sqlite.org (67.18.92.124): icmp_seq=3 ttl=48 time=42.1 ms
^C

You likely either have a transient issue, or there is some sort of
filtering going on at some level.  It has been known for various
content based firewalls to incorrectly decide sites like sqlite are
inappropriate.  However they should generally tell you they blocked
the access.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXKioQACgkQmOOfHg372QQsUQCg156vPDGVqz4iqUNz31s+QNY8
9ScAniH0soJC+LiKEC+c3k8Xiw9jq6JI
=EYWs
-END PGP SIGNATURE-


[sqlite] how to detect when a table is modified ?

2015-08-07 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/07/2015 04:23 PM, Nicolas J?ger wrote:
> I'm writing a program using sqlite3 and gtk to screen a database,
> there is a way to get a signal from sqlite3 when something is
> modified inside a database (values, add/delete row...), in the way
> to update what the program has to screen ?

Is the database being modified from within the same process, or from
outside?

The update hook will tell you (most of the time) about changes within
the same connection (ie if you are making the changes):

  https://sqlite.org/c3ref/update_hook.html

The data_version pragma can tell you if something has changed, but
you'll need to poll:

  https://www.sqlite.org/pragma.html#pragma_data_version

You can use change notification of the file system to know when the
database file is changing and then do the pragma polling to find out
when the change is complete.  For example on Windows the API starts
with FindFirstChangeNotification and on Linux you'll find i/dnotify.
Do make sure to monitor the main database file, as well as any other
related filenames such as wal, journal and shm.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXFTvcACgkQmOOfHg372QSjZwCfZkGNxETrd5MSdx41Wks9/wGh
670AnA7I7jQxlKMe2rMkVTraR/m7PYag
=Uu+0
-END PGP SIGNATURE-


[sqlite] MMAP performance with databases over 2GB

2015-08-07 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/07/2015 02:55 AM, Dan Kennedy wrote:
>> http://www.sqlite.org/src/info/3a82c8e6cb7227fe
> 
>> Does that improve performance any in your case?

I tested 3a82c8e6 (fix above) against e596a6b6 (previous changeset),
in both cases starting with an empty database, on tmpfs and with WAL.
 Three kinds of data are imported into the database, but it also means
the first kind fits mostly within 2GB.  I stopped the third kind
import at 48 minutes in both cases.

3a82c8e6 (with mmap change)
- ---

2m52s  22,821 per second
2m56s   4,823 per second
47m56s  1,157 per second 3.3 million records imported of this kind


e596a6b6 (without mmap change)
- --

2m51s  22,855 per second
3m43s   3,800 per second
47m54s462 per second 1.3 million records imported of this kind


Your change definitely helped once the database got above 2GB (I don't
track exactly where that change happens in the import process - looks
like in the second kind.)

It would also be really nice if there wasn't a 2GB mmap limit on 64
bit machines.  The database would fit in my RAM around 4 times, and in
the address space more times than there are grains of sand!  Yea I
know this isn't very Lite ...

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXFLlkACgkQmOOfHg372QQStwCfXXQhxJsyfJWUq/hOUm2KYdbs
aPoAoLOHYbBn7CItwbmASG5igPeeeXpl
=f1gz
-END PGP SIGNATURE-


[sqlite] Determine query type

2015-08-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/06/2015 09:46 AM, Ben Newberg wrote:
> Is there a good way to determine if a sql query is either
> attempting to modify a database, or simply querying it?

You most like want to use the authoriser.  It tells you exactly what
the query is doing, and you can block/render harmless/approve the
various actions making it up:

  https://sqlite.org/c3ref/set_authorizer.html

A very brief example of what you get called back with is here:

  http://rogerbinns.github.io/apsw/example.html#authorizer-example

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXDm8oACgkQmOOfHg372QTuBACcCD99VOOBscT6YOHsE4jux3Zc
lSYAoMksAQcHqieUHsdWniNqXnOFYJ4v
=IG84
-END PGP SIGNATURE-


[sqlite] MMAP performance with databases over 2GB

2015-08-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 08/06/2015 09:27 AM, Dan Kennedy wrote:
>> Is it using more CPU cycles in mmap mode or just taking longer?
>> If the former, does [time] attribute them to "user" or "system"?

It is taking longer.  I have 3 XML dumps which I turn into JSON
(incrementally) and then denormalise and insert into SQLite across
several tables.  While all this work is going on, the code prints out
statistics about how long it is running and about how many records per
second are being processed.  The final database size after commit and
vacuum is ~8GB.  There are a lot of foreign keys too, although all
simple (referring to an INTEGER PRIMARY KEY column).

I gave mmap a try - ie the *only* change was to add an extra pragma
before the beginning of the transaction:

   "pragma mmap_size="+str(2*1024*1024*1024)

In hard numbers, without that line I was doing ~1,118 records per
second and with it it does ~300.  A normal run takes about 1h20m but
the mmap one was still running 3 hours later when I aborted it.

(BTW this is all on a tmpfs filesystem on 64 bit Linux with swap
spread across two ssds, and 32GB of ram.  ie the actual storage
hardware isn't a factor.  Also single threaded because XML.)

>> How large are you letting the wal file grow between checkpoints?

Pretty much the entire database size.  For the tests I was starting
with a deleted database directory (ie no pre-existing files), and then
doing these pragmas:

  "pragma page_size=4096",
# "pragma mmap_size="+str(2*1024*1024*1024),
  "pragma journal_mode=wal",
  "pragma wal_autocheckpoint=1",
  "pragma foreign_keys=on",

Then I start a transaction, and do the importing within that
transaction.  The database file is 4kb during that process, the wal
file gets to be about 10GB.

If I use gdb to periodically break into the running process in the
mmap case, then it was always in sqlite3WalFindFrame.

I don't need any help fixing my importing process (eg don't need a
journal on an empty database anyway).  But it is frustrating that mmap
only goes up to a few kb shy of 2GB even for 64 bit, and I have one
example (ie anecdote not data) showing that mmap hurts for inserts on
> 2GB databases.  Perhaps it is worth others testing to see if this
> is
a systemic problem, or just bad luck for me :-)

It may also be relevant that tables add columns over time.  I
dynamically add them after encountering previously unseen fields in
the JSON.  However I'd expect the schema to be final a few thousand
records in.  Most tables have 3 to 6 million records.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXDmukACgkQmOOfHg372QSVtgCbBihGgIuZqS3Yy2JARXZ1+Q59
GmwAoMG53XxuLNhcMIw1PV46fD/Z/5tT
=luXx
-END PGP SIGNATURE-


[sqlite] MMAP performance with databases over 2GB

2015-08-06 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

In my testing I am finding that using SQLite's mmap support with
larger databases (final db size is ~10GB) to create the database (WAL
mode) is considerably (about three times) slower than no mmap.

The default max mmap limit is 2GB (sadly).  Has anyone else tested
mmap with >2GB databases and have results?  I don't know if this is a
peculiarity of my data set, or because of how SQLite is implemented.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlXDdQMACgkQmOOfHg372QRt3wCeKQpP9g2OhWS2yJg+iU7Gxvxo
TPUAn3ikDarecOaKLPIsnS3Xv+IltreU
=yUxV
-END PGP SIGNATURE-


[sqlite] CSV excel import

2015-07-30 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/30/2015 10:58 AM, Sylvain Pointeau wrote:
> is it possible? in a lot of cases, I cannot use sqlite (executable)
> because of the lack of a good CSV import. It would really great if
> this could be addressed.

Use the APSW shell:

  http://rogerbinns.github.io/apsw/shell.html

Behind the scenes it uses the Python CSV libraries which support a
number of dialects including excel (default) and excel-tab.

The APSW shell also supports an .autoimport command.  It automatically
works out dialects, separators and data types.  Here is the extended help:

 8< 
sqlite> .help autoimport

.autoimport FILENAME ?TABLE?  Imports filename creating a table
  and automatically working out
  separators and data types
  (alternative to .import command)

The import command requires that you precisely pre-setup the
table and schema, and set the data separators (eg commas or
tabs).  In many cases this information can be automatically
deduced from the file contents which is what this command does.
There must be at least two columns and two rows.

If the table is not specified then the basename of the file will
be used.

Additionally the type of the contents of each column is also
deduced - for example if it is a number or date.  Empty values
are turned into nulls.  Dates are normalized into -MM-DD
format and DateTime are normalized into ISO8601 format to allow
easy sorting and searching.  4 digit years must be used to detect
dates.  US (swapped day and month) versus rest of the world is
also detected providing there is at least one value that resolves
the ambiguity.

Care is taken to ensure that columns looking like numbers are
only treated as numbers if they do not have unnecessary leading
zeroes or plus signs.  This is to avoid treating phone numbers
and similar number like strings as integers.

This command can take quite some time on large files as they are
effectively imported twice.  The first time is to determine the
format and the types for each column while the second pass
actually imports the data.
 8< 

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlW6dDEACgkQmOOfHg372QRG5ACgt/OpLOPVZ40YQ4B2EWBf7yk0
HCUAniZLLUkPs7ac7cvCLb7Bn3zyNguC
=YpCt
-END PGP SIGNATURE-


[sqlite] why does SQLite support VARCHAR(X) when the doc says it ignores it?

2015-07-16 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/16/2015 02:46 PM, Hayden Livingston wrote:
> It seems that
> 
> CREATE TABLE A( something varchar(255) )
> 
> it just
> 
> CREATE TABLE A( something text )
> 
> Why have it at all?

Because other code and other humans look at the database too.
Consider it documentation of intent.  If it has no value to you, then
leave out the type.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWoJyIACgkQmOOfHg372QTu+wCghdSY6eDO5zXezGV+/xRXHHDr
rWUAn1XuuG65S/48qjyfx9TP/lzXYS0g
=Fzkn
-END PGP SIGNATURE-


[sqlite] Schema-less JSON SQLite DB?

2015-07-15 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/15/2015 08:22 AM, Sergej Jure?ko wrote:
> What do you guys think? Is it stupid, could it be improved?

I recommend looking at Mongodb  and Postgres first to see how they do
queries.  It would be better to be compatible with them where
practical, rather than being gratuitously different.

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWmkmwACgkQmOOfHg372QTHIACgi64Sm5nAmc4FyWrF6mimT5Ng
15sAoJJoY51QW/3QM/exRLJYHNtFKBse
=EgGt
-END PGP SIGNATURE-


[sqlite] Fwd: Usenix paper: Korean researchers invent sqlite WALDIO mode to circumvent redundant journaling by EXT4 on eMMC

2015-07-14 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/14/2015 06:53 AM, Mikael wrote:
> Does SQLite do the same, what about SQLite do the same, patch?

The SQLite authors rejected checksumming SQLite database pages.  The
existing integrity check will only catch issues that happen in
sufficiently important metadata, but in general won't catch corruption.

  http://www.sqlite.org/src/tktview?name=72b01a982a

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWlKfgACgkQmOOfHg372QTW0gCgn5PVs7z9G6FEu5dG31hbRNy1
jAIAniXv0ebDjsCuroOrkwI7D4Wszwno
=sV74
-END PGP SIGNATURE-


[sqlite] Schema-less JSON SQLite DB?

2015-07-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/13/2015 08:00 PM, Hayden Livingston wrote:
> Does your code also map object hierarchies in json?

Yes, but thankfully I don't have much of them.  Essentially the top
level of the object has a unique id (SQLite allocated), and then other
tables are used to join zero or more child objects to the top level.

> What general format do you use?

Ultimately I use Python dictionaries which are supersets of JSON
objects.  Some (ugly) code can convert both ways

> Each object type gets stored in a separate table?

In my case yes but that is because the underlying data represents
known entities and was actually originally in Postgres and then
exported to hideous inconsistent XML which I then convert/denormalise
back into JSON.

Do remember that SQLite does not require you to specify a type for
each column, nor does it care about the values in a column being
different types between rows.  That means I don't have to worry about
types, only the big picture top level of something being an object, a
list, or a scalar.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWkg0MACgkQmOOfHg372QRTbQCfRobbh4uU1Eqf0SD9LJxABFYj
iv8AoKUvXNQEgGVvmPiZ/tQgHyU7Q0yL
=S7AM
-END PGP SIGNATURE-


[sqlite] Schema-less JSON SQLite DB?

2015-07-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/13/2015 05:43 PM, Hayden Livingston wrote:
> Is there a concept of a schema-less JSON SQLite DB?

What exactly is it you want?  A schema-less database?  A JSON
database? Using SQLite to store JSON?  (It is unclear if you mean
something like SQLite, or SQLite itself).  Something small and light?
 A library not a server?

Be aware that you can dynamically update the SQLite schema, and that
it takes no time to do so.  Another database I won't name locks the
database and rewrites each record.  This can take hours or even days.
 SQLite just updates the schema table, and fills in defaults for
records written before the schema change, on reads.  I wrote some code
the other day that takes JSON and does the schema manipulation,
although it is annoying.  (It would be really great if SQLite
supported arrays like Postgres does.)

You can also store arbitrary JSON in a column.  You won't be able to
query it effectively, but you can duplicate values into columns you
can do queries and indexes on.

> My reason is simple: versioning. We have lot of business metrics
> that get updated let's say once a month, and we need to be agile to
> get them. Right now, we just put the version in the SQLite file,
> and then make sure no queries cross the boundaries.

You can also use multiple databases, and attach them.  For example if
you operate on a month by month basis, then you can put each month's
data in a separate SQLite file, then attach last month's as
'lastmonth', and use in queries (eg SELECT blah blah WHERE bar.foo >
lastmonth.bar.foo)

> Do others have experiences and requirements similar to this?

All the time.  I use SQLite when I don't need networked access, a
running database server, and need less things that can go wrong.
Mapping JSON into this is painful but possible.

When I can use a database server, I prefer Mongodb as it is very good
at arbitrary JSON in, the same arbitrary JSON back out.  It is
especially pleasant that the query syntax has the same shape as the
underlying JSON data.  Also JSON maps trivially to Python which I use
the most.  (Note however that Mongodb does have some issues, but so
does much other software out there.  Production use does require
effort & planning as does other software.)

In the longer term Postgres is getting increasingly better JSON
support.  Hopefully it will give Mongodb a run for its money soon.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWkbfsACgkQmOOfHg372QTFiwCgzzf5fEzfJdWn84bk0e1fusGE
qeAAnAhR+sHb39Gsha2Owq3SXsdZVRex
=9jcT
-END PGP SIGNATURE-


[sqlite] SIGSEGV rolling back in-memory DB‏‎

2015-07-09 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/08/2015 10:33 AM, Brian Soby wrote:
> Nothing appears to happen but it's not running very faithfully.
> Specifically, I get a bunch of problems on the rollbacks about the
> savepoints not existing even though it's replaying the original
> SAVEPOINT "x" command before the rollbacks. This is probably more
> due to me not knowing what's going on than anything else.

Do you have pysqlite configured with isolation level of None?  If you
use the defaults then it starts doing stuff on your behalf that you
wouldn't have captured.


https://github.com/ghaering/pysqlite/blob/51c1a18eac3d1d14a052b09e3546dd0c4ce70638/src/cursor.c#L607

Ultimately since you understand your code base best, you'll need to
either provide pure SQL that can reproduce the issue, or you'll need
to prove that there is no other component in the process that is
corrupting memory that happens to be used by SQLite.

To better catch issues with the SQLite shell, I recommend compiling it
with -DSQLITE_DEBUG -UNDEBUG and running under valgrind.  This will
improve the chances of catching memory issues and verify SQLite
internal assertion checking (there is a lot of it).

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWet/wACgkQmOOfHg372QT+ggCgwXiWdIhegZlUdDINi9OhN4G6
5gwAn22bk2Qgt2ljVHb5c7GBSYFkALAK
=5mrB
-END PGP SIGNATURE-


[sqlite] SIGSEGV rolling back in-memory DB‏‎

2015-07-08 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/08/2015 09:55 AM, Brian Soby wrote:
> It's single threaded.

Good - that eliminates a whole bunch of complications.

> I have a log of all of the queries run through the database up
> until the crash. It crashes on a rollback command.

What happens when you run those queries as a script for the SQLite
shell?  If that has problems then everything needed to reproduce the
issue is present.

However if that doesn't reproduce the issue then it is something else
in the process that is causing problems.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWdW5UACgkQmOOfHg372QTvJwCgrgS9gKjopjfppv6nPqTKSg1S
rR4AoLwvAXooKJn/1kNwVipyEDwEQ0KZ
=9y8e
-END PGP SIGNATURE-


[sqlite] SIGSEGV rolling back in-memory DB‏‎

2015-07-07 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/07/2015 02:06 PM, Brian Soby wrote:
> I don't believe it's a python specific issue since the core problem
> occurs in a rollback command internal to SQLlite.

Unless you configure it otherwise, pysqlite likes to parse your SQL
and do extra operations behind your back (eg automatically starting
and committing transactions).  pysqlite also doesn't do
multi-threading correctly - how do you do threading?

> I don't have an easy way to recreate the problem ...

That will make it very tricky.  Note that these kind of issues are
extremely rare in SQLite, and almost always caused by other bits of
code corrupting memory, and taking out SQLite as a side effect.  Try
disabling as many non-SQLite extensions as possible (that are
implemented in C).  Also use valgrind and similar tools as memory
checkers.

  https://www.sqlite.org/testing.html

Sadly pysqlite doesn't have an easy way of tracing SQL execution
without you modifying all call sites and updating its code to catch
the behind your back operations.

If you have the time and can update your code, you can try APSW as an
alternative to pysqlite (disclosure: I am the APSW author).  APSW
includes a tool that lets you see all queries and data returned, as
well as profiling information:

  http://rogerbinns.github.io/apsw/execution.html#apsw-trace

You may also find these two links relevant:

  http://rogerbinns.github.io/apsw/pysqlite.html
  https://pythonhosted.org/sqmediumlite/src/apswdbapi2.py.html

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWcVZwACgkQmOOfHg372QSOjACgoSXRAijHJncTHYY6VU4dfB6Q
gHIAoMT6zvydHZCvhNAFzX1azvFmMeZO
=mfkI
-END PGP SIGNATURE-


[sqlite] SIGSEGV rolling back in-memory DB‏‎

2015-07-07 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/07/2015 09:47 AM, Richard Hipp wrote:
> Can you send instructions on how to recreate the problem?  Please
> bare in mind that none of the core SQLite developers do much with
> python. Alternatively, you might send emails to Roger Binns and
> badger him into trouble-shooting the problem, since he is an expert
> at both python and SQLite and is likely to get to the bottom of the
> matter faster than any of the core SQLite devs.

No need to badger, I am happy to help :-)

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlWcEUcACgkQmOOfHg372QQXMACfdSKACGIZOwsAzOnpf2fwl1iO
UWgAn1n4OuC1/Cr9y6TfwK6QovmmgTEe
=j1D4
-END PGP SIGNATURE-


[sqlite] Extending VFS documentation

2015-05-24 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/24/2015 02:39 AM, Philip Bennefall wrote:
> I have looked at the example VFS implementations, but it is hard
> to determine which parts of the code that are implementation
> details as opposed to being part of the stable public API so to
> speak.

The answer is that it matters what you are exposing.  A vfs for a csv
file is a very different beast that one exposing bits of Amazon's
services.  I wrote one that exposes couchdb, but it wouldn't help
someone else exposing something different.

What I recommend you do is use one of the higher level languages that
exposes the VFS in their SQLite bindings.  You will get better error
messages, be able to write the code quicker, and explore behaviour
easier (eg xBestIndex).  ie you'll get a better on the VFS specific
issues, rather than having to build out C level plumbing.  Once you
have that understood, converting to C is simpler than having started
with C.

For example my Python wrapper (APSW) exposes the VFS, along with an
example and documentation.  It also easily lets you "inherit" from an
existing VFS so you only have to write methods where you want
different behaviour (eg mangling file names, or modifying how the
database file contents are stored).

  http://rogerbinns.github.io/apsw/vfs.html

You should find similar bindings in your higher level languages of choice.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEUEARECAAYFAlViNrAACgkQmOOfHg372QRrrQCYxRX6XbpyS39O5V+tJyyjXjx6
vACgxZ/Uu+TDvT22u1FN083YFMu2muc=
=wo2G
-END PGP SIGNATURE-


[sqlite] xBestIndex() implementation question

2015-05-13 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/13/2015 08:06 PM, Jilong Kuang wrote:
> Sure, that is one option. But I'm just wondering if inside the
> xBestIndex() we can get the value field, we can do a much better
> job to give the cardinality info.

Note that the value could be something other than a constant (eg an
expression like price > delivery_charge*0.9)

You are just helping SQLite pick the best index out of all the
possibilities, so as long as they compare sensibly it is fine.
SQLite's analyze command builds statistical information that is useful
(eg how many different values there are, how (un)evenly they are
distributed).  You could do something similar.

Or just follow Eric's suggestion :-)

Roger

-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlVUGasACgkQmOOfHg372QSPcACgpmvvhwEEmj9+ccVgI+qTaw0B
y/YAn2WtrppfLn0PvDpa+n1loqFYgzQW
=vskS
-END PGP SIGNATURE-


[sqlite] Suggestion for .import improvement

2015-05-11 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/11/2015 12:24 PM, tonyp at acm.org wrote:
> It?s a real nuisance to have to edit out the headers every time,
> especially when the files are very large and they do not load fast
> (or at all) into a text editor.  Plus, it?s semi-automatic.

I suggest you use a simpler way.  Create a temporary table and do the
.import into that.  You can then copy that table into your desired one:

  BEGIN
  CREATE TEMPORARY TABLE temptable .;
  .import log.csv temptable
  INSERT INTO desired SELECT * from temptable;
  COMMIT

You can explicitly delete the temp table, or it will automatically go
away when you close the connection.

The advantage of this approach is you can do processing on the
temporary table (eg convert nulls to empty string or vice versa, drop
header row), add a column (eg import date), have different columns
and/or names between the two (adjust the INSERT/SELECT statement) etc.
 This will all work from the shell, requires no programming, and will
import just as fast.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlVRVTYACgkQmOOfHg372QREYwCePcN1Z5RauOs7lo4jvtLNyx+t
018AnRJpES+JhBR5kM3+7m4Fa2p1I9t6
=oss6
-END PGP SIGNATURE-


[sqlite] Issue: Inconsistency of sqlite3_dbstat_register declaration

2015-05-11 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/11/2015 01:34 AM, Jan Nijtmans wrote:
> 2) If sqlite is not compiled with -DSQLITE_ENABLE_DBSTAT_VTAB=1,
> or the function is renamed (as it should) it wouldn't even work.

That was my scenario.  I just -DSQLITE_ENABLE_DBSTAT_VTAB and the code
didn't compile because of the inconsistency.

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlVQvbgACgkQmOOfHg372QQ2eACdHWNbxVjmDGbzFtktgwADzwnH
rckAnRn30JWtYfBHzY0j5qG8tV8Kb8YO
=LfCW
-END PGP SIGNATURE-


[sqlite] Issue: Inconsistency of sqlite3_dbstat_register declaration

2015-05-10 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

There is a forward declaration of sqlite3_dbstat_register (currently
line 130871 in the 3.8.10.1 amalgamation):

  int sqlite3_dbstat_register(sqlite3*);

Later is the actual implementation (currently line 155838 in the
3.8.10.1 amalgamation):

  SQLITE_API int SQLITE_STDCALL sqlite3_dbstat_register(sqlite3 *db){

If SQLITE_API or SQLITE_STDCALL are set to anything then the
inconsistency can cause a compiler error.  In my case SQLITE_API is
defined as static.

Compilation then fails with gcc on Linux:

/space/apsw/sqlite3/sqlite3.c:155838:12: error: static declaration of
?sqlite3_dbstat_register? follows non-static declaration

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlVPrCoACgkQmOOfHg372QQEuwCgkCmhV/1ceig99me3QduAw7WL
qxUAniB4ZEF6KN52A/kAV/HZxVJ1kL9G
=686t
-END PGP SIGNATURE-


[sqlite] Bug: sqlite3_memory_used/highwater truncate to 32 bits

2015-05-09 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

The sqlite3_memory_used and highwater interfaces are defined to return
a 64 bit value.  They carefully call the 32 bit limited sqlite3_status
method and then cast to 64 bit.  Instead they should call
sqlite3_status64.

Reported indirectly via https://github.com/rogerbinns/apsw/issues/191

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iEYEARECAAYFAlVOsbwACgkQmOOfHg372QSkLgCgkqMS2hWqLFU0246u8aqx+JtN
NkcAn3iDFhK5/+pq5v0n6vTprxZflaXZ
=4QMd
-END PGP SIGNATURE-


  1   2   3   4   5   6   7   8   9   10   >