[sqlite] SQLite crashing

2016-01-28 Thread Igor Korot
Hi, ALL,


On Tue, Jan 26, 2016 at 3:08 PM, Clemens Ladisch  wrote:
> Igor Korot wrote:
>>  sqlite.dll!sqlite3_mutex_enter(sqlite3_mutex * p)  Line 19996 + 0xc bytes  C
>>  sqlite.dll!sqlite3Close(sqlite3 * db, int forceZombie)  Line 726 + 0xc bytes  C
>>  sqlite.dll!sqlite3_close(sqlite3 * db)  Line 772 + 0xe bytes  C
>>  sqlite.dll!SQLiteDatabase::Disconnect(...)  Line 51 + 0xc bytes  C++
>>  dialogs.dll!DisconnectFromDb(Database * db)  Line 108 + 0x13 bytes  C++
>>  docview.exe!MainFrame::~MainFrame()  Line 73 + 0xf bytes  C++
>>  docview.exe!MainFrame::`scalar deleting destructor'()  + 0x16 bytes  C++
>>  docview.exe!wxAppConsoleBase::DeletePendingObjects()  Line 637 + 0x23 bytes  C++
>>  docview.exe!wxAppConsoleBase::ProcessIdle()  Line 445  C++
>
> This looks OK.
>
> Are you ever calling sqlite3_shutdown()?
> Are you calling sqlite3_close() from more than one place, or more than once?
> (Add logging to find out.)

I tried to recompile the project with /MTd, but I got this:

1>odbccp32.lib(dllload.obj) : error LNK2019: unresolved external symbol __imp___vsnprintf referenced in function _StringVPrintfWorkerA@20

And here are my Linker Project Settings:

odbccp32.lib;kernel32.lib;user32.lib;gdi32.lib;comdlg32.lib;winspool.lib;winmm.lib;shell32.lib;shlwapi.lib;comctl32.lib;ole32.lib;oleaut32.lib;uuid.lib;rpcrt4.lib;advapi32.lib;version.lib;wsock32.lib;wininet.lib;%(AdditionalDependencies)

Is there an easy way to fix this?

Thank you.

>
>
> Regards,
> Clemens


[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Yannick Duchêne
On Thu, 28 Jan 2016 22:08:02 +0200
R Smith  wrote:

> I think you are misunderstanding the Pragma and the idea of automatic 
> indices. An automatic index might be created on a table that doesn't 
> have an adequately assigned primary key. It might also be created during 
> a query (mostly a SELECT or sub-SELECT) for which there is no useful index 
> created by the table designer, where the query planner figures it will be 
> quicker to make an index than to do table scans through the query. This 
> habit of creating indices during SELECT queries can be prevented by setting 
> "PRAGMA automatic_index=0;", but this needs to happen when you open the DB 
> connection or, at a minimum, before you try any query - not after the 
> cursor is created; by that time the index might already be made. (This is 
> why you are not seeing any speed improvement.)
> 
> I only mention all the above so you understand what automatic indexing is 
> about, but it almost certainly has nothing to do with your query slowness, 
> and even if you switch it off at a more opportune time, I would be 
> surprised if it changed the query speed.

Indeed, I was misunderstanding. Are these indexes visible? Is there any 
relation to the ones I see in sqlitebrowser?

> 103k insertions to produce 15k rows... that is ~15% efficiency - the 
> opposite of good design. Perhaps we can help you find better SQL to 
> solve your problem. Let's see...

(red-face)

> 
> I do not know the shape of your data (it matters), but I'm guessing 
> "element" represents standard HTML tags, with "attribute" and "value" 
> giving basic expansion of the attributes list. It'd probably be safer to 
> use non-case-sensitive values and standard equality tests in SELECTs.
> 

The shape will change; it's far from final. The tables at that step depend on 
the next steps in the overall procedure, which is not entirely fixed for now.

> Could you try these in your Query loop perhaps:
>
> -- Table: Renamed field "count" to "cnt" because "count" is an SQL
> -- reserved word (though it will work, it's just a better habit).
> CREATE TABLE element_attribute_values (
>   element TEXT COLLATE NOCASE NOT NULL,
>   attribute TEXT COLLATE NOCASE NOT NULL,
>   value TEXT COLLATE NOCASE NOT NULL
>     CHECK ((attribute <> '') OR (value = '')),
>   cnt INTEGER NOT NULL DEFAULT 0,
>   PRIMARY KEY (element, attribute, value)
> );
>
> -- Insert Loop start:
> -- This will simply fail if the PK already exists, else it starts the
> -- row with a 0 count.
> INSERT OR IGNORE INTO element_attribute_values
>   VALUES (:element, :attribute, :value, 0);
>
> -- This might be faster since it uses only one lookup loop, but it might
> -- also not be. If you share a list of example data to be inserted, we
> -- can find a faster way. Try it and let us know...
> WITH EAV(id, icnt) AS (
>   SELECT rowid, cnt + 1
>     FROM element_attribute_values
>    WHERE (element = :element) AND (attribute = :attribute)
>      AND (value = :value)
>    LIMIT 1
> )
> UPDATE element_attribute_values SET cnt = (SELECT icnt FROM EAV)
>  WHERE rowid = (SELECT id FROM EAV);
> -- Insert Loop end.
>

The DEFAULT may even be omitted, and maybe the `LIMIT 1` too, as each triplet 
is unique.

I tried it, with the variant (no `LIMIT 1`), and it's about the same, even a 
bit slower (not by much, one to two seconds longer). Speed is not my only 
concern, though; perhaps I should not care so much about the DB file size 
(the other matter).

It's close to what I had at the beginning, which gave similar timings (just 
changed to match your recommendation about the "cnt" name):

INSERT OR IGNORE INTO element_attribute_values
  VALUES (:element, :attribute, :value, 0);
UPDATE element_attribute_values 
  SET cnt = cnt + 1
  WHERE (element = :element)
AND (attribute = :attribute)
AND (value = :value);

> -- If you share a list of example data to be inserted, we can find a 
> faster way. Try it and let us know...

I don't mind, I can upload an archive somewhere. I guess you mean the tables' 
content?

I'm aware this use case may be a bit pathological, as I could use a Python 
dictionary. However, I decided not to, for three reasons:

 * I wanted to see what it's like to use an SQLite DB as an application data 
container (file or memory);
 * Using a persistent DB is better for an incremental process (it may stop and 
resume later);
 * Persistent data is welcome for human review (I'm thinking about defining 
views in sqlitebrowser to dissect the results).

For persistence, I first tried CSV files, but they proved to be inadequate. An 
SQL DB and a DB browser look better than CSV for this use case.

-- 
Yannick Duchêne


[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread R Smith


On 2016/01/28 10:08 PM, R Smith wrote:
>
>
> -- Insert Loop start:
> -- This will simply fail if the PK already exists, else it starts the
> -- row with a 0 count.
> INSERT OR IGNORE INTO element_attribute_values
>   VALUES (:element, :attribute, :value, 0);
>
> -- This might be faster since it uses only one lookup loop, but it might
> -- also not be. If you share a list of example data to be inserted, we
> -- can find a faster way. Try it and let us know...
> WITH EAV(id, icnt) AS (
>   SELECT rowid, cnt + 1
>     FROM element_attribute_values
>    WHERE (element = :element) AND (attribute = :attribute)
>      AND (value = :value)
>    LIMIT 1
> )
> UPDATE element_attribute_values SET cnt = (SELECT icnt FROM EAV)
>  WHERE rowid = (SELECT id FROM EAV);
> -- Insert Loop end.
>

Upon re-reading I realised my descriptions were confusing. To be clear, 
this is the entire loop SQL that should form part of your program:

-- Insert Loop start:
INSERT OR IGNORE INTO element_attribute_values
  VALUES (:element, :attribute, :value, 0);

WITH EAV(id, icnt) AS (
  SELECT rowid, cnt + 1
    FROM element_attribute_values
   WHERE (element = :element) AND (attribute = :attribute)
     AND (value = :value)
   LIMIT 1
)
UPDATE element_attribute_values SET cnt = (SELECT icnt FROM EAV)
 WHERE rowid = (SELECT id FROM EAV);
-- Insert Loop end.
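
(For illustration, a minimal sketch of driving this pair of statements from
Python with APSW; the database file name and the `triplets` iterable are
hypothetical, and APSW's connection context manager is assumed to wrap the
loop in a single transaction.)

import apsw

con = apsw.Connection("stats.db")   # hypothetical file name
cur = con.cursor()

with con:   # one transaction around the whole loop keeps I/O down
    for element, attribute, value in triplets:   # hypothetical row source
        bindings = {"element": element, "attribute": attribute, "value": value}
        cur.execute(
            "INSERT OR IGNORE INTO element_attribute_values "
            "VALUES (:element, :attribute, :value, 0)", bindings)
        cur.execute(
            "WITH EAV(id, icnt) AS ("
            "  SELECT rowid, cnt + 1 FROM element_attribute_values"
            "  WHERE (element = :element) AND (attribute = :attribute)"
            "    AND (value = :value) LIMIT 1) "
            "UPDATE element_attribute_values "
            "   SET cnt = (SELECT icnt FROM EAV) "
            " WHERE rowid = (SELECT id FROM EAV)", bindings)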




[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread R Smith


On 2016/01/28 8:17 PM, Yannick Duchêne wrote:
> Hi people,
>
> I thought automatic indexing was controlled by the pragma `automatic_index`, 
> which seems to do nothing for me, unless I'm doing it the wrong way, while 
> `WITHOUT ROWID` on table creation seems to effectively prevent automatic 
> indexes, the latter with or without `PRAGMA automatic_index=0`.
>
> Is this expected, or is this me doing something wrong?
>
> I'm concerned about it mainly for file size, more than for speed, though 
> that matters too, as the program is far slower than expected. For a program 
> I'm running I got this:
>
>    * With ROWID: execution time is 22 seconds and the DB file size is 1.9MB;
>    * Without ROWID: execution time is 19 seconds and the DB file size is 
> 788KB (2.4 times smaller);
>    * With insertion disabled: execution time is 6 seconds, so most of the 
> execution time is spent in row insertion.
>
> The program indexes HTML files (*); the test runs on only 52 average-sized 
> HTML files, so the difference will be even more important with a bigger 
> file set.
>
> I'm feeling lost about this: the pragma seems to do nothing, and it's 
> `WITHOUT ROWID` on table creation which seems to do it instead. The pragma 
> is executed right after the cursor creation. I'm using APSW (**). Pragmas 
> seem to work, as I tested with `PRAGMA synchronous=0` (just for a test, I 
> really don't want to use this).

I think you are misunderstanding the Pragma and the idea of automatic 
indices. An automatic index might be created on a table that doesn't 
have an adequately assigned primary key. It might also be created during 
a query (mostly a SELECT or sub-SELECT) for which there is no useful index 
created by the table designer, where the query planner figures it will be 
quicker to make an index than to do table scans through the query. This 
habit of creating indices during SELECT queries can be prevented by setting 
"PRAGMA automatic_index=0;", but this needs to happen when you open the DB 
connection or, at a minimum, before you try any query - not after the 
cursor is created; by that time the index might already be made. (This is 
why you are not seeing any speed improvement.)

I only mention all the above so you understand what automatic indexing is 
about, but it almost certainly has nothing to do with your query slowness, 
and even if you switch it off at a more opportune time, I would be 
surprised if it changed the query speed.
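
(A minimal sketch of the ordering described above, using APSW since that is
the binding in use here; the file name is made up.)

import apsw

con = apsw.Connection("stats.db")          # hypothetical path
cur = con.cursor()
cur.execute("PRAGMA automatic_index=0;")   # before any SELECT is run
# ... only now prepare and run the actual queries ...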

>
>
> I don't know if it's OK to ask this kind of question here. If it's not, 
> just tell me; I won't bother.

It's always OK to ask here. Welcome to SQLite-World. :)


>
>
> (*) There are about 103 000 insertion queries ending in 15 600 rows, using 
> this:

103k insertions to produce 15k rows... that is ~15% efficiency - the 
opposite of good design. Perhaps we can help you find better SQL to 
solve your problem. Let's see...

>
>  CREATE TABLE element_attribute_values
>(element TEXT NOT NULL,
> attribute TEXT NOT NULL,
> value TEXT NOT NULL
>   CHECK ((attribute IS NOT '') OR (value IS '')),
> count INTEGER NOT NULL DEFAULT 1,
> PRIMARY KEY (element, attribute, value))
>
>  INSERT OR REPLACE INTO element_attribute_values
>(element, attribute, value, count)
>VALUES (:element, :attribute, :value,
>  (SELECT count + 1 FROM element_attribute_values
>WHERE (element IS :element)
>  AND (attribute IS :attribute)
>  AND (value IS :value)))
>
> (**) The standard Python SQLite3 binding seems perceptibly slow. With APSW, 
> the program is 17 times faster than with the standard Python SQLite3 
> binding: execution time is around 370 seconds in the latter case, however I 
> manage transactions.
>
>

APSW is far better and is maintained by Roger Binns (whom you might see 
post here sometimes). Using it is your best bet, but it cannot help 
heavyweight queries either. The best strategy is to make better queries.

I do not know the shape of your data (it matters), but I'm guessing 
"element" represents standard HTML tags, with "attribute" and "value" 
giving basic expansion of the attributes list. It'd probably be safer to 
use non-case-sensitive values and standard equality tests in SELECTs.
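
(A small sketch of what COLLATE NOCASE buys you, using the standard Python
binding; note that SQLite's built-in NOCASE only folds the ASCII letters A-Z.)

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v TEXT COLLATE NOCASE)")
con.execute("INSERT INTO t VALUES ('Href')")
# The column's collation makes plain '=' case-blind, so this finds the row:
print(con.execute("SELECT count(*) FROM t WHERE v = 'href'").fetchone())  # (1,)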

Could you try these in your Query loop perhaps:


-- Table: Renamed field "count" to "cnt" because "count" is an SQL
-- reserved word (though it will work, it's just a better habit).
CREATE TABLE element_attribute_values (
  element TEXT COLLATE NOCASE NOT NULL,
  attribute TEXT COLLATE NOCASE NOT NULL,
  value TEXT COLLATE NOCASE NOT NULL
    CHECK ((attribute <> '') OR (value = '')),
  cnt INTEGER NOT NULL DEFAULT 0,
  PRIMARY KEY (element, attribute, value)
);

-- Insert Loop start:
-- This will simply fail if the PK already exists, else it starts the
-- row with a 0 count.
INSERT OR IGNORE INTO element_attribute_values
  VALUES (:element, :attribute, :value, 0);

-- This might be faster since 

[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Yannick Duchêne
On Thu, 28 Jan 2016 19:59:02 +0100
Dominique Devienne  wrote:

> I'm not sure I understand what you expect this pragma to do, but inserts
> typically don't involve automatic indexes, which are used only in queries
> (SELECTs).

I thought it was responsible for the `sqlite_autoindex_"table_name"_N` indexes 
which get generated and increase the DB size. It's the name and the 
documentation which made me believe this. I can see these indexes in 
sqlitebrowser and in the file size.

> Your insert does have a query, but it fits the PK, so no automatic index is
> needed. (And you can likely use AUTOINCREMENT to avoid that query in the
> first place.)

"count" is not a primary key, and AUTOINCREMENT may be used only on primary key.

It indeed does not need a primary key; a unique constraint is enough. Whether 
I define a PK or a unique constraint, it ends in the same speed and DB size. I 
just wanted to test whether it differs when a PK is defined (it does not).


> The 2.4x size difference is kinda expected, since your PK includes your
> biggest column I suspect (i.e. value), so both the table and its PK index
> must store it, at least doubling the storage, which, along with page waste,
> clearly can account for the difference.
> 
> And the time difference can be at least in part attributed to the increased
> IO.

That was my conclusion too, and that's why I would like to prevent the creation 
of this extraneous index. I can achieve it using `WITHOUT ROWID` and not with 
the pragma, and that's what I don't understand. So I can achieve it, just in a 
way I don't understand, which gives me the feeling I'm using a trick, and I 
don't like it. Unless ROWID is indeed what's responsible for the creation of 
these extraneous indexes?

> It seems unusual to use the HTML text in the PK. Are you sure you need it?

They're short texts (one to ten characters on average): just the element name, 
attribute name and attribute value, unique triplets with a count of their 
occurrences. There are only a few attribute values which may be a bit long, the 
href of <a> elements. I must say I simplified my explanations; it's not really 
the index, it's a prior step which gathers some statistics that are later used 
to decide which elements and attributes will be focused on to index the 
contents. The text the final index stores is reasonably short, title-like text.


-- 
Yannick Duchêne


[sqlite] Is there an index of conformances and deviations against standard SQL?

2016-01-28 Thread Robert Weiss
In partial answer to the original question:
http://sqlite.org/omitted.html

http://sqlite.org/nulls.html

http://sqlite.org/conflict.html

Furthermore any non-standard extension, such as FTS3 and ATTACH, might be 
regarded as a deviation from the standard.

And after consulting https://en.wikipedia.org/wiki/SQL-92: does SQLite have a 
MUMPS binding or a bit string variable type?
Bob Weiss 

On Thursday, January 28, 2016 12:06 PM, Simon Slavin  wrote:



On 28 Jan 2016, at 6:41pm, Yannick Duchêne  wrote:

> I wonder if this is the same with SQLite4

You can ignore SQLite4.  It is purely experimental at the moment and any 
feature may change or disappear or work differently before launch.  Nobody 
should be using SQLite4 for anything important.

Simon.





[sqlite] Is there an index of conformances and deviations against standard SQL?

2016-01-28 Thread Simon Slavin

On 28 Jan 2016, at 6:41pm, Yannick Duchêne  wrote:

> I wonder if this is the same with SQLite4

You can ignore SQLite4.  It is purely experimental at the moment and any 
feature may change or disappear or work differently before launch.  Nobody 
should be using SQLite4 for anything important.

Simon.


[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Dominique Devienne
On Thu, Jan 28, 2016 at 7:17 PM, Yannick Duchêne  wrote:

> I thought automatic indexing was controlled by the pragma `automatic_index`,
> which seems to do nothing for me, unless I'm doing it the wrong way, while
> `WITHOUT ROWID` on table creation seems to effectively prevent automatic
> indexes, the latter with or without `PRAGMA automatic_index=0`.
>
> Is this expected, or is this me doing something wrong?
>

I'm not sure I understand what you expect this pragma to do, but inserts
typically don't involve automatic indexes, which are used only in queries
(SELECTs).

And automatic indexes are used only if there isn't a suitable index
already and the SQLite query planner thinks one will help.
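
(A quick way to see when the planner resorts to one, using the standard
Python binding; the exact EXPLAIN QUERY PLAN wording varies between SQLite
versions.)

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("CREATE TABLE a(x); CREATE TABLE b(x);")
# With no declared index on b.x, the planner may build one on the fly:
for row in con.execute("EXPLAIN QUERY PLAN SELECT * FROM a JOIN b ON a.x = b.x"):
    print(row)   # e.g. ... USING AUTOMATIC COVERING INDEX (x=?)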

Your insert does have a query, but it fits the PK, so no automatic index is
needed. (And you can likely use AUTOINCREMENT to avoid that query in the
first place.)

The 2.4x size difference is kinda expected, since your PK includes your
biggest column I suspect (i.e. value), so both the table and its PK index
must store it, at least doubling the storage, which, along with page waste,
clearly can account for the difference.

And the time difference can be at least in part attributed to the increased
IO.

It seems unusual to use the HTML text in the PK. Are you sure you need it?
--DD


[sqlite] Is there an index of conformances and deviations against standard SQL?

2016-01-28 Thread Yannick Duchêne
On Wed, 27 Jan 2016 20:10:01 +
Simon Slavin  wrote:

> There are later standards for SQL and SQLite and other SQL implementations 
> include some of the features in them.  But at this point different 
> implementations start to diverge from one-another far more.

With this, you fully answered the question :-)

> Given what you wrote you should definitely read SQL1992.  It is the core 
> standard for SQL and you will find some ways in which SQLite is not SQL1992.  
> However most SQL implementations implement most things in SQL1992.

Given what you wrote, I definitely should, indeed (I'll just have to print it, 
as it is a text file).

By the way, I learned about another divergence: with ROWID tables, SQLite 
allows NULL in PRIMARY KEY columns.

I wonder if this is the same with SQLite4 (I don't know, as Python does not 
support it).
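
(A minimal demonstration of the NULL-in-PRIMARY-KEY quirk with the standard
Python binding; it applies to rowid tables with a non-INTEGER primary key,
while WITHOUT ROWID tables reject the NULLs.)

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a TEXT PRIMARY KEY, b TEXT)")
con.execute("INSERT INTO t VALUES (NULL, 'one')")  # accepted on a rowid table
con.execute("INSERT INTO t VALUES (NULL, 'two')")  # accepted again: NULLs compare distinct
print(con.execute("SELECT count(*) FROM t WHERE a IS NULL").fetchone())  # (2,)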


-- 
Yannick Duchêne


[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Keith Medcalf
cr.execute('BEGIN IMMEDIATE')
try:
  rowid = cr.execute('select rowid from element_attribute_values where element=? and attribute=? and value=?',
                     (element, attribute, value)).fetchone()[0]
  cr.execute('update element_attribute_values set count=count+1 where rowid=?',
             (rowid,))
except:
  cr.execute('insert into element_attribute_values values (?, ?, ?, 1)',
             (element, attribute, value))
cr.execute('COMMIT')


could be replaced by (the below will probably be faster also):

cr.execute('BEGIN IMMEDIATE')
cr.execute('insert or ignore into element_attribute_values values (?, ?, ?, 0)',
           (element, attribute, value))
cr.execute('update element_attribute_values set count=count+1 where element=? and attribute=? and value=?',
           (element, attribute, value))
cr.execute('COMMIT')


Same caveat: if you want more speed, you will have to move your transaction 
out to cover more updates.
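
(A sketch of that, reusing the `cr` cursor from above; `batch` stands in for a
hypothetical iterable of (element, attribute, value) tuples.)

cr.execute('BEGIN IMMEDIATE')
for element, attribute, value in batch:
  cr.execute('insert or ignore into element_attribute_values values (?, ?, ?, 0)',
             (element, attribute, value))
  cr.execute('update element_attribute_values set count=count+1 where element=? and attribute=? and value=?',
             (element, attribute, value))
cr.execute('COMMIT')   # one commit (and one fsync) for the whole batch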







[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Keith Medcalf

> > I'm not sure I understand what you expect this pragma to do, but inserts
> > typically don't involve automatic indexes, which are used only in
> > queries (SELECTs).
> 
> I thought it was responsible for the `sqlite_autoindex_"table_name"_N`
> indexes which get generated and increase the DB size. It's the name and the
> documentation which made me believe this. I can see these indexes in
> sqlitebrowser and in the file size.

When you create a table thusly:

create table x (x primary key, y, z);

you are creating a rowid table with columns x, y, and z.  You are also saying 
that you want x to be the primary key.  Therefore, you will get a table called 
x containing (ROWID, X, Y, Z).  This is a btree where the index is the rowid.  
In order to implement your PRIMARY KEY, an index must be constructed containing 
the column X and the ROWID of the table row where the value of X is found.  
This index will be called sqlite_autoindex_x_1 because it is the first index 
created automatically on table x.

You could achieve the same result as follows:

create table x (x, y, z);
create unique index pk_x_x on x(x);

except that now you have given the index on x an explicit name and one does not 
have to be manufactured for you.

Similarly, if you used:

create table x (x primary key, y unique, z);

you would get a table containing (rowid, x, y, z) and an sqlite_autoindex_x_1 
implementing the unique key on x, and an sqlite_autoindex_x_2 implementing the 
unique constraint (index) on y.  If you wanted to explicitly specify the names 
of the indexes then you could say:

create table x (x, y, z);
create unique index pk_x_x on x(x);
create unique index uk_x_y on x(y);

The end result is the same.  These are indexes that you have DECLARED that you 
want and are entirely different from automatic indexes used to increase the 
performance of queries.

When you create a table thusly:

create table x (x primary key, y, z) without rowid;

you are creating a table without a rowid.  The table x consists entirely of an 
index where the indexed value is x, and the index "result" is y, z.  There is 
no actual table.
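
(You can watch these names being assigned by listing the indexes in
sqlite_master; a small sketch with the standard Python binding:)

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE x (x TEXT PRIMARY KEY, y TEXT UNIQUE, z TEXT)")
for name, tbl in con.execute("SELECT name, tbl_name FROM sqlite_master WHERE type='index'"):
    print(name, tbl)   # sqlite_autoindex_x_1 x / sqlite_autoindex_x_2 x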

The problem with your original query is the use of INSERT OR REPLACE.  Do not 
do that.  REPLACE is implemented as a DELETE (which means that all the indexes 
need to be updated and the trees rebalanced) followed by an INSERT (which 
requires all the indexes to be updated and the trees rebalanced again).  
Instead you should determine what you want to do and do it.  For example:

cr.execute('BEGIN IMMEDIATE')
try:
  # fetchone() returns None when no such row exists yet, so the [0]
  # subscript raises and we fall through to the INSERT below
  rowid = cr.execute('select rowid from element_attribute_values where element=? and attribute=? and value=?',
                     (element, attribute, value)).fetchone()[0]
  cr.execute('update element_attribute_values set count=count+1 where rowid=?',
             (rowid,))
except:
  cr.execute('insert into element_attribute_values values (?, ?, ?, 1)',
             (element, attribute, value))
cr.execute('COMMIT')


In this way you never need to delete and then re-insert the same row.  You can 
speed it up even more by moving your transactions out to cover multiple 
operations.

Remember, the only way to optimize I/O is not to do it.








[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-01-28 Thread Yannick Duchêne
Hi people,

I thought automatic indexing was controlled by the pragma `automatic_index`, 
which seems to do nothing for me, unless I'm doing it the wrong way, while 
`WITHOUT ROWID` on table creation seems to effectively prevent automatic 
indexes, the latter with or without `PRAGMA automatic_index=0`.

Is this expected, or is this me doing something wrong?

I'm concerned about it mainly for file size, more than for speed, though that 
matters too, as the program is far slower than expected. For a program I'm 
running I got this:

  * With ROWID: execution time is 22 seconds and the DB file size is 1.9MB;
  * Without ROWID: execution time is 19 seconds and the DB file size is 788KB 
(2.4 times smaller);
  * With insertion disabled: execution time is 6 seconds, so most of the 
execution time is spent in row insertion.

The program indexes HTML files (*); the test runs on only 52 average-sized 
HTML files, so the difference will be even more important with a bigger file 
set.

I'm feeling lost about this: the pragma seems to do nothing, and it's `WITHOUT 
ROWID` on table creation which seems to do it instead. The pragma is executed 
right after the cursor creation. I'm using APSW (**). Pragmas seem to work, as 
I tested with `PRAGMA synchronous=0` (just for a test, I really don't want to 
use this).


I don't know if it's OK to ask this kind of question here. If it's not, just 
tell me; I won't bother.


(*) There are about 103 000 insertion queries ending in 15 600 rows, using 
this:

CREATE TABLE element_attribute_values
  (element TEXT NOT NULL,
   attribute TEXT NOT NULL,
   value TEXT NOT NULL
     CHECK ((attribute IS NOT '') OR (value IS '')),
   count INTEGER NOT NULL DEFAULT 1,
   PRIMARY KEY (element, attribute, value))

INSERT OR REPLACE INTO element_attribute_values
  (element, attribute, value, count)
  VALUES (:element, :attribute, :value,
    (SELECT count + 1 FROM element_attribute_values
      WHERE (element IS :element)
        AND (attribute IS :attribute)
        AND (value IS :value)))

(**) The standard Python SQLite3 binding seems perceptibly slow. With APSW, 
the program is 17 times faster than with the standard Python SQLite3 binding: 
execution time is around 370 seconds in the latter case, however I manage 
transactions.


-- 
Yannick Duchêne


[sqlite] Wish List for 2016: High Level API for Object Oriented Interactive Languages

2016-01-28 Thread Jim Callahan
Today, I found a Python package, "JayDeBeApi" that accesses SQLite via JDBC
rather than directly through the C language Call Level Interface (CLI).
https://github.com/baztian/jaydebeapi/blob/master/README.rst

This might provide the object oriented interface I have been looking for!
Has anyone tried it?

The SQLite3 CLI matches the specifications of an interface that was
intended for the C programming language. The CLI specs were intended for C
and COBOL, not designed with an object-oriented language in mind.
"Developed in the early 1990s, the API was defined only for the programming
languages C and COBOL."
https://en.wikipedia.org/wiki/Call_Level_Interface

By contrast, JDBC was designed for Java (an object oriented language) as a
substitute for Microsoft's ODBC.

Now that a Python developer is using JDBC instead of CLI we may get a more
object oriented interface:

"JDBC connections support creating and executing statements.  ...Query
statements return a JDBC row result set. The row result set is used to walk
over the result set . Individual
columns  in a row are
retrieved either by name or by column number. There may be any number of
rows in the result set. The row result set has metadata that describes the
names of the columns and their types."
https://en.wikipedia.org/wiki/Java_Database_Connectivity

As I recall there are at least three JDBC drivers for SQLite3, one by
Werner
http://www.ch-werner.de/javasqlite/

one by Saito (Xerial)
https://github.com/xerial/sqlite-jdbc/blob/master/README.md

and Saito's was forked (with attribution from Zentus)
https://github.com/crawshaw/sqlitejdbc

I am hopeful this new JDBC-based interface will provide a satisfactory
high-level channel between SQLite3 and Python.
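
(A minimal sketch of what such a connection might look like, assuming a
recent JayDeBeApi and the Xerial driver jar on hand; the paths are made up,
and the exact connect() signature has changed between JayDeBeApi versions,
so check its README.)

import jaydebeapi

conn = jaydebeapi.connect("org.sqlite.JDBC",            # Xerial driver class
                          "jdbc:sqlite:/tmp/test.db",   # hypothetical DB path
                          jars="/tmp/sqlite-jdbc.jar")  # hypothetical jar path
curs = conn.cursor()
curs.execute("select 1")
print(curs.fetchall())
curs.close()
conn.close()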

Jim Callahan

On Thu, Jan 7, 2016 at 7:55 PM, Warren Young  wrote:

> On Jan 7, 2016, at 5:22 PM, Jim Callahan 
> wrote:
> >
> > I believe R has remarkably good interface packages for SQLite
>
> That's the appropriate level: the high-level language's DB access layer
> should map the low-level C record-at-a-time API to an appropriate
> language-level abstraction.
>
> R almost forces you to do this because of things like data.frame.  But,
> that's no argument for other HLL DBMS API writers not to provide similar
> affordances.
>
> I've been involved with two different C++ DBMS wrapper libraries, and both
> of them provide a way to get a std::vector<> as a result set instead of
> iterating over individual rows.  As with R's SQLite wrapper, I felt it was
> my C++ code's responsibility to do this repackaging, not the underlying C
> DBMS access API.
>
> That's not to say that the SQLite C API has no warts:
>
> 1. sqlite3_column_*() uses 0-based indices but sqlite3_bind_*() uses
> 1-based indices.  I can cope with either base, but please pick one!  (And
> make it the correct base for programming, 0.  (Yes, I know I just praised R
> above.  R's use of 1-based arrays is WRONG.))
>
> 2. There is no "preview" mechanism.  That is, you can't bind some
> parameters to a prepared query string and then get the resulting SQL,
> because SQLite substitutes the values into the query at a layer below the
> SQL parser.  This means that if you have an error in your SQL syntax or
> your parameters cause a constraint violation, your debug logging layer can
> only log the prepared query string, not the parameters that went into it,
> which makes it unnecessarily difficult to determine which code path caused
> you to get the error when looking at logs of a running system.
>
> 3. The query finalization code could be less picky.  If I prepare a new
> query without finalizing the previous one, I'd rather that SQLite didn't
> punish me by throwing errors unless I put it into a "lint" mode.  Just toss
> the half-finished prior query and move on, please.
>
> 4. There are several signs of backwards-compatible extensions which make
> the API more complex than if it were designed with the features from the
> start.  (e.g. _v2() APIs, the various ways to get error codes, etc.)
> Hopefully those doing the SQLite4 effort will feel free to break the API,
> jettisoning this historical baggage.




[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-01-28 Thread Howard Chu
Simon Slavin wrote:
>
> On 28 Jan 2016, at 1:38pm, Bernard McNeill  wrote:
>
>> ===
>> Like the user reading "saving OK" and throwing away the
>> Post-It with the original information
>> ===
>>
>> This is exactly my concern.
>> The user throwing away the Post-It is entirely reasonable if he sees a
>> message like that.
>>
>> Do you happen to know if Linux/Debian (which I think uses a journalling
>> filesystem) carries this risk?
>
> The problem is not at the software level.

Not true. There *is* a problem at the software level - on Linux, current BSD 
(and apparently also on QNX) you must fsync the containing directory when you 
make changes to the contents of a directory (create/delete/rename files). This 
is above and beyond whatever lies the hardware layer may tell you. It's a 
documented requirement in Linux, at least. It is also independent of whether 
or not the filesystem uses journaling.
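
(For the record, a minimal sketch of that requirement in Python on Linux;
the directory path is hypothetical.)

import os

fd = os.open("/path/to/db/dir", os.O_DIRECTORY)  # open the directory itself
try:
    os.fsync(fd)   # durably record the directory entry (create/rename/delete)
finally:
    os.close(fd)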

-- 
   -- Howard Chu
   CTO, Symas Corp.   http://www.symas.com
   Director, Highland Sun http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-01-28 Thread Simon Slavin

On 28 Jan 2016, at 1:38pm, Bernard McNeill  wrote:

> ===
> Like the user reading "saving OK" and throwing away the
> Post-It with the original information
> ===
> 
> This is exactly my concern.
> The user throwing away the Post-It is entirely reasonable if he sees a
> message like that.
> 
> Do you happen to know if Linux/Debian (which I think uses a journalling
> filesystem) carries this risk?

The problem is not at the software level.  Various operating systems and file 
systems are correctly programmed with regard to waiting for write commands to 
complete.  I don't know specifically about Debian but Linux has a good 
reputation for such things, and anyone who bothers to write a journalling file 
system would understand how to do things properly.

The problem is at the hardware level.  Standard disk drives (including their 
motherboard if they have one, and their firmware) are designed for speed, not 
integrity.  The assumption is that you will be using them to play games or 
write your CV in Word, not to keep vital data.  So they are set up, using their 
default jumper positions, to lie.  In order to keep their computer running as 
fast as possible, instead of

1) receive write command
2) perform write command
3) read that bit of disk to confirm the change
4) if not, bring SMART system into play and try writing it somewhere else
5) if succeed, tell the computer "I wrote that and it worked."
6) otherwise tell the computer "I wrote that and it failed."

they do this

1) receive write command
2) tell the computer "I wrote that and it worked."
3) perform write command
4) read that bit of disk to confirm the change
5) if not, bring SMART system into play and try writing it somewhere else

Doing things this way around means that the computer has to wait less than half 
the time to continue with the next task.  Which is what users want.

Notice that with this behaviour it doesn't matter how fault-tolerant or ACID 
your file system is, because it has to trust its storage system and the storage 
system is lying to it.

You can get server-quality disks (sometimes called "enterprise drives") which 
do this correctly.  They cost more because they have to survive longer and more 
complicated testing at the factory before they're sold to you.  And if you put 
one in someone's computer they'd complain that it takes 20 seconds to load a 
Facebook page and they get lousy frame rates on their favourite game.

Simon.


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-01-28 Thread Meinlschmidt Stefan
> ===
> Like the user reading "saving OK" and throwing away the
> Post-It with the original information
> ===
> 
> This is exactly my concern.
> The user throwing away the Post-It is entirely reasonable if he sees a
> message like that.
> 
> Do you happen to know if Linux/Debian (which I think uses a journalling
> filesystem) carries this risk?

I do not know Linux's current behaviour, especially with recent
filesystems. I seem to remember from last century that Linux by default
automatically issues a sync every 30s. So every filesystem that is
writable and uses a write-back cache (i.e. that doesn't write through
right away) would have a 0s-30s amnesia window. (V)FAT would be a candidate, 
but SD cards usually are parameterized at mount to actually write through, 
i.e. are less efficient but have no problem. I have no clue what journalling
file systems do here.

S.M.
-- 
Dipl.-Phys. (Univ) Stefan Meinlschmidt, Senior Software Engineer
Am Wolfsmantel 46, 91058 Tennenlohe, Germany
Tel: +49-8458-3332-531  stefan.meinlschmidt at esolutions.de
Fax: +49-8458-3332-20-531


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-01-28 Thread Bernard McNeill
===
Like the user reading "saving OK" and throwing away the
Post-It with the original information
===

This is exactly my concern.
The user throwing away the Post-It is entirely reasonable if he sees a
message like that.

Do you happen to know if Linux/Debian (which I think uses a journalling
filesystem) carries this risk?

Best regards


On Thu, Jan 28, 2016 at 8:00 AM, Meinlschmidt Stefan <
Stefan.Meinlschmidt at esolutions.de> wrote:

> >> Using the standard defaults (which avoid WAL), is there any
> >> possibility whatsoever of that last SQL transaction being lost?
> >
> > I have an unusual answer: Yes, and it doesn't matter.
>
> While I agree in principle, your answer depends on some assumptions that
> need not hold.
>
> > Let's suppose, as you did, that the application got back "Commit
> > Successful" followed quickly by a power failure.  You want to know,
> > could the transaction be lost anyway?  But I ask you, what action could
> > the application possibly take, in that subsecond interval, that it
> > matters?
>
> Under the QNX OS using a QNX6 filesystem with default configuration,
> that "subsecond interval" is actually up to 10s. For any non-journalling
> filesystem (SD cards, anyone?) mounted without immediate write-through
> (for efficiency) on Linux, the interval can be, IIRC, up to 30s. So how
> much can happen in this period is very sensitive to details not
> necessarily under the control of or even available to the SQLite user.
>
> The application could for example write to some non-SQLite storage
> (other file system, raw flash, physical journal printout, ...) and try to
> guarantee global consistency by waiting for the SQLite transaction to
> complete. Like the user reading "saving OK" and throwing away the
> Post-It with the original information. Or (what we did) it could shut
> off device power.
>
> > There is no God's-eye view of application state.  The important service
> > provided by the DBMS is not "what's committed is definitely saved", but
> > rather that "what's committed is definitely *consistent*".
>
> So when your application requires consistency of some broader scope, you
> need the DBMS to give you enough rope^h^h^h^h^h^h^h^h^h^h^hthe tools to
> implement that yourself. Without a durability guarantee you're screwed.
>
> The more frequent, simpler use cases of course are not susceptible to that,
> and then indeed it doesn't matter.
>
> S.M.
> --
> Dipl.-Phys. (Univ) Stefan Meinlschmidt, Senior Software Engineer
> Am Wolfsmantel 46, 91058 Tennenlohe, Germany
> Tel: +49-8458-3332-531  stefan.meinlschmidt at esolutions.de
> Fax: +49-8458-3332-20-531


[sqlite] Build fail: undefined reference to `pthread_create'

2016-01-28 Thread jungle Boogie
Hi Robert,
On 27 January 2016 at 14:20, Robert Weiss  wrote:
> A few builds ago I got a similar message.  I don't recall whether I actively 
> tried to use threads or got blindsided by change.  The short story is that 
> linking in libpthread caused the build to succeed.  The longer story is that 
> threads were unreliable in my environment (Windows/cygwin), caused errors, 
> and I dropped them from later builds.

Hanno and I chatted yesterday and he provided a patch for me to test
that worked.
It was implemented in the commit below and now works as expected!
https://www.sqlite.org/src/info/47633ffdbfead3ce

Thanks to Hanno, DRH and you for your input.

> Bob Weiss
>

Thanks,
sean


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-01-28 Thread Meinlschmidt Stefan
>> Using the standard defaults (which avoid WAL), is there any
>> possibility whatsoever of that last SQL transaction being lost?
> 
> I have an unusual answer: Yes, and it doesn't matter.  

While I agree in principle, your answer depends on some assumptions that
need not hold.

> Let's suppose, as you did, that the application got back "Commit
> Successful" followed quickly by a power failure.  You want to know,
> could the transaction be lost anyway?  But I ask you, what action could
> the application possibly take, in that subsecond interval, that it
> matters?

Under the QNX OS using a QNX6 filesystem with default configuration,
that "subsecond interval" is actually up to 10s. For any non-journalling
filesystem (SD cards, anyone?) mounted without immediate write-through
(for efficiency) on Linux, the interval can be, IIRC, up to 30s. So how
much can happen in this period is very sensitive to details not
necessarily under the control of or even available to the SQLite user.

The application could for example write to some non-SQLite storage
(other file system, raw flash, physical journal printout, ...) and try to
guarantee global consistency by waiting for the SQLite transaction to
complete. Like the user reading "saving OK" and throwing away the
Post-It with the original information. Or (what we did) it could shut
off device power.

> There is no God's-eye view of application state.  The important service
> provided by the DBMS is not "what's committed is definitely saved", but
> rather that "what's committed is definitely *consistent*".

So when your application requires consistency of some broader scope, you
need the DBMS to give you enough rope^h^h^h^h^h^h^h^h^h^h^hthe tools to
implement that yourself. Without a durability guarantee you're screwed.

The more frequent, simpler use cases of course are not susceptible to that,
and then indeed it doesn't matter.
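
(For SQLite specifically, the durability knob is PRAGMA synchronous; a
minimal sketch with the standard Python binding, where the file name and
table are made up. Note that even synchronous=FULL can only be as honest as
the drive underneath, which is the point of this thread.)

import sqlite3

con = sqlite3.connect("app.db", isolation_level=None)   # hypothetical file
con.execute("PRAGMA synchronous=FULL")   # fsync before COMMIT returns
con.execute("CREATE TABLE IF NOT EXISTS log (msg TEXT)")
con.execute("BEGIN")
con.execute("INSERT INTO log VALUES ('committed should mean on disk')")
con.execute("COMMIT")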

S.M.
-- 
Dipl.-Phys. (Univ) Stefan Meinlschmidt, Senior Software Engineer
Am Wolfsmantel 46, 91058 Tennenlohe, Germany
Tel: +49-8458-3332-531  stefan.meinlschmidt at esolutions.de
Fax: +49-8458-3332-20-531


[sqlite] Recover corrupt database

2016-01-28 Thread Dan Kennedy
On 01/27/2016 08:47 PM, J Decker wrote:
> I was wondering if the backup api (sqlite3_backup_init, et al. ) will
> help to recover a corrupt DB?  All the other mechanisms are an export
> and re-import into a new DB...
>
> The database exists in a encrypted database that it's really difficult
> to get the keys outside of the program and I don't look forward to
> doing something that iterates through sqlite_master to get all the
> tables do all the selects and re-insert... so I browsed the API and
> found the _backup_ stuff.. but; I'm starting to think that it will
> just be a faithful page replication and the result will still be
> corrupted.

Most likely, yes.

As you have surmised, the backup API does a page-by-page copy.
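
(A minimal sketch of that page-by-page copy with the standard Python binding,
Python 3.7+; the file names are hypothetical. Since pages are copied
verbatim, corruption in the source carries over to the copy.)

import sqlite3

src = sqlite3.connect("corrupt.db")   # hypothetical source
dst = sqlite3.connect("copy.db")      # hypothetical destination
src.backup(dst)                       # faithful page replication
src.close()
dst.close()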

Dan.