Re: [sqlite] A possible double bug?

2016-10-16 Thread Jens Alfke
I’d say the mistake here is converting a double to a string before inserting it 
into the database. Any time you convert between floating point and decimal (or 
vice versa) you can lose accuracy, and are not guaranteed round-trip fidelity.

(0.1, 0.01, 0.001, etc. do not have finite-length exact representations in 
binary, just like 1/7 doesn’t in decimal. So most non-integers that look 
reasonable in decimal are in fact subject to round-off errors in binary 
floating point.)

As I said earlier today about strings: don’t hardcode data values into SQL 
statements. Use bindings instead.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A possible double bug?

2016-10-16 Thread Keith Medcalf

On Sunday, 16 October, 2016 12:03, Victor Evertsson 
 wrote:

> I was wondering about the different behavior of inserting a Double as a
> String vs as a value with a prepare statement in C.
 
> Consider an example when the value: 62.027393 is inserted as a String and
> as value with a prepared statement, for instance:

The value 62.027393 cannot be represented exactly in IEEE 754 double precision 
floating point.  The bounding values are 62.027392645 and 
62.027393355.  The 1 ULP (epsilon) value is 7.105427357601e-15.
 
> "CREATE TABLE test (foo REAL)"
> "INSERT INTO test (foo) VALUES (?)"
> "INSERT INTO test (foo) VALUES (62.027393)"
> "SELECT * FROM test"

> If the content of the table test, is printed, then the output of the
> values is equal i.e. 62.027393. However, if the stored value is compared 
> with for instance a cross join:
 
> select * from test as a cross join test as b where a.foo = b.foo;
 
> Then two rows are returned which indicates that the values are not equal
> (four rows should be returned if they are equal).

They are equal for all intents and purposes.  Your comparison is simply too 
exacting, requiring the approximations to be "equal", whereas both values are 
valid approximations of 62.027393.
 
> If the value 62.0273934 is inserted as value with the prepare
> statement instead of 62.027393 in the example, then the insert as String
> and the insert as prepare statement is equal.
 
> The double seems to be changed from 62.027393 to 62.0273934 when
> inserted as a String. This happens with some other values too (but not
> all).

> The values should be equal and i wonder if this is a bug or intendent
> behavior?

http://floating-point-gui.de/errors/comparison/
https://en.wikipedia.org/wiki/IEEE_floating_point

It is neither a bug nor intended behaviour.  It is simply how binary floating 
point works.  When you compare floating point numbers, you need to compute the 
distance between them in epsilon units of the comparand.  If they are within a 
reasonable "distance" of each other, then they are equal.

For example, if abs((x-y)/epsilon(x)) < T then then the numbers are equal.  For 
non-pathological computations, a value of 5 for T is more than adequate.





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] A possible double bug?

2016-10-16 Thread Victor Evertsson
Hi,

I was wondering about the different behavior of inserting a Double as a
String vs as a value with a prepare statement in C.

Consider an example when the value: 62.027393 is inserted as a String and
as value with a prepared statement, for instance:

"CREATE TABLE test (foo REAL)"

"INSERT INTO test (foo) VALUES (?)"

"INSERT INTO test (foo) VALUES (62.027393)"

"SELECT * FROM test"


If the content of the table test, is printed, then the output of the values
is equal i.e. 62.027393. However, if the stored value is compared with for
instance a cross join:

select * from test as a cross join test as b where a.foo = b.foo;

Then two rows are returned which indicates that the values are not equal
(four rows should be returned if they are equal).


If the value 62.0273934 is inserted as value with the prepare
statement instead of 62.027393 in the example, then the insert as String
and the insert as prepare statement is equal.

The double seems to be changed from 62.027393 to 62.0273934 when
inserted as a String. This happens with some other values too (but not all).


The values should be equal and i wonder if this is a bug or intendent
behavior?


Best regards

Victor
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Cecil Westerhof
2016-10-16 21:05 GMT+02:00 Keith Medcalf :
>
> See https://www.sqlite.org/optoverview.html
> under section 10.0 Query Flattening
>
> Your query is:
>
> SELECT * FROM  WHERE 
>
> which could be treated as
>
> SELECT *
>   FROM (view select statement)
>  WHERE condition
>
> and then flattened.  Note however that the query WILL NOT be flattened 
> because of rule #2, the subselect in the FROM clause contains an aggregate ...

OK, I understand. The query is mostly run in a cron job. So I think I
go for the ‘expensive’ one, because that is more clear.


>> -Original Message-
>> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>> On Behalf Of Jens Alfke
>> Sent: Sunday, 16 October, 2016 12:17
>> To: SQLite mailing list
>> Subject: Re: [sqlite] Why takes the second SELECT three times as much
>> time?
>>
>>
>> > On Oct 16, 2016, at 4:49 AM, Luuk  wrote:
>> >
>> > Because your second query has to build the complete view before it can
>> decide if a result is between the selected dates?
>>
>> I didn’t think a view had a physical manifestation that had to be built; I
>> thought it was just a shortcut/macro for a nested SELECT statement.
>> Or is the query optimizer not able to convert the nested SELECT into the
>> same form as the first query?
>>
>> —Jens
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Keith Medcalf

See https://www.sqlite.org/optoverview.html
under section 10.0 Query Flattening

Your query is:

SELECT * FROM  WHERE 

which could be treated as

SELECT *
  FROM (view select statement)
 WHERE condition

and then flattened.  Note however that the query WILL NOT be flattened because 
of rule #2, the subselect in the FROM clause contains an aggregate ...

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Sunday, 16 October, 2016 12:17
> To: SQLite mailing list
> Subject: Re: [sqlite] Why takes the second SELECT three times as much
> time?
> 
> 
> > On Oct 16, 2016, at 4:49 AM, Luuk  wrote:
> >
> > Because your second query has to build the complete view before it can
> decide if a result is between the selected dates?
> 
> I didn’t think a view had a physical manifestation that had to be built; I
> thought it was just a shortcut/macro for a nested SELECT statement.
> Or is the query optimizer not able to convert the nested SELECT into the
> same form as the first query?
> 
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Simon Slavin

On 16 Oct 2016, at 7:17pm, Jens Alfke  wrote:

> I didn’t think a view had a physical manifestation that had to be built; I 
> thought it was just a shortcut/macro for a nested SELECT statement.
> Or is the query optimizer not able to convert the nested SELECT into the same 
> form as the first query?

The query optimizer cannot flatten all sub-SELECTs.  It always understands JOIN 
but the analysis needed for every possible SELECT combination is too much.

To figure out what it's doing use EXPLAIN QUERY PLAN.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Jens Alfke

> On Oct 16, 2016, at 4:49 AM, Luuk  wrote:
> 
> Because your second query has to build the complete view before it can decide 
> if a result is between the selected dates?

I didn’t think a view had a physical manifestation that had to be built; I 
thought it was just a shortcut/macro for a nested SELECT statement.
Or is the query optimizer not able to convert the nested SELECT into the same 
form as the first query?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Regarding Hebrew data in Sqlite db

2016-10-16 Thread Jens Alfke

> On Oct 15, 2016, at 9:26 AM, Vaibhav Shah  wrote:
> 
> I am using sqlite3.exe for bulk insertion in C#. I am facing issue when
> insert Hebrew data. As it contains double qoute(") as character and it does
> not support in insertion.

It’s almost always a bad idea to put variable string data directly into a SQL 
query, which is what it sounds like you’re doing. If you don’t follow the 
proper quoting rules, your code becomes vulnerable to SQL injection attacks. 
Even if you do quote correctly, you’re making SQLite parse and compile your 
query every time you run it, which is bad for performance.

Instead you should be using placeholders like “?” or “:name” in your query, 
compiling it once, then binding the values when you run it. This lets you pass 
the string in directly with no need to worry about quoting/escaping.

—Jens

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IN verses EXISTS Query Speed

2016-10-16 Thread Keith Medcalf

It depends on the data shape.

In the first case, using a correlated subquery, the outer table query is 
processed applying all applicable joins and where conditions, and if and only 
if the row is still a candidate is the correlated subquery performed.  If it 
succeeds the result row is output.  For maximum efficiency you need indexes 
covering (at least) the correlated columns.

In the second case, the subselect to generate the list for in is executed first 
and the results put in a temporary b-tree.  Then either (a) this list is 
traversed to find all the initial candidates in the outer query if the 
appropriate indexes exist, else the outer query is performed checking whether 
the outer candidate row is in the result set from the inner subquery.

Assuming that proper indexes exist the first form (correlated subquery) will 
almost always be quicker since it will probabilistically requires less 
operations in all cases.

In the case where tableb's size closely matches the set of candidates from the 
outer query, the two will execute at the same speed.  As the size of tableb 
increases (becomes less selective) the speed of the second form will decrease.

Take the case where tablea has a billion rows, and tableb has a billion rows, 
and the intersection is five rows.  Which WHERE condition would you expect to 
be quicker?  Now you may think that a given application will only ever have 5 
rows in tableb (or only has five rows during testing) and therefore the choice 
doesn't matter.  

Sometimes this is an incorrect assumption and this is what leads to the 
"billion dollars spent on a new system that does not work and must be thrown in 
the dust bin" stories in the news.

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Dave Blake
> Sent: Sunday, 16 October, 2016 00:58
> To: SQLite mailing list
> Subject: [sqlite] IN verses EXISTS Query Speed
> 
> Some simple testing is showing using an EXISTS statement is generally
> quicker then using an IN
> 
> e.g.
> 
> SELECT * FROM tablea
> WHERE EXISTS (SELECT 1 FROM tableb WHERE tablea.id  = tableb.id AND ...)
> 
> is quicker than
> SELECT * FROM tablea
> WHERE tablea.id IN  (SELECT tableb.id FROM tableb WHERE ...)
> 
> Is there any reason for this to be always true in SQLite, or is it query
> dependant?
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] IN verses EXISTS Query Speed

2016-10-16 Thread Clemens Ladisch
Dave Blake wrote:
> SELECT * FROM tablea
> WHERE EXISTS (SELECT 1 FROM tableb WHERE tablea.id  = tableb.id AND ...)
>
> is quicker than
> SELECT * FROM tablea
> WHERE tablea.id IN  (SELECT tableb.id FROM tableb WHERE ...)
>
> Is there any reason for this to be always true in SQLite, or is it query
> dependant?

I can imagine databases and queries where the second form would be faster.

But does the difference even matter?


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a best practice for breaking up a large update?

2016-10-16 Thread Kevin O'Gorman
There are other writers, but they create new stuff, while this big update
wants to consolidate info about existing stuff.  There are also pure
readers.

Some of the docs on WAL logging had caveats that put me off.  Not knowing
exactly what I was going to be doing with the database, I could not be sure
if the caveats applied to me or not.  Particularly because I had some
trouble understanding the caveats, since some of the terminology is a bit
new to me.  Not all of it -- I actually taught a database course at the
undergrad level once, SQL-based, but it was quite a while ago.  But most of
my database experience was either following SQL recipes, And a long time
ago I wrote a full database server from scratch (not relational --
heirarchical) in assembler.

On Sat, Oct 15, 2016 at 6:14 PM, Darren Duncan 
wrote:

> You didn't say if the other tasks need write access to the database or if
> it is just read-only.  If the others only need read-only, let them access a
> copy of the database while you make your changes in another copy, then just
> swap the databases when done. -- Darren Duncan
>
>
> On 2016-10-15 1:21 PM, Kevin O'Gorman wrote:
>
>> I'm new to this, and working in Python's sqlite3.  So be patient, and
>> don't
>> expect me to know too much.  This is also a personal hobby, so there's
>> nobody else for me to ask.
>>
>> I've got a database of a some tens of millions of positions in a board
>> game.  It may be over a billion before I'm done (working in an 11-TB
>> partition at the moment.) I want to process a major slice of these
>> records,
>> in order to update other records.  I might want to break the work up
>> into chunks to allow other access to the database while this is going on.
>>
>> So I have some questions:
>> (1) If I do all of my updates to a temporary table, does the database
>> still
>> get locked?
>>
>> (2) Is there another way to keep it available?  It happens for this
>> activity that consistency is not at risk.
>>
>> (3) If it turns out that I can avoid locking, it there still a performance
>> reason to break the transaction into chunks, or would I be as well off
>> doing it as a single transaction (assuming I have disk space for the
>> journal).
>>
>> (4) If I break it up into chunks, I can think of several ways to do that
>> and keep track of what's been done and what has not.  Is there a best
>> practice for this?
>>
>> (5) Is there a forum specifically for folks doing Python database
>> programming?  It occurs to me that much of what I'm asking about is not
>> specific to SQLite.  But my code, for now at least, is going to be in
>> Python because it's the easiest way I can see, and my own efforts are the
>> real bottleneck in making progress.
>>
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Luuk



On 16-10-16 12:00, Cecil Westerhof wrote:

I have defined the following table:
 CREATE TABLE messages (
 dateTEXT NOT NULL DEFAULT CURRENT_DATE,
 timeTEXT NOT NULL DEFAULT CURRENT_TIME,
 typeTEXT NOT NULL,
 messageTEXT NOT NULL,

 PRIMARY KEY (date, time, type)
 );
 CREATE INDEX messages_date ON messages(date);
 CREATE INDEX messages_time ON messages(time);
 CREATE INDEX messages_type ON messages(type);

And the following view:
 CREATE VIEW temperatureStatistics AS
 SELECT   date AS Date
 ,MIN(message) AS Minimum
 ,MAX(message) AS Maximum
 ,AVG(message) AS Average
 ,COUNT(*) AS Count
 FROM messages
 WHEREtype = 'cpu-temp'
 GROUP BY date

I also created the folowing two queries:
 SELECT   date AS Date
 ,MIN(message) AS Minimum
 ,MAX(message) AS Maximum
 ,AVG(message) AS Average
 ,COUNT(*) AS Count
 FROM messages
 WHEREtype = 'cpu-temp'
  AND date BETWEEN (SELECT date('now', '-7 day'))
   AND (SELECT date('now', '-1 day'))
 GROUP BY date
 ORDER BY date DESC

and:
 SELECT   *
 FROM temperatureStatistics
 WHEREdate BETWEEN (SELECT date('now', '-7 day'))
   AND (SELECT date('now', '-1 day'))
 ORDER BY date DESC

But the first one is about three times as fast as the second one. What
am I doing wrong here?



Because your second query has to build the complete view before it can 
decide if a result is between the selected dates?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Regarding Hebrew data in Sqlite db

2016-10-16 Thread Simon Slavin

On 15 Oct 2016, at 5:26pm, Vaibhav Shah  wrote:

> I am using sqlite3.exe for bulk insertion in C#. I am facing issue when
> insert Hebrew data. As it contains double qoute(") as character and it does
> not support in insertion.

Dear Vaibhav,

Happy to help if you can demonstrate the problem but I think you are 
surrounding your text with double-quote instead of the single-quote character 
SQLite expects.  In SQLite text is delimited using the single quote character 
(').  The double-quote character is treated like any other character and no 
special processing is done for it.

SQLite version 3.14.0 2016-07-26 15:17:14
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE MyTable (myVar TEXT);
sqlite> INSERT INTO MyTable VALUES ('hello world');
sqlite> INSERT INTO MyTable VALUES ('hello " double-quote');
sqlite> SELECT * FROM MyTable;
hello world
hello " double-quote
sqlite> 

But I don't think you are really using that character at all.  I think you want 
the gershayim character/accent which looks like double-quote but isn't it.  I 
tried the two I found:

U+059E  D6 9E   HEBREW ACCENT GERSHAYIM
U+05F4  D7 B4   HEBREW PUNCTUATION GERSHAYIM

Again, these work without problems in SQLite.

sqlite> INSERT INTO MyTable VALUES ('hello ֞֞ gershayim accent');
sqlite> INSERT INTO MyTable VALUES ('hello ״ gershayim character');

sqlite> SELECT * FROM MyTable;
hello world
hello " double-quote
hello ֞֞ gershayim accent
hello ״ gershayim character

These tests performed on a Mac but I don't see why any other OS should do 
anything different.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Inserting a new record (anywhere) into a table of ordered records that have an integer auto-increment primary key

2016-10-16 Thread Chris Locke
But be careful, as you can't change all records from 3 to 4 and then 4 to
5, as the 4 to 5 will contain the records you've just moved from 3 to 4
Canofworms.jpg. ;)

Thanks,
Chris

On 15 Oct 2016 5:46 p.m., "Richard Damon"  wrote:

> On 10/15/16 12:15 PM, Simon Slavin wrote:
>
>> On 14 Oct 2016, at 2:29pm, Thom Wharton 
>> wrote:
>>
>> I want to be able to programmatically insert a new record anywhere in
>>> that table. Let's suppose I want to create a new record between the records
>>> whose ID are 2 and 3.  This new record would need to take the ID of 3, and
>>> all subsequent records would need to have their primary keys updated.
>>>
>>> Is there a way to do this automagically (like a specialized INSERT
>>> command?) in Sqlite?
>>>
>> It's a silly thing to want to do since it makes a nonsense of what IDs
>> are for.  I think you need to rethink what you're trying to do by changing
>> existing IDs.
>>
>> However, if you really need to do it ...
>>
>> UPDATE MyTable SET ID = ID + 1 WHERE ID >= 3;
>> INSERT INTO MyTable ...
>>
>> Simon.
>>
>> And, if any other table refers to records in that table via that ID
> field, THEY need to be changed too, all in an 'atomic' transaction.
>
>
> --
> Richard Damon
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Regarding Hebrew data in Sqlite db

2016-10-16 Thread Vaibhav Shah
Hello team,

I am using sqlite3.exe for bulk insertion in C#. I am facing issue when
insert Hebrew data. As it contains double qoute(") as character and it does
not support in insertion. After surfing, I came to know that I have to do
double it and append same as suffix and prefix of word then it will let you
insert. It is working but first column of first row is getting insert as
zero when that table have double qoute character.

Please help with this. And also for insertion double qoute if any easy
solution there for it as I have to handle that with code and have to
traverse whole file. Thank you.

Regards,
Vaibhav Shah
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Why takes the second SELECT three times as much time?

2016-10-16 Thread Cecil Westerhof
I have defined the following table:
CREATE TABLE messages (
dateTEXT NOT NULL DEFAULT CURRENT_DATE,
timeTEXT NOT NULL DEFAULT CURRENT_TIME,
typeTEXT NOT NULL,
messageTEXT NOT NULL,

PRIMARY KEY (date, time, type)
);
CREATE INDEX messages_date ON messages(date);
CREATE INDEX messages_time ON messages(time);
CREATE INDEX messages_type ON messages(type);

And the following view:
CREATE VIEW temperatureStatistics AS
SELECT   date AS Date
,MIN(message) AS Minimum
,MAX(message) AS Maximum
,AVG(message) AS Average
,COUNT(*) AS Count
FROM messages
WHEREtype = 'cpu-temp'
GROUP BY date

I also created the folowing two queries:
SELECT   date AS Date
,MIN(message) AS Minimum
,MAX(message) AS Maximum
,AVG(message) AS Average
,COUNT(*) AS Count
FROM messages
WHEREtype = 'cpu-temp'
 AND date BETWEEN (SELECT date('now', '-7 day'))
  AND (SELECT date('now', '-1 day'))
GROUP BY date
ORDER BY date DESC

and:
SELECT   *
FROM temperatureStatistics
WHEREdate BETWEEN (SELECT date('now', '-7 day'))
  AND (SELECT date('now', '-1 day'))
ORDER BY date DESC

But the first one is about three times as fast as the second one. What
am I doing wrong here?

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users