Re: [sqlite] Database speed comparison

2020-01-10 Thread Andy
I had sqlite3_prepare_v2 calls without matching sqlite3_finalize calls, and
sqlite3_close was not closing multiple databases.

Sat, 11 Jan 2020 at 07:17 Andy  wrote:

> These tests are much faster than calling SQLite from my program; I should use
> the same convention: for example, first sqlite3_prepare_v2, then sqlite3_bind_int64,
> sqlite3_bind_text, and sqlite3_step. My calling pattern was far from optimal.
>
> Fri, 10 Jan 2020 at 20:13 Simon Slavin  wrote:
>
>> Are you aware of
>>
>> 
>>
>> ?  Now all you have to do is write comparative tests for other SQL
>> implementations.  However, from what I've seen, realistic speed tests tend
>> to come down to cache sizes.
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database speed comparison

2020-01-10 Thread Andy
These tests are much faster than calling SQLite from my program; I should use
the same convention: for example, first sqlite3_prepare_v2, then sqlite3_bind_int64,
sqlite3_bind_text, and sqlite3_step. My calling pattern was far from optimal.

Fri, 10 Jan 2020 at 20:13 Simon Slavin  wrote:

> Are you aware of
>
> 
>
> ?  Now all you have to do is write comparative tests for other SQL
> implementations.  However, from what I've seen, realistic speed tests tend
> to come down to cache sizes.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
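The prepare/bind/step convention Andy describes can be sketched with Python's built-in sqlite3 module, where a single parameterized statement is prepared once and rebound per row (the table and data here are made up for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples(id INTEGER PRIMARY KEY, label TEXT)")

# Binding parameters to one statement is the moral equivalent of calling
# sqlite3_prepare_v2 once, then sqlite3_bind_* / sqlite3_step per row;
# executemany() lets SQLite reuse the compiled statement for every row.
rows = [(i, f"label-{i}") for i in range(1000)]
with con:  # wrap all inserts in a single transaction
    con.executemany("INSERT INTO samples(id, label) VALUES (?, ?)", rows)

count = con.execute("SELECT count(*) FROM samples").fetchone()[0]
print(count)  # 1000
```

The single transaction matters as much as statement reuse: without it, each INSERT pays for its own commit.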


Re: [sqlite] JSON_GROUP_ARRAY unexpected misuse error in UPDATE

2020-01-10 Thread Keith Medcalf

And, of course, if you want your JSON array to be in MRU order, you can use 
this:

update users as O
   set urls = (
   select json_group_array(distinct value)
 from (
   select ?2 as value
  union all
   select value
 from json_each(urls)
 join users
where userid = O.userid
  )
  )
 where userid == ?1
   and ?2 is not null;

or

update users as O
   set urls = (
   select json_group_array(value)
 from (
   select ?2 as value
  union all
   select value
 from json_each(urls)
 join users
where userid = O.userid
  and value != ?2
  )
  )
 where userid == ?1
   and ?2 is not null;

or for LRU order this:

update users as O
   set urls = (
   select json_group_array(value)
 from (
   select value
 from json_each(urls)
 join users
where userid = O.userid
  and value != ?2
  union all
   select ?2 as value
  )
  )
 where userid == ?1
   and ?2 is not null;

The advantage, of course, is that you can specify a collation such as nocase, as in 
"value != ?2 collate nocase", so that HttP://WwW.GooGle.Com is treated the same as 
http://www.google.com without having to case-normalize all your URLs first ...
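For illustration, the first (MRU-order) UPDATE above can be exercised from Python's sqlite3 module, which bundles the JSON1 functions in any recent build; the users row here is made-up test data:

```python
import sqlite3, json

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users(userid INTEGER PRIMARY KEY, urls TEXT)")
con.execute("INSERT INTO users VALUES (1, json_array('http://old.example'))")

# Keith's MRU-order update: put ?2 first, keep distinct values.
sql = """
UPDATE users AS O
   SET urls = (SELECT json_group_array(DISTINCT value)
                 FROM (SELECT ?2 AS value
                       UNION ALL
                       SELECT value
                         FROM json_each(urls)
                         JOIN users
                        WHERE userid = O.userid))
 WHERE userid == ?1
   AND ?2 IS NOT NULL
"""
con.execute(sql, (1, "http://new.example"))

urls = json.loads(con.execute(
    "SELECT urls FROM users WHERE userid = 1").fetchone()[0])
print(urls)
```

Running the same statement again with a URL already in the array leaves the array unchanged, since DISTINCT drops the duplicate.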

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Keith Medcalf
>Sent: Friday, 10 January, 2020 18:07
>To: SQLite mailing list 
>Subject: Re: [sqlite] JSON_GROUP_ARRAY unexpected misuse error in UPDATE
>
>
>On Friday, 10 January, 2020 14:35, Jason Dora  wrote:
>
>>I have a workflow where I would like to push an item onto a JSON array,
>>while ensuring the items on the array are unique. And I'm able to write
>a
>>working statement in a SELECT, but the same logic fails in a UPDATE.
>
>You need to define what you mean by "push an item onto a JSON array".  Do
>you want the array to be ordered by insertion order or merely contain
>sorted distinct entries?
>
>>Assume there is a table named "users" with the columns UserId and URLs.
>>UserId being the primary key and all URLs values being well formatted
>>JSON
>>e.g. [], ["http://google.com"], etc.
>
>>Assume then I want to add "http://foobar.com" to the URLs for UserId 1.
>
>>This SELECT will return the expected value:
>>SELECT JSON_GROUP_ARRAY((SELECT value FROM
>>JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
>>http://foobar.com')) GROUP BY value)) AS URLs FROM users WHERE UserId=1;
>
>This does not appear to actually do what you want ... at least not when I
>execute it with test data ...
>
>>Trying to UPDATE using the same pattern generates a "misuse of aggregate
>>function" error:
>>UPDATE users SET URLs=JSON_GROUP_ARRAY((SELECT value FROM
>>JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
>>http://foobar.com')) GROUP BY value)) WHERE UserId=1;
>
>That is a correct error message.
>
>JSON_GROUP_ARRAY is an aggregate function applicable to select ... it is
>not a scalar function.  It works exactly the same way all the time.
>
>The following query appends a value to the end of the array if it is not
>already in the array ...
>
>select json_group_array(distinct value)
>  from (
>select value
>  from (
>select value
>  from json_each(urls)
>  join users
> where userid=?
>)
>   union all
>values (?)
>   );
>
>and the corresponding update statement to add an arbitrary value bound as
>parameter 2 to the userid bound as parameter 1 would be:
>
>update users as O
>   set urls = (
>   select json_group_array(distinct value)
> from (
>   select value
> from (
>   select value
> from json_each(urls)
> join users
>where userid = O.userid
>   )
>  union all
>   values (?2)
>  )
>  )
> where userid == ?1
>   and ?2 is not null;
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>
>
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org

Re: [sqlite] JSON_GROUP_ARRAY unexpected misuse error in UPDATE

2020-01-10 Thread Keith Medcalf

On Friday, 10 January, 2020 14:35, Jason Dora  wrote:

>I have a workflow where I would like to push an item onto a JSON array,
>while ensuring the items on the array are unique. And I'm able to write a
>working statement in a SELECT, but the same logic fails in a UPDATE.

You need to define what you mean by "push an item onto a JSON array".  Do you 
want the array to be ordered by insertion order or merely contain sorted 
distinct entries?

>Assume there is a table named "users" with the columns UserId and URLs.
>UserId being the primary key and all URLs values being well formatted
>JSON
>e.g. [], ["http://google.com"], etc.

>Assume then I want to add "http://foobar.com" to the URLs for UserId 1.

>This SELECT will return the expected value:
>SELECT JSON_GROUP_ARRAY((SELECT value FROM
>JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
>http://foobar.com')) GROUP BY value)) AS URLs FROM users WHERE UserId=1;

This does not appear to actually do what you want ... at least not when I 
execute it with test data ...

>Trying to UPDATE using the same pattern generates a "misuse of aggregate
>function" error:
>UPDATE users SET URLs=JSON_GROUP_ARRAY((SELECT value FROM
>JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
>http://foobar.com')) GROUP BY value)) WHERE UserId=1;

That is a correct error message.

JSON_GROUP_ARRAY is an aggregate function applicable to select ... it is not a 
scalar function.  It works exactly the same way all the time.

The following query appends a value to the end of the array if it is not 
already in the array ...

select json_group_array(distinct value)
  from (
select value
  from (
select value
  from json_each(urls)
  join users
 where userid=?
)
   union all
values (?)
   );

and the corresponding update statement to add an arbitrary value bound as 
parameter 2 to the userid bound as parameter 1 would be:

update users as O
   set urls = (
   select json_group_array(distinct value)
 from (
   select value
 from (
   select value
 from json_each(urls)
 join users
where userid = O.userid
   )
  union all
   values (?2)
  )
  )
 where userid == ?1
   and ?2 is not null;

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] TRUSTED_SCHEMA and .param init in CLI

2020-01-10 Thread Keith Medcalf

The .param commands seem to have difficulty with SQLITE_DEFAULT_DEFENSIVE.  
If you define it (as in #define SQLITE_DEFAULT_DEFENSIVE, to either 1 or 0), then 
.param init does not create the temp.sqlite_parameters table.  pragma 
trusted_schema=1; then allows .param init to work, but you still cannot set 
parameter values.

If SQLITE_DEFAULT_DEFENSIVE is undefined, then .param works normally even after 
pragma trusted_schema=0;

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JSON_GROUP_ARRAY unexpected misuse error in UPDATE

2020-01-10 Thread Jason Dora
Hello SQLite masters,

I have a workflow where I would like to push an item onto a JSON array,
while ensuring the items on the array are unique. And I'm able to write a
working statement in a SELECT, but the same logic fails in a UPDATE.

Assume there is a table named "users" with the columns UserId and URLs.
UserId being the primary key and all URLs values being well formatted JSON
e.g. [], ["http://google.com"], etc.

Assume then I want to add "http://foobar.com" to the URLs for UserId 1.

This SELECT will return the expected value:
SELECT JSON_GROUP_ARRAY((SELECT value FROM
JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
http://foobar.com')) GROUP BY value)) AS URLs FROM users WHERE UserId=1;

Trying to UPDATE using the same pattern generates a "misuse of aggregate
function" error:
UPDATE users SET URLs=JSON_GROUP_ARRAY((SELECT value FROM
JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
http://foobar.com')) GROUP BY value)) WHERE UserId=1;

However, an additional INNER SELECT in the UPDATE will result in the
expected behavior:
UPDATE users SET URLs=(SELECT JSON_GROUP_ARRAY(value) FROM (SELECT value
FROM JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']','
http://foobar.com')) GROUP BY value)) WHERE UserId=1;

Since my expectation is that JSON_GROUP_ARRAY behaves the same in SELECTs and
UPDATEs, I think the above may be an SQLite error and am reporting it.
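The behavior described above is easy to reproduce from Python's sqlite3 module with made-up test data: the bare aggregate in the SET clause raises the "misuse of aggregate" error, while the inner-SELECT form succeeds:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users(UserId INTEGER PRIMARY KEY, URLs TEXT)")
con.execute("""INSERT INTO users VALUES (1, '["http://google.com"]')""")

# Appends http://foobar.com at the end of the JSON array.
append = ("JSON_EACH(JSON_SET(URLs,'$['||JSON_ARRAY_LENGTH(URLs)||']',"
          "'http://foobar.com'))")

# The aggregate used directly in the SET clause is rejected ...
try:
    con.execute("UPDATE users SET URLs=JSON_GROUP_ARRAY((SELECT value FROM "
                + append + " GROUP BY value)) WHERE UserId=1")
    error = None
except sqlite3.OperationalError as e:
    error = str(e)
print(error)

# ... while wrapping it in an inner SELECT works:
con.execute("UPDATE users SET URLs=(SELECT JSON_GROUP_ARRAY(value) FROM "
            "(SELECT value FROM " + append + " GROUP BY value)) WHERE UserId=1")
urls = con.execute("SELECT URLs FROM users WHERE UserId=1").fetchone()[0]
print(urls)
```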
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database speed comparison

2020-01-10 Thread Edward Lau
It would be nice to have that page updated to contrast the old and new 
performance of SQLite.


-Original Message-
From: Cecil Westerhof 
To: SQLite mailing list 
Sent: Fri, Jan 10, 2020 10:37 am
Subject: [sqlite] Database speed comparison

I ran across this page:
    https://sqlite.org/speed.html

It is a 'bit' outdated. (It uses 2.7.6.) Where can I find the scripts? And
would it be a lot of work to update them and rerun them regularly? If it is
manageable, I would not mind providing regular updates.

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Richard Damon

On 1/10/20 2:24 PM, Tim Streater wrote:

On 10 Jan 2020, at 18:55, Keith Medcalf  wrote:


On Friday, 10 January, 2020 11:44, Tim Streater  wrote:


On 10 Jan 2020, at 18:03, Richard Hipp  wrote:

On 1/10/20, Dominique Devienne  wrote:

There's no way at all, to know the length of a text column with
embedded NULLs?

You can find the true length of a string in bytes from C-code using
the sqlite3_column_bytes() interface. But I cannot, off-hand, think
of a way to do that from SQL.

But if I store UTF-8 in a TEXT column, surely I'm allowed to include
NULLs in that? They are after all valid UTF-8 characters.

No, they are not. The "NUL character" in Modified UTF-8 is the two-byte
sequence 0xC0 0x80. This is specifically so that 0x00 can be used as a string
terminator. Validly encoded UTF-8 encoded text stored in a C String (0x00
terminated sequence of bytes) must not contain an embedded 0x00 byte since
that byte terminates the sequence.

Nice, but Wikipedia has that as a "Derivative" and "incompatible with the UTF-8 
specification and may be rejected by conforming UTF-8 applications." It appears (though I may 
have missed it) not to be mentioned on this handy site either:

https://www.utf8-chartable.de/unicode-utf8-table.pl

I shall have to check what my preferred language's wrapper does.


It is incompatible in the sense that it uses an encoding that the UTF-8 
specification says is invalid, so an application that performs all the validity 
tests on incoming data would reject it. In many ways it is a compatible 
extension: excluding the one test that specifically makes the form invalid, 
processing it by the general rules of UTF-8 gives the expected result.


C Strings do not allow 0 bytes in them. This would normally mean that 
they do not allow the NUL character to be in a string. This extension 
allows a character which would be interpreted as the NUL character to be 
represented without needing a 0 byte.


It should be pointed out that most libraries won't be checking all the 
strings that pass through them to see if they violate the rule, as that 
is just adding a lot of overhead for very little benefit. It is really 
expected that applications will do this sort of test at the borders, 
when possibly untrusted strings come in, and know that if good strings 
come in, the following processing will keep the strings valid.


--
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Tim Streater
On 10 Jan 2020, at 18:55, Keith Medcalf  wrote:

> On Friday, 10 January, 2020 11:44, Tim Streater  wrote:
>
>>On 10 Jan 2020, at 18:03, Richard Hipp  wrote:
>
>>> On 1/10/20, Dominique Devienne  wrote:
 There's no way at all, to know the length of a text column with
 embedded NULLs?
>
>>> You can find the true length of a string in bytes from C-code using
>>> the sqlite3_column_bytes() interface. But I cannot, off-hand, think
>>> of a way to do that from SQL.
>
>>But if I store UTF-8 in a TEXT column, surely I'm allowed to include
>>NULLs in that? They are after all valid UTF-8 characters.
>
> No, they are not. The "NUL character" in Modified UTF-8 is the two-byte
> sequence 0xC0 0x80. This is specifically so that 0x00 can be used as a string
> terminator. Validly encoded UTF-8 encoded text stored in a C String (0x00
> terminated sequence of bytes) must not contain an embedded 0x00 byte since
> that byte terminates the sequence.

Nice, but Wikipedia has that as a "Derivative" and "incompatible with the UTF-8 
specification and may be rejected by conforming UTF-8 applications." It appears 
(though I may have missed it) not to be mentioned on this handy site either:

https://www.utf8-chartable.de/unicode-utf8-table.pl

I shall have to check what my preferred language's wrapper does.


-- 
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database speed comparison

2020-01-10 Thread Simon Slavin
Are you aware of



?  Now all you have to do is write comparative tests for other SQL 
implementations.  However, from what I've seen, realistic speed tests tend to 
come down to cache sizes.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Richard Damon

On 1/10/20 1:43 PM, Tim Streater wrote:

On 10 Jan 2020, at 18:03, Richard Hipp  wrote:


On 1/10/20, Dominique Devienne  wrote:

There's no way at all, to know the length of a text column with embedded
NULLs?


You can find the true length of a string in bytes from C-code using
the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
of a way to do that from SQL.

But if I store UTF-8 in a TEXT column, surely I'm allowed to include NULLs in 
that? They are after all valid UTF-8 characters.



As has been said, C strings (which is what TEXT is assumed to hold) are 
not allowed to include null characters, but are assumed to terminate at 
the first 0 byte. There is a variant of UTF-8, called Modified UTF-8 or 
MUTF-8, which allows the null character to be encoded as C0 80. That 
sequence does decode to 0 by the base UTF-8 rules but is disallowed by 
the minimum-encoding rule, so it can be used to embed nulls in strings 
if the system doesn't enforce the minimum-length encoding rule (at 
least for this character).


I have no idea if that would work with SQLite though.
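As an illustration of the substitution described above, a Modified-UTF-8 round trip is a few lines of Python; this is just the C0 80 trick itself, not anything SQLite does:

```python
def mutf8_encode(s: str) -> bytes:
    """Encode like UTF-8, but write NUL as the two-byte sequence C0 80."""
    return s.encode("utf-8").replace(b"\x00", b"\xc0\x80")

def mutf8_decode(b: bytes) -> str:
    """Reverse the substitution; C0 80 never occurs in valid UTF-8."""
    return b.replace(b"\xc0\x80", b"\x00").decode("utf-8")

encoded = mutf8_encode("a\x00b")
print(encoded)  # b'a\xc0\x80b' -- no embedded zero byte
print(mutf8_decode(encoded))
```

The replacement is safe in both directions because 0x00 appears in valid UTF-8 only as the NUL character, and C0 80 never appears in valid UTF-8 at all.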

--
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Database speed comparison

2020-01-10 Thread Andy
I would like to compare these tests on my computer. My first attempt with
SQLite works, but it is a lot slower and I don't know why; maybe due to
inserting blobs? But the blobs are small, about 10 kB.

Fri, 10 Jan 2020 at 19:37 Cecil Westerhof  wrote:

> I ran across this page:
> https://sqlite.org/speed.html
>
> It is a 'bit' outdated. (It uses 2.7.6.) Where can I find the scripts? And
> would it be a lot of work to update them and rerun them regularly? If it is
> manageable, I would not mind providing regular updates.
>
> --
> Cecil Westerhof
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Keith Medcalf

On Friday, 10 January, 2020 11:44, Tim Streater  wrote:

>On 10 Jan 2020, at 18:03, Richard Hipp  wrote:

>> On 1/10/20, Dominique Devienne  wrote:
>>> There's no way at all, to know the length of a text column with
>>> embedded NULLs?

>> You can find the true length of a string in bytes from C-code using
>> the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
>> of a way to do that from SQL.

>But if I store UTF-8 in a TEXT column, surely I'm allowed to include
>NULLs in that? They are after all valid UTF-8 characters.

No, they are not.  The "NUL character" in Modified UTF-8 is the two-byte 
sequence 0xC0 0x80.  This is specifically so that 0x00 can be used as a string 
terminator.  Validly encoded UTF-8 encoded text stored in a C String (0x00 
terminated sequence of bytes) must not contain an embedded 0x00 byte since that 
byte terminates the sequence.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Tim Streater
On 10 Jan 2020, at 18:03, Richard Hipp  wrote:

> On 1/10/20, Dominique Devienne  wrote:
>>
>> There's no way at all, to know the length of a text column with embedded
>> NULLs?
>>
>
> You can find the true length of a string in bytes from C-code using
> the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
> of a way to do that from SQL.

But if I store UTF-8 in a TEXT column, surely I'm allowed to include NULLs in 
that? They are after all valid UTF-8 characters.



-- 
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Database speed comparison

2020-01-10 Thread Cecil Westerhof
I ran across this page:
https://sqlite.org/speed.html

It is a 'bit' outdated. (It uses 2.7.6.) Where can I find the scripts? And
would it be a lot of work to update them and rerun them regularly? If it is
manageable, I would not mind providing regular updates.

-- 
Cecil Westerhof
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Keith Medcalf

On Friday, 10 January, 2020 10:50, Dominique Devienne :

>On Fri, Jan 10, 2020 at 4:30 PM Richard Hipp  wrote:

>> length() on a BLOB should show the number of bytes in the BLOB.

>> length() on a string should show the number of *characters* (not
>> bytes) in the string up through but not including the first
>> zero-character.  It is possible to have additional content after the
>> first zero-character in a string, which length() will not tell you
>> about.

> There's no way at all, to know the length of a text column with embedded
> NULLs?

C-Strings cannot have embedded nulls.  The very definition of a C-String is "a 
sequence of non-zero characters followed by a zero character".  So while you 
can store and retrieve invalid C-Strings in the database (that is, using BIND and 
COLUMN_TEXT), "things" (meaning software) that expect a C-String to be a 
C-String will be confused by such improper usage of a C-String, and "things" 
that expect "text" fields to contain properly encoded C-Strings are likely to 
have brain-seizures.

You can, of course, cast the column as a blob (which IS allowed to have 
embedded nulls and DOES NOT have to have valid text encoding), and get the 
length of that bag-o-bytes (in bytes):

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279
sqlite> select typeof(id_local),typeof(lutfullstring),typeof(luthash) from 
copied;
integer|text|text
integer|text|text
integer|text|text
sqlite> select length(cast(lutfullstring as blob)) from copied;
194238
183050
193908

The first select is indeed selecting the ENTIRE string that has been asked for. 
 It is then using printf "%s" to print it because it is SUPPOSED TO BE a null 
terminated string.  It is doing exactly what it has been told to do.  
Similarly, the function LENGTH() on what is supposed to be a NULL terminated 
string returns the number of characters up to but not including the NULL 
terminator.

The real problem here is that a BLOB has been stored as if it were TEXT.  This 
is a failure of the application to properly sanitize its input.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Richard Hipp
On 1/10/20, Dominique Devienne  wrote:
>
> There's no way at all, to know the length of a text column with embedded
> NULLs?
>

You can find the true length of a string in bytes from C-code using
the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
of a way to do that from SQL.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Dominique Devienne
On Fri, Jan 10, 2020 at 4:30 PM Richard Hipp  wrote:
> length() on a BLOB should show the number of bytes in the BLOB.
>
> length() on a string should show the number of *characters* (not
> bytes) in the string up through but not including the first
> zero-character.  It is possible to have additional content after the
> first zero-character in a string, which length() will not tell you about.

Hi Richard,

There's no way at all, to know the length of a text column with embedded NULLs?

--DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sharing an SQLite database across containers

2020-01-10 Thread Simon Slavin


" Just exactly how is a file under constant modification safely shared across 
dozens of containers, all needing concurrent access? And you’re really going to 
read this file from completely different pieces of software?

[...] does a database on a Docker shared volume in multiple containers even 
work? Why yes it does. Check. Does this work at all under load? Wait, let me 
turn on WAL mode. Now it does. Check. Do processes block each other under load? 
No! Check. "

I don't know enough about containers to understand this article, but I'm sure 
some of you do.  Thought you might like to see it.
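The "turn on WAL mode" step from the quoted article is a single pragma. A minimal sketch in Python's sqlite3 module, using a throwaway temp file since WAL needs a real file rather than :memory::

```python
import os, sqlite3, tempfile

fd, path = tempfile.mkstemp(suffix=".sqlite")
os.close(fd)

con = sqlite3.connect(path)
# WAL lets one writer and many readers proceed concurrently,
# which is what made the shared-volume setup survive load.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

con.close()
for p in (path, path + "-wal", path + "-shm"):
    if os.path.exists(p):
        os.remove(p)
```

Note that WAL requires shared memory between the processes accessing the file, so whether it works across containers depends on how the volume is shared.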
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Simon Slavin
On 10 Jan 2020, at 3:11pm, Ryan Mack  wrote:

> OK, I think I've got a better sense now. Hex encoding the column shows that 
> there's actually a huge amount of data stored in there. For some reason 
> length() isn't revealing it even if the column type is blob. Dumping and 
> restoring the table is truncating the data.

Is there a chance that some part of the software thinks that 0x00 is a 
terminator for BLOBs, and some other part ignores the 0x00 and abides strictly 
by the length ?  I've seen that before.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Richard Hipp
On 1/10/20, Ryan Mack  wrote:
> OK, I think I've got a better sense now. Hex encoding the column shows
> that there's actually a huge amount of data stored in there. For some
> reason length() isn't revealing it even if the column type is blob.
> Dumping and restoring the table is truncating the data.

length() on a BLOB should show the number of bytes in the BLOB.

length() on a string should show the number of *characters* (not
bytes) in the string up through but not including the first
zero-character.  It is possible to have additional content after the
first zero-character in a string, which length() will not tell you
about.
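Both rules are easy to demonstrate from Python's sqlite3 module, together with the cast-to-blob workaround mentioned elsewhere in this thread (hypothetical one-column table):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(x TEXT)")
# Python's sqlite3 binds the full five bytes, embedded NUL included.
con.execute("INSERT INTO t VALUES (?)", ("ab\x00cd",))

text_len, blob_len = con.execute(
    "SELECT length(x), length(CAST(x AS BLOB)) FROM t").fetchone()
print(text_len)  # 2 -- characters up to the first zero byte
print(blob_len)  # 5 -- all bytes, like sqlite3_column_bytes()
```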

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Ryan Mack
OK, I think I've got a better sense now. Hex encoding the column shows
that there's actually a huge amount of data stored in there. For some
reason length() isn't revealing it even if the column type is blob.
Dumping and restoring the table is truncating the data.

On Fri, Jan 10, 2020 at 9:58 AM Ryan Mack  wrote:
>
> Accidentally sent my first reply directly; responding to the list. I'm
> now wondering if there's a lot of binary data hidden in each row.
> Trying to figure out how to determine that if length() doesn't show
> anything.
>
>
> Prior response:
>
> An excellent idea, thank you :-) .  The output is included below.
>
> I am continuing to do my own debugging in parallel. I am now leaning
> towards a new hypothesis that there may be a bug handling
> strange/malformed unicode that is resulting in a large amount of
> garbage data being stored into the record. I'm trying to figure out
> the exact conditions of copying/loading data into the table that
> causes the issue to propagate or disappear.
>
>
>  % ./sqlite3_analyzer ~/Downloads/200k-per-row.sqlite
> /** Disk-Space Utilization Report For 
> /Users/nadia/Downloads/200k-per-row.sqlite
>
> Page size in bytes 4096
> Pages in the whole file (measured) 143
> Pages in the whole file (calculated).. 143
> Pages that store data. 143  100.0%
> Pages on the freelist (per header) 0  0.0%
> Pages on the freelist (calculated) 0  0.0%
> Pages of auto-vacuum overhead. 0  0.0%
> Number of tables in the database.. 2
> Number of indices. 0
> Number of defined indices. 0
> Number of implied indices. 0
> Size of the file in bytes. 585728
> Bytes of user payload stored.. 571322  97.5%
>
> *** Page counts for all tables with their indices 
> *
>
> COPIED 142 99.30%
> SQLITE_MASTER. 1  0.70%
>
> *** Page counts for all tables and indices separately 
> *
>
> COPIED 142 99.30%
> SQLITE_MASTER. 1  0.70%
>
> *** All tables 
> 
>
> Percentage of total database.. 100.0%
> Number of entries. 4
> Bytes of storage consumed. 585728
> Bytes of payload.. 571412  97.6%
> Bytes of metadata. 1284 0.22%
> Average payload per entry. 142853.00
> Average unused bytes per entry 3393.00
> Average metadata per entry 321.00
> Average fanout 3.00
> Maximum payload per entry. 194280
> Entries that use overflow. 3   75.0%
> Index pages used.. 1
> Primary pages used 4
> Overflow pages used... 138
> Total pages used.. 143
> Unused bytes on index pages... 4070  99.37%
> Unused bytes on primary pages. 9502  58.0%
> Unused bytes on overflow pages 0  0.0%
> Unused bytes on all pages. 13572  2.3%
>
> *** Table COPIED 
> **
>
> Percentage of total database..  99.30%
> Number of entries. 3
> Bytes of storage consumed. 581632
> Bytes of payload.. 571322  98.2%
> Bytes of metadata. 1172 0.20%
> B-tree depth.. 2
> Average payload per entry. 190440.67
> Average unused bytes per entry 3226.00
> Average metadata per entry 390.67
> Average fanout 3.00
> Non-sequential pages.. 0  0.0%
> Maximum payload per entry. 194280
> Entries that use overflow. 3  100.0%
> Index pages used.. 1
> Primary pages used 3
> Overflow pages used... 138
> Total pages used.. 142
> Unused bytes on index pages... 4070  99.37%
> Unused bytes on primary pages. 5608  45.6%

Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Ryan Mack
Accidentally sent my first reply directly; responding to the list. I'm
now wondering if there's a lot of binary data hidden in each row.
Trying to figure out how to determine that if length() doesn't show
anything.


Prior response:

An excellent idea, thank you :-) .  The output is included below.

I am continuing to do my own debugging in parallel. I am now leaning
towards a new hypothesis that there may be a bug handling
strange/malformed unicode that is resulting in a large amount of
garbage data being stored into the record. I'm trying to figure out
the exact conditions of copying/loading data into the table that
causes the issue to propagate or disappear.


 % ./sqlite3_analyzer ~/Downloads/200k-per-row.sqlite
/** Disk-Space Utilization Report For /Users/nadia/Downloads/200k-per-row.sqlite

Page size in bytes 4096
Pages in the whole file (measured) 143
Pages in the whole file (calculated).. 143
Pages that store data. 143  100.0%
Pages on the freelist (per header) 0  0.0%
Pages on the freelist (calculated) 0  0.0%
Pages of auto-vacuum overhead. 0  0.0%
Number of tables in the database.. 2
Number of indices. 0
Number of defined indices. 0
Number of implied indices. 0
Size of the file in bytes. 585728
Bytes of user payload stored.. 571322  97.5%

*** Page counts for all tables with their indices *

COPIED 142 99.30%
SQLITE_MASTER. 1  0.70%

*** Page counts for all tables and indices separately *

COPIED 142 99.30%
SQLITE_MASTER. 1  0.70%

*** All tables 

Percentage of total database.. 100.0%
Number of entries. 4
Bytes of storage consumed. 585728
Bytes of payload.. 571412  97.6%
Bytes of metadata. 1284 0.22%
Average payload per entry. 142853.00
Average unused bytes per entry 3393.00
Average metadata per entry 321.00
Average fanout 3.00
Maximum payload per entry. 194280
Entries that use overflow. 3   75.0%
Index pages used.. 1
Primary pages used 4
Overflow pages used... 138
Total pages used.. 143
Unused bytes on index pages... 4070   99.37%
Unused bytes on primary pages. 9502   58.0%
Unused bytes on overflow pages 0    0.0%
Unused bytes on all pages. 13572  2.3%

*** Table COPIED **

Percentage of total database..  99.30%
Number of entries. 3
Bytes of storage consumed. 581632
Bytes of payload.. 571322  98.2%
Bytes of metadata. 1172 0.20%
B-tree depth.. 2
Average payload per entry. 190440.67
Average unused bytes per entry 3226.00
Average metadata per entry 390.67
Average fanout 3.00
Non-sequential pages.. 0    0.0%
Maximum payload per entry. 194280
Entries that use overflow. 3  100.0%
Index pages used.. 1
Primary pages used 3
Overflow pages used... 138
Total pages used.. 142
Unused bytes on index pages... 4070   99.37%
Unused bytes on primary pages. 5608   45.6%
Unused bytes on overflow pages 0    0.0%
Unused bytes on all pages. 9678 1.7%

*** Table SQLITE_MASTER ***

Percentage of total database..   0.70%
Number of entries. 1
Bytes of storage consumed. 4096
Bytes of payload.. 90   2.2%
Bytes of 

Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Richard Hipp
On 1/10/20, Ryan Mack  wrote:
>
> I'm trying to understand unexplained table bloat

The sqlite3_analyzer command-line utility program (available  in the
"Precompiled binaries" bundles on the https://sqlite.org/download.html
page) is designed to help understand these kinds of problems.  Please
run that utility on the database and perhaps post the output here.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Ryan Mack
Thank you, I was unaware of the integrity_check pragma. It returns OK
for the database in question.


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread Simon Slavin

On 10 Jan 2020, at 2:06pm, David Raymond  wrote:

> Well something's weird anyway. When I open it with the command line tool it 
> queries it just fine

Did you run an integrity_check on the database ?  It looks from your posts as 
if it's corrupt.
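
For reference, the check is a one-liner from Python's built-in sqlite3
module as well. A minimal sketch (":memory:" stands in for the real
database path, which is an assumption here):

```python
import sqlite3

# Open the database (":memory:" as a stand-in for the real file path).
con = sqlite3.connect(":memory:")

# PRAGMA integrity_check walks the whole file and returns the single
# row ('ok',) when no corruption is found, or rows describing problems.
result = con.execute("PRAGMA integrity_check").fetchone()[0]
print(result)  # prints "ok" for a healthy database
con.close()
```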


Re: [sqlite] Unexplained table bloat

2020-01-10 Thread David Raymond
Well something's weird anyway. When I open it with the command line tool it 
queries it just fine. I tried to open it in Python to check all the characters 
in the strings and got this:

Traceback (most recent call last):
  File "...\Testing4.py", line 8, in 
cur.execute("select * from copied;")
sqlite3.OperationalError: Could not decode to UTF-8 column 'LUTFullString' with 
text ','


Which I suppose is just as likely to be my own problem, though.
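
One workaround for the decode error, for anyone hitting the same thing:
Python's sqlite3 module lets you swap the text factory so TEXT columns
come back as raw bytes instead of being decoded as UTF-8. A sketch with
made-up data (not the actual file from the thread):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE copied (id_local INT, LUTFullString, LUTHash)")
# Store a TEXT value containing an invalid UTF-8 byte (0xFF).
con.execute("INSERT INTO copied VALUES (1, CAST(x'2CFF' AS TEXT), 'hash')")

# With the default text_factory=str, the SELECT would raise
# sqlite3.OperationalError: Could not decode to UTF-8 column ...
con.text_factory = bytes
row = con.execute("SELECT LUTFullString FROM copied").fetchone()
print(row[0])  # b',\xff' -- the raw, undecoded bytes
con.close()
```

With the bytes in hand you can inspect exactly what is stored in the
suspect column.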


-Original Message-
From: sqlite-users  On Behalf Of 
Ryan Mack
Sent: Friday, January 10, 2020 8:48 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Unexplained table bloat

Hi list,

I'm trying to understand unexplained table bloat I found in what should be
a very small table in an old database file. If you dump/restore the
database, the problem goes away. If you duplicate the table, the problem
propagates. Schema:

CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

Test Data:
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

In my test database file, each of these 3 rows is allocating about 40 4k
overflow pages for a total database size of about 500k. The full database
has about 4MB of actual data which takes up over 500MB on disk. If you want
to see/reproduce the problem you'll need my test database file which I've
uploaded here: https://mackman.net/200k-per-row.sqlite.zip (500kb
download). I don't know why it doesn't compress better; those extra
overflow pages must be getting filled with random garbage.

My uninformed guess is there was a bug in the version of sqlite used at
database creation time that computed an incorrect overflow threshold and is
storing each byte of the row to its own page. Since the problem goes away
with a dump/restore, I'm considering releasing a script to do that and
mitigate the problem for affected users. Before doing that I would like to
understand the problem better.

Thanks for reading, Ryan

PS: Here's some output from my debug session showing that the 123 bytes of
data are occupying 582k of space on disk in a freshly created table.

% sqlite3 200k-per-row.sqlite
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> .tables
copied

sqlite> .schema copied
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

sqlite> select sum(length(id_local) + length(lutfullstring) +
length(luthash)) from copied;
123

sqlite> create table copied2 as select * from copied;
sqlite> SELECT name, SUM("pgsize") as sz FROM dbstat group by name;
copied|581632
copied2|581632
sqlite_master|4096
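
The same measurement -- payload size from length() versus bytes actually
allocated on disk -- can be scripted. A rough sketch against a throwaway
database (it will not reproduce the bloat, which appears specific to the
original file, but it shows the comparison):

```python
import os
import sqlite3
import tempfile

# Build a fresh table with the same shape and test data as the thread.
path = os.path.join(tempfile.mkdtemp(), "check.sqlite")
con = sqlite3.connect(path)
con.execute("CREATE TABLE copied (id_local INT, LUTFullString, LUTHash)")
rows = [
    (85960605, ",", "0DE19F8AA100D2962FF22C60B3FA8940"),
    (85960465, ",", "A44615408E8B3E48A684C60CA2967031"),
    (85960451, ",", "33C8804354984071A740788AD308B279"),
]
con.executemany("INSERT INTO copied VALUES (?, ?, ?)", rows)
con.commit()

# Payload as measured with length(), matching the query from the thread.
payload = con.execute(
    "SELECT sum(length(id_local) + length(LUTFullString)"
    " + length(LUTHash)) FROM copied"
).fetchone()[0]
con.close()

# A healthy file holding 123 bytes of payload stays a few pages long;
# the problem file spends ~582k on the same rows.
filesize = os.path.getsize(path)
print(payload, filesize)
```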


[sqlite] Unexplained table bloat

2020-01-10 Thread Ryan Mack
Hi list,

I'm trying to understand unexplained table bloat I found in what should be
a very small table in an old database file. If you dump/restore the
database, the problem goes away. If you duplicate the table, the problem
propagates. Schema:

CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

Test Data:
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

In my test database file, each of these 3 rows is allocating about 40 4k
overflow pages for a total database size of about 500k. The full database
has about 4MB of actual data which takes up over 500MB on disk. If you want
to see/reproduce the problem you'll need my test database file which I've
uploaded here: https://mackman.net/200k-per-row.sqlite.zip (500kb
download). I don't know why it doesn't compress better; those extra
overflow pages must be getting filled with random garbage.

My uninformed guess is there was a bug in the version of sqlite used at
database creation time that computed an incorrect overflow threshold and is
storing each byte of the row to its own page. Since the problem goes away
with a dump/restore, I'm considering releasing a script to do that and
mitigate the problem for affected users. Before doing that I would like to
understand the problem better.

Thanks for reading, Ryan

PS: Here's some output from my debug session showing that the 123 bytes of
data are occupying 582k of space on disk in a freshly created table.

% sqlite3 200k-per-row.sqlite
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> .tables
copied

sqlite> .schema copied
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

sqlite> select sum(length(id_local) + length(lutfullstring) +
length(luthash)) from copied;
123

sqlite> create table copied2 as select * from copied;
sqlite> SELECT name, SUM("pgsize") as sz FROM dbstat group by name;
copied|581632
copied2|581632
sqlite_master|4096


Re: [sqlite] How to get rowid for select query?

2020-01-10 Thread Andy
Previous (broken):
  rescells.Clear;
  rc := sqlite3_get_table(db, PChar(selectquery), @selres, @nRow, @nColumn,
@zErrmsg);
  Result := rc = SQLITE_OK;
  for i := 0 to nRow * nColumn - 1 do
  begin
rescells.Add(selres[i]);
  end;
  if selres <> nil then
sqlite3_free_table(selres);

Current (working):
  rescells.Clear;
  rc := sqlite3_get_table(db, PChar(selectquery), @selres, @nRow, @nColumn,
@zErrmsg);
  Result := rc = SQLITE_OK;
  for i := 0 to (nRow + 1) * nColumn - 1 do  // nRow+1: the first row of the result holds the column names
  begin
rescells.Add(selres[i]);
  end;
  if selres <> nil then
sqlite3_free_table(selres);
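
For anyone puzzled by the off-by-one: sqlite3_get_table returns a flat
(nrow+1) x ncolumn array of strings whose first row holds the column
names, so the value for data row r, column c lives at index
(r+1)*ncolumn + c. A small Python model of that layout (the column names
and values below are made up for illustration):

```python
# Model of the flat array sqlite3_get_table hands back:
# one header row of column names, then nrow rows of values.
ncolumn = 3
header = ["rowid", "id", "Bytes"]
data = [["1", "5", "1024"], ["2", "6", "2048"]]
nrow = len(data)

# Flattened the way the C API lays it out: header first, then row data.
flat = header + [cell for row in data for cell in row]
assert len(flat) == (nrow + 1) * ncolumn

def cell(r, c):
    """Value at data row r, column c (r counts from 0, skipping the header)."""
    return flat[(r + 1) * ncolumn + c]

print(cell(0, 2))  # prints "1024"
```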



Fri, 10 Jan 2020 at 14:01, Clemens Ladisch wrote:

> Andy wrote:
> > I tried "select rowid, field1,field2 from table" but the first value was
> > not a numeric rowid but the literal string "rowid".
>
> Please show the actual code (not SQL, but your program) that you're
> executing.
>
>
> Regards,
> Clemens


Re: [sqlite] How to get rowid for select query?

2020-01-10 Thread Andy
OK, the first row is always the header; the rows after it are the values of the query.

Fri, 10 Jan 2020 at 13:56, Andy wrote:

> I don't know why sqlite3_get_table
> for 'SELECT rowid, id, Bytes, BytesCompr, flags FROM Articles where id=5
> limit 1'
> gives me the column names:
> rowid id Bytes BytesCompr flags
> instead of the values of those columns
>
> Fri, 10 Jan 2020 at 13:36, Andy wrote:
>
>> I successfully inserted a blob, using sqlite3_last_insert_rowid after
>> calling sqlite3_step.
>> But how do I retrieve the blob?
>> I tried "select rowid, field1,field2 from table" but the first value was
>> not a numeric rowid but the literal string "rowid".
>> I can get the blob if I know its rowid.
>>
>


Re: [sqlite] How to get rowid for select query?

2020-01-10 Thread Clemens Ladisch
Andy wrote:
> I tried "select rowid, field1,field2 from table" but the first value was
> not a numeric rowid but the literal string "rowid".

Please show the actual code (not SQL, but your program) that you're executing.


Regards,
Clemens


Re: [sqlite] How to get rowid for select query?

2020-01-10 Thread Andy
I don't know why sqlite3_get_table
for 'SELECT rowid, id, Bytes, BytesCompr, flags FROM Articles where id=5
limit 1'
gives me the column names:
rowid id Bytes BytesCompr flags
instead of the values of those columns

Fri, 10 Jan 2020 at 13:36, Andy wrote:

> I successfully inserted a blob, using sqlite3_last_insert_rowid after
> calling sqlite3_step.
> But how do I retrieve the blob?
> I tried "select rowid, field1,field2 from table" but the first value was
> not a numeric rowid but the literal string "rowid".
> I can get the blob if I know its rowid.
>


[sqlite] How to get rowid for select query?

2020-01-10 Thread Andy
I successfully inserted a blob, using sqlite3_last_insert_rowid after
calling sqlite3_step.
But how do I retrieve the blob?
I tried "select rowid, field1,field2 from table" but the first value was
not a numeric rowid but the literal string "rowid".
I can get the blob if I know its rowid.
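
The same flow -- insert a blob, grab the last insert rowid, then read the
blob back by rowid -- looks like this in Python's sqlite3 module. A minimal
sketch; the table and column names are made up, not from the thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE articles (payload BLOB)")

# Insert the blob and remember the rowid SQLite assigned to it.
cur = con.execute("INSERT INTO articles (payload) VALUES (?)",
                  (b"\x00\x01binary",))
rid = cur.lastrowid  # same idea as sqlite3_last_insert_rowid()

# Select rowid alongside the data; when the statement is actually
# stepped, rowid comes back as an integer, not the string "rowid".
row = con.execute("SELECT rowid, payload FROM articles WHERE rowid = ?",
                  (rid,)).fetchone()
print(row)  # prints (1, b'\x00\x01binary')
con.close()
```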