Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-14 Thread Detlef Golze
I want to second that. This leads to a situation where data is accepted by the 
database but there is no way to read it back, or more precisely, I get wrong 
(i.e. different) data back. I didn't check the suggested patch, but I don't 
believe it will work in all cases. I would rather prefer rejecting such 
strings, or implicitly converting them to a BLOB, which at least provides a way 
to get the data back.

Thanks,
Detlef.

-Original Message-
From: sqlite-users  On Behalf Of Dennis Snell
Sent: Monday, January 13, 2020 21:57
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Feature request: more robust handling of invalid UTF-16 
data

I’d like to raise this issue again and give my support for what Maks Verver 
recommended in 
https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg110107.html


Independently I came to this bug while working on an issue in Simplenote’s 
Android app where our data was being corrupted when saved to sqlite inside the 
Android SDK. We received some invalid UTF-16 sequences, and instead of rejecting 
them or decoding them properly, sqlite further mangles them and introduces 
more corruption.


Example:
We have a JSON document like this which we store in a table.


    {"content": "\ud83c\udd70\ud83c(null)\udd71","tags":[]}


The JSON is well-formed but the sequence of UTF-16 code points is invalid. We 
have fixed our side of the equation to prevent creating this content, but from 
time to time we still receive the invalid sequence from older client 
libraries.


When sqlite reads this data two types of further corruption occur: reading 
beyond a code unit subsequence; and conflating high and low surrogates.


Reading beyond a code unit subsequence:


When the `TERM` was introduced[1] and updated[2] it appears to have been 
designed for the case where a string ends mid-surrogate, but it does not 
attempt to address unpaired surrogates in the middle of an input text. In our 
case the `READ_UTF16BE` macro accepts the second `\ud83c` code unit and then 
consumes the following `\u0028`, which is the separate and well-formed “(“. In 
turn this produces the more corrupted value `\ud83c\udc28`, code point U+1F028, 
plus “null)” without the leading “(“.


Conflating high and low surrogates:


The `READ_UTF16__` macros both start processing a surrogate pair based only on 
the input code unit lying in the surrogate range `0xD800 <= c < 0xE000`. 
Because of this they will pick up on unpaired low surrogates, consume the next 
character, and then create a more corrupted Unicode string.


In our case, once we reach the `\udd71` the macro consumes the following quote, 
which in the JSON document closes the string, and puts them together as 
`\udd71\u0022` producing the invalid code point U+6C422. Moreover, because it 
consumed the string-ending quote it also corrupted the entire JSON document, as 
the new output resembles the following:


    {"content": "\ud83c\udd70\ud83c\udc28ull)\ud971\udc22,"tags":[]}


That is, we write this invalid Unicode sequence but valid JSON document into 
sqlite and read back an invalid Unicode sequence _and_ invalid JSON (see the 
missing quote before “tags”).
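

For reference, the corrupted code points quoted above fall directly out of the 
pair-combining arithmetic used by the UTF-16 readers. Below is a minimal sketch 
of that arithmetic (the formula is paraphrased from the sqlite3.c macros and 
may differ slightly between versions); it reproduces U+1F028 and U+6C422 from 
the code units in this report:


    #include <stdio.h>

    /* Combine two UTF-16 code units the way the READ_UTF16 macros do once they
    ** have decided that `hi` starts a surrogate pair. Note that no validity
    ** check is made on `lo`, which is exactly the problem described above. */
    static unsigned int combine(unsigned int hi, unsigned int lo){
      return (lo & 0x03FF) + ((hi & 0x003F) << 10) + (((hi & 0x03C0) + 0x0040) << 10);
    }

    int main(void){
      printf("U+%X\n", combine(0xD83C, 0x0028));  /* unpaired \ud83c + "(": prints U+1F028 */
      printf("U+%X\n", combine(0xDD71, 0x0022));  /* unpaired \udd71 + quote: prints U+6C422 */
      return 0;
    }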


Supporting Unicode spec:


The Unicode specification[3], sections 3.2 and 3.9, speaks to this situation 
and provides a specific comparable example:


    When a process interprets a code unit sequence which purports to be in a Unicode
    character encoding form, it shall treat ill-formed code unit sequences as an error
    condition and shall not interpret such sequences as characters.


    Furthermore, such a process must not treat any adjacent well-formed code unit
    sequences as being part of those ill-formed code unit sequences.


    For example, with the input UTF-8 code unit sequence <C2 41 42>, such a UTF-8
    conversion process must not return <U+FFFD> or <U+FFFD, U+0042>, because
    either of those outputs would be the result of interpreting a well-formed
    subsequence as being part of the ill-formed subsequence. The expected return
    value for such a process would instead be <U+FFFD, U+0041, U+0042>.


Supporting Maks’ suggestion to use the replacement character on error, section 
23.8[4] provides this guidance:


    It [U+FFFD] can be substituted for any “unknown” character in another encoding
    that cannot be mapped in terms of known Unicode characters. It can also be used
    as one means of indicating a conversion error, when encountering an ill-formed
    sequence in a conversion between Unicode encoding forms.


Patching:


The `READ_UTF16__` macros should therefore not only do what Maks proposed, 
which is to verify that the character following a surrogate half is also a 
surrogate half, but also verify that we don’t start interpreting a surrogate 
sequence when encountering an unpaired low surrogate. I propose this change instead:


    #define READ_UTF16LE(zIn, TERM, c){
        c = (*zIn++);
        c += ((*zIn++)<<8);
        if( c>=0xDC00 && c<=0xE000 && TERM ) {
            c = 
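

The proposed macro is cut off above, so here is a minimal sketch of the complete 
idea for the little-endian reader (not the author's verbatim patch): an unpaired 
low surrogate decodes to U+FFFD, a high surrogate is only combined when the next 
code unit really is a low surrogate, and otherwise U+FFFD is produced without 
consuming the adjacent well-formed unit. The pair-combining expression is the one 
used by the existing sqlite3.c macros, and TERM stands for the caller's "more 
input is available" guard:


    #define READ_UTF16LE(zIn, TERM, c){                                      \
      c = (*zIn++);                                                          \
      c += ((*zIn++)<<8);                                                    \
      if( c>=0xDC00 && c<0xE000 ){                                           \
        c = 0xFFFD;                          /* unpaired low surrogate   */  \
      }else if( c>=0xD800 && c<0xDC00 ){                                     \
        int c2 = (TERM) ? (zIn[0] + (zIn[1]<<8)) : 0;   /* peek only     */  \
        if( c2>=0xDC00 && c2<0xE000 ){                                       \
          zIn += 2;                          /* consume the low half     */  \
          c = (c2&0x03FF) + ((c&0x003F)<<10) + (((c&0x03C0)+0x0040)<<10);    \
        }else{                                                               \
          c = 0xFFFD;                        /* unpaired high surrogate  */  \
        }                                                                    \
      }                                                                      \
    }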

Re: [sqlite] sqlite3_get_autocommit() changes value when sqlite3_step() returns SQLITE_BUSY

2018-09-13 Thread Detlef Golze
This also fixes the problem I mentioned here:
https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg108340.html

(BEGIN IMMEDIATE fails with "cannot start a transaction within a transaction 
(1)" without nested transactions)

Thanks,
Detlef.

-Original Message-
From: sqlite-users  On Behalf Of Dan Kennedy
Sent: Thursday, September 13, 2018 17:30
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] sqlite3_get_autocommit() changes value when 
sqlite3_step() returns SQLITE_BUSY

On 09/12/2018 08:56 PM, Nic Ramage wrote:
> Hi,
>
> I think I have found a regression/bug in Sqlite with respect to
> the sqlite3_get_autocommit() function.



Hi,

Thanks for putting in the work to create the bug demo. Bisecting the 
latest SQLite commits using it shows that this bug was fixed here:

   https://www.sqlite.org/src/info/e6108047cb136119

Fix will appear in 3.25.0 (currently undergoing release testing).

Dan.





>
> Sometimes, when multiple connections are used in multiple threads,
> sqlite3_get_autocommit()  reports that the connection has a transaction
> open (i.e. not in auto commit mode), even though sqlite3_step() returns
> SQLITE_BUSY.  I have verified that sqlite3_get_autocommit()  reports that
> there is no transaction open before sqlite3_step() is called.  The problem
> seems to be specific to the WAL journal mode, as I haven't been able to
> reproduce it with other journal modes.
>
> Initially, the problem only showed up under Linux, but I have now been able
> to reproduce the problem on Windows too.  For me it shows up quicker under
> linux, but that may just be because of the difference in environments.  I
> have attached a single-file C++11 application that consistently reproduces
> the problem for me.
>
> I was also able to bisect the introduction of the problem to sqlite
> v3.17.0.  v3.16.2 does not have the problem.  The latest 3.24.0 release
> does.  I am not an expert on the sqlite source code, but during my
> debugging, I was able to determine that the auto commit flag was being
> cleared in "case OP_AutoCommit", which returns without error.  Obviously,
> something else must be causing the  SQLITE_BUSY afterwards, but is not
> resetting the auto commit flag.
>
> In my code, I have worked around the problem by issuing a "rollback"
> whenever I detect the problem described.
>
> I hope this is enough information to be able to solve the bug.  If,
> however, this behaviour turns out not to be a bug, but something that is
> expected to happen occasionally, then the "rollback" provides a reasonable
> way to mitigate the issue.
>
> Regards
> Nic
>
>
>
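
A sketch of the workaround Nic describes, for builds that do not yet contain 
the fix (the statement handling is illustrative, not Nic's actual code):

    #include <sqlite3.h>

    /* After SQLITE_BUSY, check whether the failed step left an implicit
    ** transaction open on this connection and roll it back so the connection
    ** returns to autocommit mode. */
    static int step_with_busy_recovery(sqlite3 *db, sqlite3_stmt *pStmt){
      int rc = sqlite3_step(pStmt);
      if( rc==SQLITE_BUSY && sqlite3_get_autocommit(db)==0 ){
        sqlite3_exec(db, "ROLLBACK", 0, 0, 0);
      }
      return rc;
    }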



[sqlite] BEGIN IMMEDIATE fails with "cannot start a transaction within a transaction (1)" without nested transsactions

2018-01-22 Thread Detlef Golze
Hello,

we have a situation where "BEGIN IMMEDIATE" unexpectedly returns SQLITE_ERROR 
with message "cannot start a transaction within a transaction (1)".

We have multiple threads running which repeatedly do the following:


BEGIN IMMEDIATE
INSERT ...
INSERT ...
COMMIT


This means each thread uses its own connection, so no synchronization is 
required or performed.

Sometimes "BEGIN IMMEDIATE" fails with SQLITE_ERROR instead of the expected 
SQLITE_BUSY.
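
For illustration, the per-thread pattern looks roughly like this (the table 
name, statements and retry policy are placeholders, not our actual code):

    #include <sqlite3.h>

    /* Each worker thread owns its own sqlite3* handle; handles are not shared. */
    static int insert_batch(sqlite3 *db){
      int rc;
      do{
        rc = sqlite3_exec(db, "BEGIN IMMEDIATE", 0, 0, 0);
      }while( rc==SQLITE_BUSY );      /* expected contention; a real loop would back off */
      if( rc!=SQLITE_OK ) return rc;  /* unexpected, e.g. the SQLITE_ERROR described here */
      rc = sqlite3_exec(db, "INSERT INTO t VALUES(1)", 0, 0, 0);
      if( rc==SQLITE_OK ) rc = sqlite3_exec(db, "INSERT INTO t VALUES(2)", 0, 0, 0);
      if( rc==SQLITE_OK ){
        rc = sqlite3_exec(db, "COMMIT", 0, 0, 0);
      }else{
        sqlite3_exec(db, "ROLLBACK", 0, 0, 0);
      }
      return rc;
    }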

The behavior has been present since 3.17, up to the latest 3.22. We did not see 
this with earlier versions.

We are running on Windows x64 and SQLite is built from amalgamation with 
SQLITE_THREADSAFE set to 2.

I am posting to get a confirmation that my understanding is correct and that 
"BEGIN IMMEDIATE" should not return this error in this situation. In that case 
we can also try to narrow down the change in 3.17 which causes the failure.

Thank you,
Detlef.




Re: [sqlite] Bug? Incorrect use of SQLITE_DEFAULT_CACHE_SIZE

2017-02-20 Thread Detlef Golze
Hello Joe,

thank you for your attention, but your change does not fully address my issue.

This code still assumes that SQLITE_WIN32_CACHE_SIZE is measured in pages:

#ifndef SQLITE_WIN32_HEAP_INIT_SIZE
#  define SQLITE_WIN32_HEAP_INIT_SIZE   ((SQLITE_WIN32_CACHE_SIZE) * \
 (SQLITE_DEFAULT_PAGE_SIZE) + \
 (SQLITE_WIN32_HEAP_INIT_EXTRA))
#endif

But it is not:

#ifndef SQLITE_WIN32_CACHE_SIZE
#  if SQLITE_DEFAULT_CACHE_SIZE>=0
#define SQLITE_WIN32_CACHE_SIZE (SQLITE_DEFAULT_CACHE_SIZE)
#  else
#define SQLITE_WIN32_CACHE_SIZE (-(SQLITE_DEFAULT_CACHE_SIZE))
#  endif
#endif

If we go to the else branch, SQLITE_WIN32_CACHE_SIZE will be in KByte rather 
than pages. Your change may solve a potential overflow, but not the wrong 
calculation of SQLITE_WIN32_HEAP_INIT_SIZE when SQLITE_DEFAULT_CACHE_SIZE is 
specified as a negative value (KByte).
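
To make the mismatch concrete, assume the 3.12.0 defaults quoted below, 
SQLITE_DEFAULT_CACHE_SIZE = -2000 and SQLITE_DEFAULT_PAGE_SIZE = 4096:

    /* Intended meaning of SQLITE_DEFAULT_CACHE_SIZE = -2000: 2000 KByte of cache,
    ** i.e. 2000*1024 / 4096 = 500 pages.
    **
    ** What the Win32 defaults compute instead:
    **   SQLITE_WIN32_CACHE_SIZE     = -(-2000) = 2000               (taken as pages)
    **   SQLITE_WIN32_HEAP_INIT_SIZE = 2000 * 4096 + SQLITE_WIN32_HEAP_INIT_EXTRA
    **                               = 8192000    + SQLITE_WIN32_HEAP_INIT_EXTRA bytes
    **
    ** So the initial heap is sized for 2000 pages (about 8 MB) rather than for
    ** the roughly 2 MB of cache that the negative value actually denotes.
    */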

Thanks,
Detlef.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Joe Mistachkin
Sent: Monday, February 20, 2017 8:25 PM
To: 'SQLite mailing list'
Subject: Re: [sqlite] Bug? Incorrect use of SQLITE_DEFAULT_CACHE_SIZE


Detlef Golze wrote:
>
> Changes carried forward from version 3.12.0 (2016-03-29):
> 

Ah, right.  I was reading the comments in the source code.

I've checked-in some changes that should prevent integer
overflows when very large values are used for the
SQLITE_DEFAULT_CACHE_SIZE and/or SQLITE_DEFAULT_PAGE_SIZE
defines.

--
Joe Mistachkin @ https://urn.to/r/mistachkin



Re: [sqlite] Bug? Incorrect use of SQLITE_DEFAULT_CACHE_SIZE

2017-02-20 Thread Detlef Golze
> The SQLITE_DEFAULT_CACHE_SIZE value is always measured in pages.

http://sqlite.org/compile.html#default_cache_size
http://sqlite.org/pragma.html#pragma_cache_size

My interpretation is that the option to provide a negative value was introduced 
to allow specifying a cache size limit independent of the page size used. And 
it has been used that way:

Changes carried forward from version 3.12.0 (2016-03-29):

Potentially Disruptive Change:
The SQLITE_DEFAULT_PAGE_SIZE is increased from 1024 to 4096. The 
SQLITE_DEFAULT_CACHE_SIZE is changed from 2000 to -2000 so the same amount of 
cache memory is used by default. See the application note on the version 3.12.0 
page size change for further information.


Am I missing something?

Best regards,
Detlef.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Joe Mistachkin
Sent: Monday, February 20, 2017 10:02 AM
To: 'SQLite mailing list'
Subject: Re: [sqlite] Bug? Incorrect use of SQLITE_DEFAULT_CACHE_SIZE


Detlef Golze wrote:
>
> SQLITE_DEFAULT_CACHE_SIZE is defined in pages if positive or
> in KByte if negative.
>

The SQLITE_DEFAULT_CACHE_SIZE value is always measured in pages.

>
> I came here because I got an integer overflow in the definition
> of SQLITE_WIN32_HEAP_INIT_SIZE.
>

What values are you using for the SQLITE_DEFAULT_CACHE_SIZE and
SQLITE_DEFAULT_PAGE_SIZE defines?

--
Joe Mistachkin @ https://urn.to/r/mistachkin



[sqlite] Bug? Incorrect use of SQLITE_DEFAULT_CACHE_SIZE

2017-02-20 Thread Detlef Golze
Hi,

the following looks like an incorrect use of SQLITE_DEFAULT_CACHE_SIZE 
(amalgamation 3.17 starting at line 37963):

/*
 * This is cache size used in the calculation of the initial size of the
 * Win32-specific heap.  It cannot be negative.
 */
#ifndef SQLITE_WIN32_CACHE_SIZE
#  if SQLITE_DEFAULT_CACHE_SIZE>=0
#define SQLITE_WIN32_CACHE_SIZE (SQLITE_DEFAULT_CACHE_SIZE)
#  else
#define SQLITE_WIN32_CACHE_SIZE (-(SQLITE_DEFAULT_CACHE_SIZE))
#  endif
#endif

/*
 * The initial size of the Win32-specific heap.  This value may be zero.
 */
#ifndef SQLITE_WIN32_HEAP_INIT_SIZE
#  define SQLITE_WIN32_HEAP_INIT_SIZE ((SQLITE_WIN32_CACHE_SIZE) * \
   (SQLITE_DEFAULT_PAGE_SIZE) + 4194304)
#endif

SQLITE_DEFAULT_CACHE_SIZE is defined in pages if positive or in KByte if 
negative. I came here because I got an integer overflow in the definition of 
SQLITE_WIN32_HEAP_INIT_SIZE. I did not look for further consequences.
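
For illustration (not the exact values from my build), a cache size large 
enough to trip the overflow:

    /* Suppose SQLITE_DEFAULT_CACHE_SIZE = -1048576 (meant as 1 GByte of cache,
    ** expressed in KByte) and SQLITE_DEFAULT_PAGE_SIZE = 4096.  Then:
    **
    **   SQLITE_WIN32_CACHE_SIZE     = 1048576                    (taken as pages)
    **   SQLITE_WIN32_HEAP_INIT_SIZE = 1048576 * 4096 + 4194304
    **                               = 4294967296    + 4194304
    **
    ** which no longer fits in a signed 32-bit int, so the compile-time
    ** expression overflows.
    */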

Thank you,
Detlef.



Re: [sqlite] sqlite3_changes() using 64 bit counters?

2016-12-09 Thread Detlef Golze
The maximal number of rows is 1e+13.

I do have tables with more than 2^32 rows. Of course one should avoid 
transactions that large, but it is possible.

Regards,
dg.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Donald Griggs
Sent: Thursday, December 08, 2016 7:00 PM
To: SQLite mailing list
Subject: Re: [sqlite] sqlite3_changes() using 64 bit counters?

Just curious -- some of your tables approach 2 billion (2 milliard) rows?
Or, more precisely, you operate on more than 2 billion rows in a single
transaction?


[sqlite] sqlite3_changes() using 64 bit counters?

2016-12-08 Thread Detlef Golze
Hi,

is there a plan for sqlite3_changes() and sqlite3_total_changes() to return 
64-bit values? SQLite supports far more rows than a signed 32-bit counter can 
store, and I would feel more comfortable if I did not need to check for an 
overflow.

Thank you,
Detlef.



[sqlite] Partial Index and Query Planner

2016-08-05 Thread Detlef Golze
Hi,

suppose I have a partial index like that:

CREATE INDEX MyIndex1 ON MyTable ( Value1 ) WHERE Value1<>0;

As documented, the query planner only uses the index if the SELECT contains 
that exact same condition.

According to .eqp with 3.12.1 the following statement indeed uses the index:

SELECT Value1 FROM MyTable WHERE Value1<>0 AND Value1=7;
--EQP-- 0,0,0,SEARCH TABLE MyTable USING COVERING INDEX MyIndex1 (Value1=?)

Can I safely assume that this 'workaround' works with all future versions? If 
not, is there a better solution?

Thank you,
Detlef.



[sqlite] INTEGRITY_CHECK consumes entire system memory

2016-05-19 Thread Detlef Golze
Hi,

please check if this helps:
http://serverfault.com/questions/325277/windows-server-2008-r2-metafile-ram-usage

The provided script (setfc.ps1) did help for us and thus we implemented the 
suggested SetSystemFileCacheSize calls in our application.
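
A minimal sketch of such a call (the 1 GByte cap is an arbitrary example, not 
our actual value, and the process needs the SeIncreaseQuotaPrivilege privilege 
for SetSystemFileCacheSize to succeed):

    #include <windows.h>
    #include <stdio.h>

    /* Cap the Windows system file cache so scanning a huge database file
    ** (e.g. PRAGMA integrity_check) cannot evict everything else from RAM. */
    static void cap_system_file_cache(void){
      SIZE_T minSize = 1 * 1024 * 1024;             /* 1 MB minimum       */
      SIZE_T maxSize = (SIZE_T)1024 * 1024 * 1024;  /* 1 GB hard maximum  */
      if( !SetSystemFileCacheSize(minSize, maxSize, FILE_CACHE_MAX_HARD_ENABLE) ){
        fprintf(stderr, "SetSystemFileCacheSize failed: %lu\n", GetLastError());
      }
    }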

Regards,
dg.

-Original Message-
From: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Markus Ecker
Sent: Tuesday, May 17, 2016 3:56 PM
To: sqlite-users at mailinglists.sqlite.org
Subject: [sqlite] INTEGRITY_CHECK consumes entire system memory

Hello everyone!

I am struggling with a memory issue of SQLite and am therefore searching for help. 
The database of our product holds about 70GB of data. When I run "PRAGMA 
integrity_check", the memory consumption of my system continuously increases 
until all of my 16GB of RAM is used.
The interesting thing is that the memory is not consumed by the processes 
working on SQLite itself. Rather, it seems that the database file is mapped 
into memory, as you can see from the output of RamMap.

https://drive.google.com/open?id=0BwnV5z14WWRSYjZZOVlTWHREd00
https://drive.google.com/open?id=0BwnV5z14WWRSQVpneTZ3Q3I3NzA

What I find interesting is that the system does not crash when the entire memory 
is used (and SQLite does not try to consume more memory).
Nonetheless, the big issue is that no additional memory is available for new 
applications (errors occur when starting them) and the system itself also gets 
really slow!
To debug further into this issue I took the Northwind sample database and 
extended the Employees table by continuously duplicating the records. This 
should exclude any issues caused by my database setup (I am using custom 
collations etc.).

Unfortunately the issue is still there.

Reading the documentation I ended up using the following PRAGMA statements, but 
without success.
PRAGMA cache_size = 12800;
PRAGMA temp_store = FILE;
PRAGMA journal_mode = OFF;
PRAGMA mmap_size=0;
PRAGMA integrity_check;

Can anyone please help me further with this problem.
My OS is Windows 7 64Bit.

I uploaded a compressed version of the database here (178MB):
https://drive.google.com/open?id=0BwnV5z14WWRSYUhja0JWWFkxY3c

Thanks!!!
Best Regards
Markus



[sqlite] UPDATE/open/close blocked while executing SELECT

2016-04-18 Thread Detlef Golze
> Detlef Golze wrote:
> > I attach the database with the URI option cache=shared.
> 
> This makes all threads share a single connection.  Don't do it.
> 
> 
> Regards,
> Clemens

That was indeed the problem. During my initial tests this option provided the 
best performance. I probably need to re-think that.

Thanks,
Detlef.



[sqlite] UPDATE/open/close blocked while executing SELECT

2016-04-18 Thread Detlef Golze
Hello Olivier,

thank you for the hints. I think I am doing it correctly: I compiled with 
SQLITE_CONFIG_MULTITHREAD and attach the database with the URI option 
cache=shared. Also, each connection is used by only one thread at a time, but 
another thread may reuse a connection later - I hope that is allowed.

Are you also opening/closing database connections and prepare statements while 
another SELECT is running?

Thanks,
Detlef.

-Original Message-
From: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Olivier 
Mascia
Sent: Monday, April 18, 2016 2:47 PM
To: SQLite mailing list
Subject: Re: [sqlite] UPDATE/open/close blocked while executing SELECT


> On 18 Apr 2016, at 14:17, Detlef Golze wrote:
> 
> Hi,
> 
> I am using SQLite C amalgamation version 3.08.11.01 compiled for Multi 
> Threading Mode and using WAL Journal Mode.
> 
> Sometimes I need to execute a SELECT statement which takes a very long time 
> to complete. I realized that during that time other functions are blocked for 
> a significant amount of time; eventually they get through after a few minutes, 
> but then another statement/function hangs. I have seen UPDATE statements 
> blocking and also functions like sqlite3_prepare16_v2() and open/close calls for 
> the same database.
> 
> It is usually blocking at btreeLockCarefully().
> 
> Is there something I can do to avoid such long starvations or is that by 
> design?
> 
> Thank you very much,
> Detlef.

I'm not an old-timer using SQLite and participating here, but I'm using it in 
WAL journal mode only, especially for its feature of 'multiple readers do not 
block writer and see a stable view of the db based on the instant they started 
their transaction'.

Just to clarify: you mean compiled using SQLITE_CONFIG_MULTITHREAD (and not 
SQLITE_CONFIG_SERIALIZED), is that it?

Then, if using threads in the application, each thread has its own connection 
(or multiple connections), but no thread shares any connection with another one? 
(This is required for SQLITE_CONFIG_MULTITHREAD mode; otherwise you would need 
SQLITE_CONFIG_SERIALIZED, and that would add a lot of contention.)

Also, are your connections using private (SQLITE_OPEN_PRIVATECACHE) or shared 
cache (SQLITE_OPEN_SHAREDCACHE)?

When using SQLITE_CONFIG_MULTITHREAD, taking care that no application thread 
shares a connection with another, and using SQLITE_OPEN_PRIVATECACHE, along with 
WAL journal mode, I see no issue similar to what you report in our own programming.
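
A minimal sketch of that setup (file name, flag layout and error handling are 
placeholders):

    #include <sqlite3.h>

    /* Call once at program start, before any other SQLite API is used. */
    static void init_sqlite_threading(void){
      sqlite3_config(SQLITE_CONFIG_MULTITHREAD);  /* connections must not be shared between threads */
    }

    /* One private-cache, WAL-mode connection per worker thread. */
    static sqlite3 *open_for_worker(const char *zFile){
      sqlite3 *db = 0;
      if( sqlite3_open_v2(zFile, &db,
                          SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE |
                          SQLITE_OPEN_NOMUTEX | SQLITE_OPEN_PRIVATECACHE, 0)!=SQLITE_OK ){
        sqlite3_close(db);
        return 0;
      }
      sqlite3_exec(db, "PRAGMA journal_mode=WAL", 0, 0, 0);
      return db;
    }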

-- 
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om




[sqlite] UPDATE/open/close blocked while executing SELECT

2016-04-18 Thread Detlef Golze
Hi,

I am using SQLite C amalgamation version 3.08.11.01 compiled for Multi 
Threading Mode and using WAL Journal Mode.

Sometimes I need to execute a SELECT statement which takes a very long time to 
complete. I realized that during that time other functions are blocked for a 
significant amount of time; eventually they get through after a few minutes, but 
then another statement/function hangs. I have seen UPDATE statements blocking 
and also functions like sqlite3_prepare16_v2() and open/close calls for the same 
database.

It is usually blocking at btreeLockCarefully().

Is there something I can do to avoid such long starvations or is that by design?

Thank you very much,
Detlef.