Re: [sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Andreas Kupries
> On 11/10/19, Richard Hipp  wrote:
> > It seems there are also problems in generating valid JSON for large
> > unicode characters.  For example:
> >
> > SELECT json_quote(char(1114111));
> >
> > I'm working on the problem
 
> I take that back.  It seems that the spec allow arbitrary UTF-8
> characters in a JSON string, without the need to escape them using
> the backslash notation, except for ", \, and control characters, and
> so the current behavior is correct after all, as far as I can tell.
> Please let me know if you see any cases that I have mised that do
> not work.

It seems that it is time to remind everybody about

https://github.com/nst/JSONTestSuite

and the adjunct article

http://seriot.ch/parsing_json.php

Actually, the github repository is considered to be an appendix to the
article.

Might be interesting to submit the sqlite json parser to this
testsuite for comparison.

-- 
See you,
Andreas Kupries 

Developer @ SUSE Software Canada ULC


Tcl'2019, Nov 5-8, Houston, TX, USA. http://www.tcl.tk/community/tcl2019/
---




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Richard Hipp
I now believe that check-in
https://www.sqlite.org/src/timeline?c=51027f08c0478f1b is the complete
and correct fix for the reported problem with UTF16 surrogate pairs in
JSON.  I have checked in new test cases (to TH3, where, unfortunately,
you cannot see them).  If anybody sees anything that I have
overlooked, please post a follow-up comment.

Thanks again to Serhiy for bringing the problem to my attention by
posting on the mailing list.

Serhiy:  In case it helps:  You can go to the check-in info page
(https://www.sqlite.org/src/info/51027f08c0478f1b) and click on the
"tarball" link to get a source code tarball, then build the
amalgamation for testing using:

./configure && make sqlite3.c

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Richard Hipp
On 11/10/19, Richard Hipp  wrote:
> It seems there are also problems in generating valid JSON for large
> unicode characters.  For example:
>
> SELECT json_quote(char(1114111));
>
> I'm working on the problem

I take that back.  It seems that the spec allow arbitrary UTF-8
characters in a JSON string, without the need to escape them using the
backslash notation, except for ", \, and control characters, and so
the current behavior is correct after all, as far as I can tell.
Please let me know if you see any cases that I have mised that do not
work.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Richard Hipp
It seems there are also problems in generating valid JSON for large
unicode characters.  For example:

SELECT json_quote(char(1114111));

I'm working on the problem
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Richard Hipp
On 11/8/19, Serhiy Storchaka  wrote:
> JSON_EXTRACT [1] produces gibberish for encoded non-BMP characters.

Thanks for the bug report and test case.

I checked in candidate fix here:
https://www.sqlite.org/src/timeline?c=51027f08c0478f1b

I need to do additional testing, and due to conflicting obligations,
I'm not sure when I will have a chance to do that.  So the issue
should remain open for now.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Jens Alfke

> On Nov 10, 2019, at 4:03 AM, Richard Damon  wrote:
> 
> Actually, 'The Artist whose name formerly was Prince' (which wasn't his
> name, his legal name was an unpronounceable pictograph), breaks every
> computer system I know.

Unicode Character PRINCE (U+1F934)
https://www.fileformat.info/info/unicode/char/1f934/index.htm

Oh wait, wrong Prince...

There’s always this:
https://parkerhiggins.net/2013/01/writing-the-prince-symbol-in-unicode/

But all kidding aside, databases are created to serve people, not the other way 
around. To declare that a database schema is the truth is absurd.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Kevin O'Gorman
And "full legal name"   How about my dad, whose full name was Dr. John
Michael Patrick Dennis Emmet O'Gorman, PhD.  How many rules does that
break?  I've fought many companies over that apostrophe in my life.
Governments tend to throw it away, but it's on my old passport and birth
certificate.

---
Dictionary.com's word of the year: *misinformation*
Merriam-Webster word of the year: *justice*


On Sun, Nov 10, 2019 at 4:01 AM Richard Damon 
wrote:

> On 11/10/19 1:21 AM, Gary R. Schmidt wrote:
> > On 10/11/2019 13:44, Doug wrote:
> >> Au Contraire, Jens! In many local contexts you can normalize people's
> >> names. I was born in Kansas, USA. My parents filled out a birth
> >> certificate for me. It had a place on the form for first name, middle
> >> name, last name, and a suffix like II or III.
> >>
> >> That birth certificate form determined that everyone born in Kansas
> >> (at that time), had a first, middle, and last name. There was no
> >> discussion of the matter. That's the way it was. The form led the
> >> way; people never thought about whether it was effective or not. Each
> >> newly-born child was given a first, middle, and last name.
> >>
> >> Effective was irrelevant for that system. There was no option, no
> >> alternative. It simply was.
> >>
> >> All systems are like that at each moment in time. They are what they
> >> are at any moment in time, and they force the users to behave the way
> >> the system wants them to behave. If you want to change the system and
> >> momentum is on your side, then immediately you have a new system - at
> >> that moment in time. It is composed of the old system and the momentum.
> >>
> >> Back to names: just like the birth certificate, a system which
> >> assigns a name to you, actually coerces you to have that name,
> >> because within that system, you exist as that name. The "names"
> >> article is totally wrong when it says that each assumption is wrong.
> >> Each of those assumptions is correct, and I can find at least one
> >> system which makes each one correct. Within each system, the
> >> assumption works, and is valid.
> >>
> >> My two cents...
> > Is not worth the paper it is written on!
> >
> > So what happens when someone from a family who only uses first- and
> > last-names moves to Kansas?
> >
> > Do they have to make up a middle-name so that he idiots can fill out
> > the forms?
> >
> > Well, in the case of the US Navy back in the late 1980's, when a
> > friend of mine from here in Australia, who only has a first and
> > last-name married a USN pilot and moved to the USA, she was told that,
> > "Yes, you have a middle name."  No amount of arguing, or producing of
> > official documents, (well, it's the USA, most people there don't know
> > what a passport is), could prevail.  In the end she conceded defeat
> > and became  Doe , for the duration.
> >
> > Names are impossible, unless you use a free-form, infinite-length
> > field, you won't be safe, and even then, someone with turn up whose
> > name is 'n' recurring to an infinite number of characters or something!
> >
> > Cheers,
> > GaryB-)
> Actually, 'The Artist whose name formerly was Prince' (which wasn't his
> name, his legal name was an unpronounceable pictograph), breaks every
> computer system I know.
>
> --
> Richard Damon
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite with branching

2019-11-10 Thread Robert M. Münch
On 4 Nov 2019, at 22:25, Jens Alfke wrote:

>> On Nov 4, 2019, at 4:57 AM, Simon Slavin  wrote:
>>
>> That's one of the reasons that the source code for SQLite is public: so that 
>> people can add the features they want.
>
> Totally agree. However, when you go off the mainline of SQLite you lose some 
> things, like easy updating to new SQLite releases — you now have to deal with 
> merging the new official SQLite into the forked SQLite, or waiting for the 
> fork maintainer to do it.

For such fundamental and big changes, that’s exactly the problem.

For the mainline SQLite we know that the code-base is regularly maintained and 
updated and that the official extensions work. Hence, I can base a product on 
it.

For such a big new feature/method, which would be at the core of my product, I 
either have to maintain it myself (heavy) or hope that it’s not a dead-end… the 
risk profile is just not good.

Viele Grüsse.

-- 

Robert M. Münch


signature.asc
Description: OpenPGP digital signature
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Richard Damon
On 11/10/19 1:21 AM, Gary R. Schmidt wrote:
> On 10/11/2019 13:44, Doug wrote:
>> Au Contraire, Jens! In many local contexts you can normalize people's
>> names. I was born in Kansas, USA. My parents filled out a birth
>> certificate for me. It had a place on the form for first name, middle
>> name, last name, and a suffix like II or III.
>>
>> That birth certificate form determined that everyone born in Kansas
>> (at that time), had a first, middle, and last name. There was no
>> discussion of the matter. That's the way it was. The form led the
>> way; people never thought about whether it was effective or not. Each
>> newly-born child was given a first, middle, and last name.
>>
>> Effective was irrelevant for that system. There was no option, no
>> alternative. It simply was.
>>
>> All systems are like that at each moment in time. They are what they
>> are at any moment in time, and they force the users to behave the way
>> the system wants them to behave. If you want to change the system and
>> momentum is on your side, then immediately you have a new system - at
>> that moment in time. It is composed of the old system and the momentum.
>>
>> Back to names: just like the birth certificate, a system which
>> assigns a name to you, actually coerces you to have that name,
>> because within that system, you exist as that name. The "names"
>> article is totally wrong when it says that each assumption is wrong.
>> Each of those assumptions is correct, and I can find at least one
>> system which makes each one correct. Within each system, the
>> assumption works, and is valid.
>>
>> My two cents...
> Is not worth the paper it is written on!
>
> So what happens when someone from a family who only uses first- and
> last-names moves to Kansas?
>
> Do they have to make up a middle-name so that he idiots can fill out
> the forms?
>
> Well, in the case of the US Navy back in the late 1980's, when a
> friend of mine from here in Australia, who only has a first and
> last-name married a USN pilot and moved to the USA, she was told that,
> "Yes, you have a middle name."  No amount of arguing, or producing of
> official documents, (well, it's the USA, most people there don't know
> what a passport is), could prevail.  In the end she conceded defeat
> and became  Doe , for the duration.
>
> Names are impossible, unless you use a free-form, infinite-length
> field, you won't be safe, and even then, someone with turn up whose
> name is 'n' recurring to an infinite number of characters or something!
>
> Cheers,
>     Gary    B-) 
Actually, 'The Artist whose name formerly was Prince' (which wasn't his
name, his legal name was an unpronounceable pictograph), breaks every
computer system I know.

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Simon Slavin
On 10 Nov 2019, at 6:21am, Gary R. Schmidt  wrote:

> So what happens when someone from a family who only uses first- and 
> last-names moves to Kansas?

In my time with databases, I encountered several USAsians with a middle name of 
'Nmn'.  I know many USAsian people but nobody with this name.  It puzzled me 
until, months later, I was reading a database which preserved case, and 
encountered it as 'NMN'.  From this I figured out it signified "No Middle Name'.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] multi-threaded db access crash or deadlock using C API

2019-11-10 Thread Simon Slavin
On 8 Nov 2019, at 10:02pm, MA LIG  wrote:

> I've tried other permutations of various flags that go along for
> sqlite3_open_v2 , as well as setting pragma read_uncommitted, but can't
> seem to get multi-threaded concurrent rapid-fire querying to work at all.

Please set a timeout of at least 5000 milliseconds for every connection which 
opens the file:



Find out whether this prevents your deadlocks/crashes.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Gary R. Schmidt

On 10/11/2019 13:44, Doug wrote:

Au Contraire, Jens! In many local contexts you can normalize people's names. I 
was born in Kansas, USA. My parents filled out a birth certificate for me. It 
had a place on the form for first name, middle name, last name, and a suffix 
like II or III.

That birth certificate form determined that everyone born in Kansas (at that 
time), had a first, middle, and last name. There was no discussion of the 
matter. That's the way it was. The form led the way; people never thought about 
whether it was effective or not. Each newly-born child was given a first, 
middle, and last name.

Effective was irrelevant for that system. There was no option, no alternative. 
It simply was.

All systems are like that at each moment in time. They are what they are at any 
moment in time, and they force the users to behave the way the system wants 
them to behave. If you want to change the system and momentum is on your side, 
then immediately you have a new system - at that moment in time. It is composed 
of the old system and the momentum.

Back to names: just like the birth certificate, a system which assigns a name to you, 
actually coerces you to have that name, because within that system, you exist as that 
name. The "names" article is totally wrong when it says that each assumption is 
wrong. Each of those assumptions is correct, and I can find at least one system which 
makes each one correct. Within each system, the assumption works, and is valid.

My two cents...

Is not worth the paper it is written on!

So what happens when someone from a family who only uses first- and 
last-names moves to Kansas?


Do they have to make up a middle-name so that he idiots can fill out the 
forms?


Well, in the case of the US Navy back in the late 1980's, when a friend 
of mine from here in Australia, who only has a first and last-name 
married a USN pilot and moved to the USA, she was told that, "Yes, you 
have a middle name."  No amount of arguing, or producing of official 
documents, (well, it's the USA, most people there don't know what a 
passport is), could prevail.  In the end she conceded defeat and became 
 Doe , for the duration.


Names are impossible, unless you use a free-form, infinite-length field, 
you won't be safe, and even then, someone with turn up whose name is 'n' 
recurring to an infinite number of characters or something!


Cheers,
GaryB-)
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] JSON_EXTRACT does not work with non-BMP characters

2019-11-10 Thread Serhiy Storchaka

JSON_EXTRACT [1] produces gibberish for encoded non-BMP characters. Example:

    sqlite> SELECT JSON_EXTRACT('"t\u00e8\u015b\ud835\udd99"', '$');
    tèś��

Expected result is "tèś햙".

Since character "햙" (U+1D599) is not in the Basic Multilingual Plane, 
it is encoded as the UTF-16 surrogate pair "\ud835\udd99" in JSON. [2] 
Seems JSON_EXTRACT decodes surrogate characters independently and do not 
combine then into a non-BMP character like in CESU-8 [3] or Modified 
UTF-8 [4].


It works correctly if the JSON contains not-escaped characters:

    sqlite> select JSON_EXTRACT('"tèś햙"', '$');
    tèś햙


The bug was originally reported on the Python bug tracker. [5]

If SQLite officially uses CESU-8 or Modified UTF-8 instead of the 
standard UTF-8, this should be documented. Python would use special 
workarounds for this. But it would be better to use the standard UTF-8.



[1] https://www.sqlite.org/json1.html#jex

[2] https://tools.ietf.org/html/rfc8259#section-7

[3] https://en.wikipedia.org/wiki/CESU-8

[4] https://en.wikipedia.org/wiki/UTF-8#Modified_UTF-8

[5] https://bugs.python.org/issue38749

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] multi-threaded db access crash or deadlock using C API

2019-11-10 Thread MA LIG
Hi,

Am using the API and having trouble (deadlock or crash) with multithreaded
access to a read-only sqlite3 file. Any help would be much appreciated.

   - Running OSX with sqlite3 3.28.0 amalgamation compiled from code. Same
   issue is occurring when the code is compiled for Windows using mingw64
   - I have one sqlite3 db file holding one table of data of about 100k
   rows and half a dozen columns with short-text or numeric data
   - I have one process with two threads concurrently accessing the table:
  - each thread has its own db connection
  - each thread is running an identical "select x" query, many
  thousands of time in rapid succession, with an indexed lookup, and only
  retrieving 1 row (at most) of data
   - I'm encountering a race condition and/or crash and have tried multiple
   different ways to address
   - Here are two approaches that are both failing:
  - deadlock: both processes open db (once each before quering begins)
  using SQLITE_OPEN_READONLY | SQLITE_OPEN_NOMUTEX |
SQLITE_OPEN_SHAREDCACHE.
  The deadlock occurs in the btreeCursor function, which both threads
  get stuck in (which in my version 3.28.0 amalgamation is on
  lines 67576-67581)

  for(pX=pBt->pCursor; pX; pX=pX->pNext){

if( pX->pgnoRoot==(Pgno)iTable ){

  pX->curFlags |= BTCF_Multiple;

  pCur->curFlags |= BTCF_Multiple;

}

  }


   - crash: both processes open db (once each before quering begins)
  using SQLITE_OPEN_READONLY | SQLITE_OPEN_URI, with uri parameter
  immutable=1. The crash occurs in sqlite3DbMallocRawNN, with a bad
  memory access exception using lldb

I've tried other permutations of various flags that go along for
sqlite3_open_v2 , as well as setting pragma read_uncommitted, but can't
seem to get multi-threaded concurrent rapid-fire querying to work at all.
Am open to any changes to how the db is opened or sqlite3_step is called
etc.

Thoughts or suggestions? Seems like this should be a common use case so my
guess is the solution is easy and I've just got something wrong...

Thank you
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users