Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread Keith Medcalf

And this version is several times faster since only the changes are UNION'd, 
which minimizes the total number of records processed.
The index should be "create index i on t (a, idate);" because of the way 
indexes work: entries with the same (a, idate) will be ordered by n. (Though really 
idate should be unique within a, so the index should be a unique index.)

select a, idate, variable, oldv, newv
  from (select a,
               idate,
               'b' as variable,
               lag(b, 1) over (partition by a order by idate) as oldv,
               b as newv
          from t
         order by a, idate)
 where oldv <> newv
union all
select a, idate, variable, oldv, newv
  from (select a,
               idate,
               'c' as variable,
               lag(c, 1) over (partition by a order by idate) as oldv,
               c as newv
          from t
         order by a, idate)
 where oldv <> newv
union all
select a, idate, variable, oldv, newv
  from (select a,
               idate,
               'd' as variable,
               lag(d, 1) over (partition by a order by idate) as oldv,
               d as newv
          from t
         order by a, idate)
 where oldv <> newv
union all
select a, idate, variable, oldv, newv
  from (select a,
               idate,
               'e' as variable,
               lag(e, 1) over (partition by a order by idate) as oldv,
               e as newv
          from t
         order by a, idate)
 where oldv <> newv
order by a, idate, variable;
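As a quick sanity check, the lag()-based pattern above can be exercised with 
Python's bundled sqlite3 module. This is a minimal sketch with made-up data for 
a single column; window functions require SQLite 3.25 or newer:

```python
import sqlite3

# Requires SQLite 3.25+ for window functions (check sqlite3.sqlite_version).
con = sqlite3.connect(":memory:")
con.executescript("""
create table t (n INTEGER PRIMARY KEY, a, b, idate);
insert into t (a, b, idate) values
  ('p001', 1,  '2019-02-11'),
  ('p001', 10, '2019-02-12'),
  ('p001', 10, '2019-02-13');
""")
rows = con.execute("""
  select a, idate, 'b' as variable, oldv, newv
    from (select a, idate,
                 lag(b, 1) over (partition by a order by idate) as oldv,
                 b as newv
            from t)
   where oldv <> newv
""").fetchall()
# Only the 2019-02-12 change of b from 1 to 10 survives the filter.
print(rows)
```

Note that the first row of each partition is dropped because lag() yields NULL 
there and "NULL <> newv" is not true.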

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Keith Medcalf
>Sent: Monday, 13 January, 2020 17:04
>To: SQLite mailing list 
>Subject: Re: [sqlite] Capturing the changes in columns in a table
>
>
>Note this only requires that "idate" be a unique orderable sequence
>within "a" in order to work.  It does not have to be particular (such as
>a date/datetime).
>It can be a date, a datetime, an integer (as in unixtime), a real (as in
>julianday number), or any old sequence number and it will still work
>exactly the same.
>
>Replace UNION with UNION ALL and add an ORDER BY clause at the end which
>may help performance depending on the amount of data.
>
>create unique index i on t (a, idate) to optimize retrieval.
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-Original Message-
>>From: sqlite-users  On
>>Behalf Of Keith Medcalf
>>Sent: Monday, 13 January, 2020 16:51
>>To: SQLite mailing list 
>>Subject: Re: [sqlite] Capturing the changes in columns in a table
>>
>>
>>create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
>>insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, '2019-02-11');
>>insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, '2019-02-11');
>>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, '2019-02-11');
>>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-11');
>>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-11');
>>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-12');
>>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-12');
>>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-12');
>>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-12');
>>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-12');
>>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-13');
>>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-13');
>>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-13');
>>insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, '2019-02-13');
>>insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, '2019-02-13');
>>
>>  select *
>>from (
>>  select a,
>> idate,
>> 'b' as variable,
>> lag(b, 1) over (partition by a order by idate) as oldv,
>> b as newv
>>from t
>>   union
>>  select a,
>> idate,
>> 'c' as variable,
>> lag(c, 1) over (partition by a order by idate) as oldv,
>> c as newv
>>from t
>>   union
>>  select a,
>> idate,
>> 'd' as variable,
>> lag(d, 1) over (partition by a order by idate) as oldv,
>> d as newv
>>from t
>>   union
>>  select a,
>> idate,
>> 

Re: [sqlite] Best way to store key,value pairs

2020-01-13 Thread Jens Alfke


> On Jan 13, 2020, at 1:45 PM, James K. Lowden  wrote:
> 
> So, basically, a normalized design requires too much use of INSERT?  
> You're making an efficiency argument here, or maybe
> ease-of-implementation assertion. For me, inserting one header row or
> 20 is the same coding effort (still need a loop).

I was suggesting _zero_ header rows, i.e. store the headers as a blob column in 
the messages table, instead of having a separate table.

> I think transaction throughput would be about the same if COMMIT is applied 
> only to whole messages.  

There are other pain points. From what I've seen in the past, IIRC one of them 
is managing the foreign-key constraints (e.g. SQLite finding and deleting 
header rows when a message row is deleted), and the necessity of using JOINs to 
get at the headers.

Anyway: I merely said to "consider encoding the headers as JSON". Certainly 
didn't mean to imply it was the canonical answer.
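For the record, the headers-as-JSON idea can be sketched with SQLite's JSON1 
functions. The schema and header names below are hypothetical, and this assumes 
the JSON1 functions are available in your SQLite build:

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical schema: headers stored as a JSON text column on the message
# row itself, instead of a normalized child table with foreign keys.
con.execute("create table messages (id INTEGER PRIMARY KEY, headers TEXT)")
hdrs = {"From": "alice@example.com", "Subject": "hello"}
con.execute("insert into messages (headers) values (?)", (json.dumps(hdrs),))

# json_extract pulls a single header back out without any JOIN.
subj = con.execute(
    "select json_extract(headers, '$.Subject') from messages"
).fetchone()[0]
print(subj)
```

The trade-off is that individual headers are no longer indexable or 
constrainable the way separate rows would be (short of expression indexes).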

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug fixes only branch.

2020-01-13 Thread Richard Hipp
On 1/13/20, Syed Ahmad  wrote:
> We are at 3.14.2
>
> Current version = 3.14.2 Date : 2016-09-12
>
> https://www.sqlite.org/changes.html
>
> How can I take the latest stable branch, which includes only bug fixes and no
> new features?
>
> Is there any way?

We sometimes do things like that for paid support customers.  But
maintaining bug-fix branches of historical versions is time-consuming,
so we do not do it routinely.  It is also risky, as actual releases
are better tested and more reliable than backported patches.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread R Smith
Jose, I like Keith's version better using the window functions, assuming 
your version of SQLite is 3.25 or newer (window functions were introduced 
in 3.25.0).


Most importantly, the CTE query /requires/ changes to be day-on-day to be 
seen, which is the case in your example but might not be the real-life 
case, whereas Keith's window-function query will spot changes even 
if updates are logged days apart, or indeed happened on the same day.


You still need to make sure of a few things, but I would definitely use 
this query.



On 2020/01/14 1:51 AM, Keith Medcalf wrote:

create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, 
'2019-02-13');

   select *
 from (
   select a,
  idate,
  'b' as variable,
  lag(b, 1) over (partition by a order by idate) as oldv,
  b as newv
 from t
union
   select a,
  idate,
  'c' as variable,
  lag(c, 1) over (partition by a order by idate) as oldv,
  c as newv
 from t
union
   select a,
  idate,
  'd' as variable,
  lag(d, 1) over (partition by a order by idate) as oldv,
  d as newv
 from t
union
   select a,
  idate,
  'e' as variable,
  lag(e, 1) over (partition by a order by idate) as oldv,
  e as newv
 from t
  )
where oldv <> newv;

a     idate       variable  oldv  newv
----  ----------  --------  ----  ----
p001  2019-02-12  b         1     10
p002  2019-02-12  c         2     4
p003  2019-02-12  d         n     y
p004  2019-02-13  b         4     5
p004  2019-02-13  c         2     3
p004  2019-02-13  d         y     n
p005  2019-02-13  c         2     3
p005  2019-02-13  e         4     8

Change "<>" to "IS NOT" to find rows where there was no previous value.




Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread Keith Medcalf

Note this only requires that "idate" be a unique orderable sequence within "a" 
in order to work.  It does not have to be particular (such as a date/datetime).
It can be a date, a datetime, an integer (as in unixtime), a real (as in 
julianday number), or any old sequence number and it will still work exactly 
the same.

Replace UNION with UNION ALL and add an ORDER BY clause at the end which may 
help performance depending on the amount of data.

create unique index i on t (a, idate) to optimize retrieval.
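The UNION vs. UNION ALL difference is easy to demonstrate: UNION deduplicates 
(and pays for sorting to do so), while UNION ALL merely concatenates, which is 
why the explicit ORDER BY goes at the end. A minimal sketch using Python's 
bundled sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table x (v);
insert into x values (1), (2), (2);
""")
# UNION removes duplicates across both arms, so only distinct values remain...
u = con.execute("select v from x union select v from x").fetchall()
# ...while UNION ALL simply concatenates all rows, so an explicit ORDER BY
# at the end restores the ordering that UNION's dedup step gave for free.
ua = con.execute(
    "select v from x union all select v from x order by v"
).fetchall()
print(len(u), len(ua))
```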

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Keith Medcalf
>Sent: Monday, 13 January, 2020 16:51
>To: SQLite mailing list 
>Subject: Re: [sqlite] Capturing the changes in columns in a table
>
>
>create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
>insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, '2019-02-11');
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, '2019-02-11');
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, '2019-02-11');
>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-11');
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-11');
>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-12');
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-12');
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-12');
>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-12');
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-12');
>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-13');
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-13');
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-13');
>insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, '2019-02-13');
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, '2019-02-13');
>
>  select *
>from (
>  select a,
> idate,
> 'b' as variable,
> lag(b, 1) over (partition by a order by idate) as oldv,
> b as newv
>from t
>   union
>  select a,
> idate,
> 'c' as variable,
> lag(c, 1) over (partition by a order by idate) as oldv,
> c as newv
>from t
>   union
>  select a,
> idate,
> 'd' as variable,
> lag(d, 1) over (partition by a order by idate) as oldv,
> d as newv
>from t
>   union
>  select a,
> idate,
> 'e' as variable,
> lag(e, 1) over (partition by a order by idate) as oldv,
> e as newv
>from t
> )
>   where oldv <> newv;
>
>a     idate       variable  oldv  newv
>----  ----------  --------  ----  ----
>p001  2019-02-12  b         1     10
>p002  2019-02-12  c         2     4
>p003  2019-02-12  d         n     y
>p004  2019-02-13  b         4     5
>p004  2019-02-13  c         2     3
>p004  2019-02-13  d         y     n
>p005  2019-02-13  c         2     3
>p005  2019-02-13  e         4     8
>
>Change "<>" to "IS NOT" to find rows where there was no previous value.
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-Original Message-
>>From: sqlite-users  On
>>Behalf Of Jose Isaias Cabrera
>>Sent: Monday, 13 January, 2020 12:42
>>To: SQLite mailing list 
>>Subject: [sqlite] Capturing the changes in columns in a table
>>
>>
>>Greetings!
>>
>>Please observe the following,
>>
>>
>>create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
>>
>>insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, '2019-02-11');
>>
>>insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, '2019-02-11');
>>
>>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, '2019-02-11');
>>
>>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-11');
>>
>>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-11');
>>
>>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-12');
>>
>>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-12');
>>
>>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-12');
>>
>>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-12');
>>
>>insert into 

Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread Keith Medcalf

create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-11');
insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-12');
insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, 
'2019-02-13');
insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, 
'2019-02-13');

select *
  from (select a,
               idate,
               'b' as variable,
               lag(b, 1) over (partition by a order by idate) as oldv,
               b as newv
          from t
         union
        select a,
               idate,
               'c' as variable,
               lag(c, 1) over (partition by a order by idate) as oldv,
               c as newv
          from t
         union
        select a,
               idate,
               'd' as variable,
               lag(d, 1) over (partition by a order by idate) as oldv,
               d as newv
          from t
         union
        select a,
               idate,
               'e' as variable,
               lag(e, 1) over (partition by a order by idate) as oldv,
               e as newv
          from t)
 where oldv <> newv;

a     idate       variable  oldv  newv
----  ----------  --------  ----  ----
p001  2019-02-12  b         1     10
p002  2019-02-12  c         2     4
p003  2019-02-12  d         n     y
p004  2019-02-13  b         4     5
p004  2019-02-13  c         2     3
p004  2019-02-13  d         y     n
p005  2019-02-13  c         2     3
p005  2019-02-13  e         4     8

Change "<>" to "IS NOT" to find rows where there was no previous value.
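The reason is NULL semantics: "NULL <> x" evaluates to NULL (not true), so the 
first row of each group, where lag() has no previous value, is filtered out, 
whereas "NULL IS NOT x" is true. A minimal sketch of just the comparison:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# With <>, comparing NULL against anything yields NULL (not true), so the
# first snapshot of each group -- where lag() returns NULL -- is filtered out.
r1 = con.execute("select NULL <> 5").fetchone()[0]
# With IS NOT, NULL compares as distinct from 5, i.e. true (1), so rows that
# had no previous value show up in the change report as well.
r2 = con.execute("select NULL IS NOT 5").fetchone()[0]
print(r1, r2)
```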

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Jose Isaias Cabrera
>Sent: Monday, 13 January, 2020 12:42
>To: SQLite mailing list 
>Subject: [sqlite] Capturing the changes in columns in a table
>
>
>Greetings!
>
>Please observe the following,
>
>
>create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
>
>insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, '2019-02-11');
>
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, '2019-02-11');
>
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, '2019-02-11');
>
>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-11');
>
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-11');
>
>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-12');
>
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-12');
>
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-12');
>
>insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, '2019-02-12');
>
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, '2019-02-12');
>
>insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, '2019-02-13');
>
>insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, '2019-02-13');
>
>insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, '2019-02-13');
>
>insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, '2019-02-13');
>
>insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, '2019-02-13');
>
>
>The SQL below provide info only for two dates (2019-02-11 and 2019-02-
>12):
>
>
>select new.a,old.b,new.b, 'column b changed on ' || new.idate as info
>from t as new
>
>  LEFT JOIN t as old ON
>
>new.idate = '2019-02-12'
>
>AND old.idate = '2019-02-11'
>
>AND new.a = old.a
>
>WHERE
>
>  new.b != old.b
>
>UNION ALL
>
>select new.a,old.c,new.c, 'column c changed on ' || new.idate as info
>from t as new
>
>  LEFT JOIN t as old ON
>
>new.idate = 

Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread R Smith


On 2020/01/14 1:11 AM, Jose Isaias Cabrera wrote:

R Smith, on Monday, January 13, 2020 05:25 PM, wrote...

Wow!  Thanks for this.  I had not thought about your questions.  My boss said, 
"I need to know all the changes per project whenever they happened." So,... I will 
have to revise my thinking, but I have enough with your help to continue. I am 
going to have to reload SQLitespeed, and try it again. :-)  Thanks.


A - My pleasure,

B - It's important to really understand how they want to see changes. 
Also I'm simply assuming (thanks to your example) that changes do not 
happen more frequently than once a day, and that the time of it is not 
important. If it is, the query will need to be adjusted.


C - I know you probably know this, but just in case it isn't 100% clear: 
there is nothing about the SQL I posted that requires SQLitespeed. It is 
simply the easiest for me to use and it outputs SQL+Results the way I 
like it (so feel free), but that query will work on any SQLite platform 
with any version of SQLite from 3.8.3 onward (the version that 
introduced CTEs).



Good luck with your quest!
Ryan




Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread Jose Isaias Cabrera

R Smith, on Monday, January 13, 2020 05:25 PM, wrote...​
> On 2020/01/13 9:42 PM, Jose Isaias Cabrera wrote:​
> > Greetings!​
> >​
> > Please observe the following,​
> >​
> >​
> > create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);​
> >​
> > insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, 
> > '2019-02-11');​
> >​
> > ...​
> >​
> > p001|1|10|column b changed on 2019-02-12​
> > p002|2|4|column c changed on 2019-02-12​
> > p003|n|y|column d changed on 2019-02-12​
> > sqlite>​
> >​
> > What I would like is to cover all of the dates in the same command, so that 
> > the output is this,​
> >​
> > p001|1|10|column b changed on 2019-02-12​
> > p002|2|4|column c changed on 2019-02-12​
> > p003|n|y|column d changed on 2019-02-12​
> > p004|4|5|column b changed on 2019-02-13​
> > p004|2|3|column c changed on 2019-02-13​
> > p004|y|n|column d changed on 2019-02-13​
> > p005|2|3|column c changed on 2019-02-13​
> > p005|4|8|column e changed on 2019-02-13​
> >​
> > Yes, I know I can get all the dates and build the SQL as I did above, but 
> > is there a way to do this in one call?  Thanks for the help.​
> ​
> ​
> Of course there is :)​
> ​
> Many questions are not clear though.​
> I assumed you are only interested in day-on-day changes, because in many ​
> days there can be many different values, if you measure them all against ​
> the last day, it will look like many changes between day X and the last ​
> day, when really you are probably interested only in the change from one ​
> day to the next day.​
> ​
> I also assumed that the days you will give as the input, let's call that ​
> min date and max date of the report, will refer to the days on which ​
> changes happened - which seems obvious, but the query must change if it
> is not the case.​
> ​
> Anyway, here is the SQL to achieve it. I made 2 queries, one in which it ​
> just gets the min and max from your data, and the other in which you can ​
> specify the min and max date. I guess the second one will be more what ​
> you want - the nice thing is your calling program only needs to adjust
> those two values once, nowhere else:​
> ​
> (PS: I left the whole generated output in so you can see at the bottom, ​
> those errors are not errors, but just informing us that SQLite is ​
> creating indexes for those CTE tables, which is great and means this ​
> should be relatively fast  on large datasets even.)​
> ​
> ​
>-- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on SQLitespeed ​
> version 2.1.3.11.​
> ​
>-- Script Items: 4  Parameter Count: 0​
>-- 2020-01-14 00:08:17.875  |  [Info]   Script Initialized, ​
> Started executing...​
>-- ​
> ​
> ​
> create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);​
> ​
> insert into t (a, b, c, d, e, idate) values​
>   ('p001', 1, 2, 'n', 4, '2019-02-11')​
> ,('p002', 2, 2, 'n', 4, '2019-02-11')​
> ,('p003', 3, 2, 'n', 4, '2019-02-11')​
> ,('p004', 4, 2, 'y', 4, '2019-02-11')​
> ,('p005', 5, 2, 'y', 4, '2019-02-11')​
> ,('p001',10, 2, 'n', 4, '2019-02-12')​
> ,('p002', 2, 4, 'n', 4, '2019-02-12')​
> ,('p003', 3, 2, 'y', 4, '2019-02-12')​
> ,('p004', 4, 2, 'y', 4, '2019-02-12')​
> ,('p005', 5, 2, 'y', 4, '2019-02-12')​
> ,('p001',10, 2, 'n', 4, '2019-02-13')​
> ,('p002', 2, 4, 'n', 4, '2019-02-13')​
> ,('p003', 3, 2, 'y', 4, '2019-02-13')​
> ,('p004', 5, 3, 'n', 4, '2019-02-13')​
> ,('p005', 5, 3, 'y', 8, '2019-02-13')​
> ;​
> ​
> -- This version guesses the min and max dates...​
> WITH DSpan(minDate, maxDate) AS (​
>SELECT MIN(idate), MAX(idate) FROM t
> ), DDays(dayOldDate, dayNewDate) AS (​
>SELECT minDate, date(minDate,'+1 day') FROM DSpan​
>UNION ALL​
>SELECT dayNewDate, date(dayNewDate,'+1 day') FROM DDays, DSpan WHERE
> DDays.dayNewDate < DSpan.maxDate
> ), Chg(a, idate, col, oldVal, newVal) AS (​
>SELECT DISTINCT tNew.a, tNew.idate, 'b', tOld.b,tNew.b​
>  FROM DDays​
>  JOIN t AS tOld ON tOld.idate = DDays.dayoldDate​
>  JOIN t AS tNew ON tNew.idate = DDays.dayNewDate​
> WHERE tNew.a = tOld.a AND tNew.b != tOld.b​
> UNION ALL​
>SELECT DISTINCT tNew.a, tNew.idate, 'c', tOld.c,tNew.c​
>  FROM DDays​
>  JOIN t AS tOld ON tOld.idate = DDays.dayoldDate​
>  JOIN t AS tNew ON tNew.idate = DDays.dayNewDate​
> WHERE tNew.a = tOld.a AND tNew.c != tOld.c​
> UNION ALL​
>SELECT DISTINCT tNew.a, tNew.idate, 'd', tOld.d,tNew.d​
>  FROM DDays​
>  JOIN t AS tOld ON tOld.idate = DDays.dayoldDate​
>  JOIN t AS tNew ON tNew.idate = DDays.dayNewDate​
> WHERE tNew.a = tOld.a AND tNew.d != tOld.d​
> UNION ALL​
>SELECT DISTINCT tNew.a, tNew.idate, 'e', tOld.e,tNew.e​
>  FROM DDays​
>  JOIN t AS tOld ON tOld.idate = DDays.dayoldDate​
>  JOIN t AS tNew ON tNew.idate = DDays.dayNewDate​
> WHERE tNew.a = tOld.a AND tNew.e != tOld.e​
> )​
> SELECT Chg.a, Chg.oldVal, Chg.newVal,​
>

Re: [sqlite] Bug fixes only branch.

2020-01-13 Thread Keith Medcalf

On Monday, 13 January, 2020 15:00, Donald Griggs  wrote:

>On Mon, Jan 13, 2020 at 11:34 AM Syed Ahmad  
>wrote:

>> We are at 3.14.2   Date : 2016-09-12

>> How can I take the latest stable branch, which includes only bug fixes and
>> no new features?

>> Is there any way?

> I may well not be understanding properly, but what motivates you to ask
> for this?

I would suspect that the motivation is a periodic risk re-assessment policy 
that has been either badly written or is being badly interpreted: the belief 
that adding unused "new features" to a component subject to risk assessment 
requires an assessment of the risk associated with those unused "new 
features".  In other words, the risk assessment is based on the version of 
something rather than on the utilized functionality of that something.

This is quite common and in my previous job (before retirement) significant 
resources were spent on unnecessarily re-assessing things just because the 
version number changed (which often meant that things were not updated in order 
to prevent this expensive process), rather than simply reviewing the existing 
Risk Assessment and determining that nothing had changed, and that the addition 
of new unused "features" was immaterial to the overall assessment.

That is, someone had generated a Risk Assessment based (for example) on the use 
of SQLite version 3.14.2, and the mere act of updating the version triggers 
the process for re-evaluating the Risk of the new version in toto, 
including the Risk associated with "features available" rather than "features 
used", when in fact the update of the version (and the addition of new, unused, 
and inaccessible features) was quite irrelevant.

A significant amount of effort was expended globally (probably on the order of 
hundreds of thousands of man-hours at not insignificant engineering cost per 
hour) to remove "version numbers" from Risk Assessments and to make sure that 
they were based on functionality used/exposed rather than the version number 
itself.

In this example, the difference is that someone believes that (for example) 
because the current version of SQLite supports CTE's and the old one didn't, 
requires an assessment of the risk associated with CTEs, even though the 
specific use being assessed does not and cannot use CTE's, thus triggering a 
full assessment of Risk (including the unused CTE feature) rather than merely a 
review to determine whether or not there been any significant change to the 
risk profile which would require re-assessment.

In other words, if the "old" version of something only supported "red" and 
"blue", and the system only used "red", and a subsequent version added "green" 
without affecting the functionality of "red" (and that "blue" and "green" are 
not used and cannot be accessed) then the mere fact of the addition of the 
feature "green" is irrelevant (until such time as the feature "green" is used, 
of course).  The fact that the new thing "green" is available is merely a 
quaint observation of zero significance if (a) it is not used and (b) cannot be 
meaningfully accessed, and its addition is not a significant change to the risk 
of that something.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.





Re: [sqlite] Capturing the changes in columns in a table

2020-01-13 Thread R Smith


On 2020/01/13 9:42 PM, Jose Isaias Cabrera wrote:

Greetings!

Please observe the following,


create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);

insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, 
'2019-02-11');

...

p001|1|10|column b changed on 2019-02-12
p002|2|4|column c changed on 2019-02-12
p003|n|y|column d changed on 2019-02-12
sqlite>

What I would like is to cover all of the dates in the same command, so that the 
output is this,

p001|1|10|column b changed on 2019-02-12
p002|2|4|column c changed on 2019-02-12
p003|n|y|column d changed on 2019-02-12
p004|4|5|column b changed on 2019-02-13
p004|2|3|column c changed on 2019-02-13
p004|y|n|column d changed on 2019-02-13
p005|2|3|column c changed on 2019-02-13
p005|4|8|column e changed on 2019-02-13

Yes, I know I can get all the dates and build the SQL as I did above, but is 
there a way to do this in one call?  Thanks for the help.



Of course there is :)

Many questions are not clear though.
I assumed you are only interested in day-on-day changes, because in many 
days there can be many different values, if you measure them all against 
the last day, it will look like many changes between day X and the last 
day, when really you are probably interested only in the change from one 
day to the next day.


I also assumed that the days you will give as the input, let's call them the 
min date and max date of the report, will refer to the days on which 
changes happened - which seems obvious, but the query must change if it 
is not the case.


Anyway, here is the SQL to achieve it. I made 2 queries, one in which it 
just gets the min and max from your data, and the other in which you can 
specify the min and max date. I guess the second one will be more what 
you want - the nice thing is your calling program only needs to adjust 
those two values once, nowhere else:


(PS: I left the whole generated output in so you can see, at the bottom, that 
those errors are not errors, but just messages informing us that SQLite is 
creating indexes for those CTE tables, which is great and means this 
should be relatively fast even on large datasets.)



  -- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on SQLitespeed 
version 2.1.3.11.


  -- Script Items: 4  Parameter Count: 0
  -- 2020-01-14 00:08:17.875  |  [Info]   Script Initialized, 
Started executing...
  -- 



create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);

insert into t (a, b, c, d, e, idate) values
 ('p001', 1, 2, 'n', 4, '2019-02-11')
,('p002', 2, 2, 'n', 4, '2019-02-11')
,('p003', 3, 2, 'n', 4, '2019-02-11')
,('p004', 4, 2, 'y', 4, '2019-02-11')
,('p005', 5, 2, 'y', 4, '2019-02-11')
,('p001',10, 2, 'n', 4, '2019-02-12')
,('p002', 2, 4, 'n', 4, '2019-02-12')
,('p003', 3, 2, 'y', 4, '2019-02-12')
,('p004', 4, 2, 'y', 4, '2019-02-12')
,('p005', 5, 2, 'y', 4, '2019-02-12')
,('p001',10, 2, 'n', 4, '2019-02-13')
,('p002', 2, 4, 'n', 4, '2019-02-13')
,('p003', 3, 2, 'y', 4, '2019-02-13')
,('p004', 5, 3, 'n', 4, '2019-02-13')
,('p005', 5, 3, 'y', 8, '2019-02-13')
;

-- This version guesses the min and max dates...
WITH DSpan(minDate, maxDate) AS (
  SELECT MIN(idate), MAX(idate) FROM t
), DDays(dayOldDate, dayNewDate) AS (
  SELECT minDate, date(minDate,'+1 day') FROM DSpan
  UNION ALL
  SELECT dayNewDate, date(dayNewDate,'+1 day') FROM DDays, DSpan WHERE 
DDays.dayNewDate < DSpan.maxDate

), Chg(a, idate, col, oldVal, newVal) AS (
  SELECT DISTINCT tNew.a, tNew.idate, 'b', tOld.b,tNew.b
    FROM DDays
    JOIN t AS tOld ON tOld.idate = DDays.dayoldDate
    JOIN t AS tNew ON tNew.idate = DDays.dayNewDate
   WHERE tNew.a = tOld.a AND tNew.b != tOld.b
UNION ALL
  SELECT DISTINCT tNew.a, tNew.idate, 'c', tOld.c,tNew.c
    FROM DDays
    JOIN t AS tOld ON tOld.idate = DDays.dayoldDate
    JOIN t AS tNew ON tNew.idate = DDays.dayNewDate
   WHERE tNew.a = tOld.a AND tNew.c != tOld.c
UNION ALL
  SELECT DISTINCT tNew.a, tNew.idate, 'd', tOld.d,tNew.d
    FROM DDays
    JOIN t AS tOld ON tOld.idate = DDays.dayOldDate
    JOIN t AS tNew ON tNew.idate = DDays.dayNewDate
   WHERE tNew.a = tOld.a AND tNew.d != tOld.d
UNION ALL
  SELECT DISTINCT tNew.a, tNew.idate, 'e', tOld.e,tNew.e
    FROM DDays
    JOIN t AS tOld ON tOld.idate = DDays.dayOldDate
    JOIN t AS tNew ON tNew.idate = DDays.dayNewDate
   WHERE tNew.a = tOld.a AND tNew.e != tOld.e
)
SELECT Chg.a, Chg.oldVal, Chg.newVal,
   ('Column '||Chg.col||' changed on '||Chg.idate||' from 
'||Chg.oldVal||' to '||Chg.newVal||'.') AS Change

  FROM Chg
 ORDER BY Chg.a, Chg.idate
;


  -- a    | oldVal | newVal | Change
  -- ---- | ------ | ------ | --------------------------------------------
  -- p001 |    1   |   10   | Column b changed on 2019-02-12 from 1 to 10.
  -- p002 |    2   |    4   | Column c changed on 2019-02-12 from 2 to 4.
  -- p003 |    n   |    y   | Column d changed on 2019-02-12 from n to y.
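For anyone wanting to try the same change-detection idea outside the shell: a minimal Python sketch using the lag() window function (requires SQLite 3.25+; table and data abbreviated from the script above, names illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);
insert into t (a, b, c, d, e, idate) values
 ('p001', 1, 2, 'n', 4, '2019-02-11')
,('p001',10, 2, 'n', 4, '2019-02-12')
,('p002', 2, 2, 'n', 4, '2019-02-11')
,('p002', 2, 4, 'n', 4, '2019-02-12');
""")
# lag() pairs each row with the previous row for the same 'a';
# the outer WHERE keeps only rows where column b actually changed
rows = con.execute("""
  select a, idate, oldv, newv
    from (select a, idate,
                 lag(b) over (partition by a order by idate) as oldv,
                 b as newv
            from t)
   where oldv <> newv
   order by a, idate
""").fetchall()
print(rows)  # [('p001', '2019-02-12', 1, 10)]
```

The same subquery repeated per column (b, c, d, e) and UNION ALL'd reproduces the full report.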
  

Re: [sqlite] Bug fixes only branch.

2020-01-13 Thread Donald Shepherd
On Tue, 14 Jan 2020 at 7:00 am, Donald Griggs  wrote:

> Hi, Syed,
>
> ===
> On Mon, Jan 13, 2020 at 11:34 AM Syed Ahmad 
> wrote:
>
> > We are at 3.14.2   Date : 2016-09-12
> >
> > how can i take latest stable branch which include only bug fixes . no new
> > features.
> >
> > Is there any way?
> > ==
>
>
> I may well not be understanding properly, but what motivates you to ask for
> this?
> Since the sqlite team spends so much effort to ensure backward
> compatibility, how bad would things be if you simply updated to the current
> stable release?
>
> The team does allow many features to be eliminated through conditional
> compilation if you are severely constrained in RAM.   Was RAM size the
> motivation?
>
> To provide versions which include only bug fixes from any arbitrary
> release, I should think the developers would, for every stable release,
> have to maintain a new fixes-only branch indefinitely -- and thus have to
> maintain dozens of branches.   Am I missing something?
>
> Kind regards,
>Donald
> ___


I can't speak to his exact scenario but having spent time in a very risk
averse work environment, I've experienced this kind of thinking.

The logic is almost always a result of "we must have low bug counts
(true), so we need bug fixes (true), but new features introduce bugs (in
general, true), therefore we don't want any new features".

In other words it's a result of the environment rather than a reflection of
SQLite.

Regards,
Donald Shepherd.

>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug fixes only branch.

2020-01-13 Thread Donald Griggs
Hi, Syed,

===
On Mon, Jan 13, 2020 at 11:34 AM Syed Ahmad 
wrote:

> We are at 3.14.2   Date : 2016-09-12
>
> how can i take latest stable branch which include only bug fixes . no new
> features.
>
> Is there any way?
> ==


I may well not be understanding properly, but what motivates you to ask for
this?
Since the sqlite team spends so much effort to ensure backward
compatibility, how bad would things be if you simply updated to the current
stable release?

The team does allow many features to be eliminated through conditional
compilation if you are severely constrained in RAM.   Was RAM size the
motivation?

To provide versions which include only bug fixes from any arbitrary
release, I should think the developers would, for every stable release,
have to maintain a new fixes-only branch indefinitely -- and thus have to
maintain dozens of branches.   Am I missing something?

Kind regards,
   Donald
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread James K. Lowden
On Sun, 12 Jan 2020 16:24:53 -0700
"Keith Medcalf"  wrote:

> while one may be tempted to claim that "consistency is the hobgoblin
> of little minds" 

You might have forgotten that the much overpraised Ralph Waldo specified
"a foolish consistency".  He only meant: don't try to hang your hat on
too small a peg.  

> Doing this does not really do much since you still have to check the
> type on retrieval of the value anyway in order to know what to do
> with it.

That depends what "much" is.  

The value of constraints used to enforce types is to reject from the
database values outside the domain.  That not only simplifies
application logic, but also preserves the logical consistency of the
queries themselves.  

If "year" is always an integer -- never NULL, never a string -- then 

avg(year) 
and
count(year) 

are always correct.  But if the database contains for "year" a string
like "it was a good one", or NULLs, they're both unreliable.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread James K. Lowden
On Sun, 12 Jan 2020 15:48:05 -0700
"Keith Medcalf"  wrote:

> SQL is simply a Structured Query Language that can be overlaid on a
> variety of datastore models, one of which is the Relational model.  

I think that's misleading, Keith, for a couple of reasons. 

SQL was invented by IBM expressly and specifically for its "relational"
product.  Its wordiness is the product of its roots in "4th generation"
languages in vogue at the time, the promise of which was to permit
users to "write their own reports".  Hence the strict select-from-where
syntax, meant to be so simple that untrained users could figure it
out.  

While it's true, as you say, that many pre-relational systems (and
post-relational ones, later, feh) added SQL on their shingles, those
were never coherent implementations.  They were a way to say Yes! when
the question was, "Do you support SQL?"

> Many (most in fact) datastores require that all instances of the same
> "column" in an "entity" be the same type

To the extent that SQL implements relational algebra/calculus, its
utility and consistency *requires* that each column be of a defined
type.  For example, if the query includes, 

where A between 0 and 1

and A is a *not* a numeric value, then the query is reduced from
rigorous first-order predicate logic to gibberish.  

SQLite serves a particular niche very well.  The choice not to enforce
type constraints for declared datatypes biases the system toward
ease-of-insertion.  That has advantages in the SQLite problem domain,
but the trade-off comes at a price: it makes SQLite *harder* to use in a
rigorous way.  
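For completeness, SQLite *can* be made rigorous per-column with a CHECK constraint on typeof(); a small sketch (illustrative only, names are mine):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# NUMERIC affinity coerces where it can; the CHECK rejects what remains text
con.execute("""
    create table s (
        A NUMERIC CHECK (typeof(A) in ('integer', 'real'))
    )""")
con.execute("insert into s values (1)")      # integer: accepted
con.execute("insert into s values ('0.5')")  # coerced to real 0.5: accepted
try:
    con.execute("insert into s values ('it was a good one')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True: the non-numeric string never enters the database
```

With that in place, `where A between 0 and 1` stays within first-order predicate logic.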

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best way to store key,value pairs

2020-01-13 Thread James K. Lowden
On Mon, 13 Jan 2020 10:37:57 -0800
Jens Alfke  wrote:

> > On Jan 12, 2020, at 4:12 PM, James K. Lowden
> >  wrote:
> > 
> > What is the motivation behind this advice?  It's completely
> > unnecessary.  
> 
> Thanks for your opinion, James! I disagree. 
> 
...
> IMHO there are too many of them to do the usual SQL thing
> and add every header of every message to a table ? that multiples the
> number of inserts and deletes by an order of magnitude.

Hi Jens, 

I asked for your rationale; thanks for your answer.  

So, basically, a normalized design requires too much use of INSERT?  

You're making an efficiency argument here, or maybe an
ease-of-implementation assertion. For me, inserting one header row or
20 is the same coding effort (still need a loop).  I think transaction
throughput would be about the same if COMMIT is applied only to whole
messages.  

The OP wanted some basic design advice; he didn't say a simple,
straightforward design was too much work or too slow.  He in fact said, 

> > > local small NNTP server

Given that, ISTM that textbook SQL 101 advice is in order.  JSON should
wait until your assumptions are tested.  

> > For all the complexity JSON adds to the design, it adds exactly no
> > power: precisely the same effects can be achieved without it.  
> 
> Well sure, but you could say the same thing about indexes, couldn't
> you?

No.  Perhaps I should have been more explicit about what "complexity"
I was talking about. I'm saying you've added a user-visible aspect,
JSON, to the logical database design for him to cope with, but in no
way made the database capable of representing something it otherwise
couldn't.  

Indexes are the opposite: invisible affordances that don't affect the
database's logical design.  

> Encoding headers as JSON is pretty simple if you've got a JSON
> encoder handy

Perhaps.  It's still introducing an extraneous technology to the user's
problem domain.  

> > I see no advantage to introducing JSON to a system with no external
> > use for it.  
> 
> Hm, a number of database companies (such as my employer, Couchbase)
> and their customers would disagree with you :)

Please don't take this personally, because I don't intend insult.  For
years I worked on databases before Couchbase existed, and for which
using Couchbase even today would be considered a joke.  

DBMSs are used for all kinds of purposes by people well trained and
not, to good effect and bad.  The number who don't understand the basic
theory of what they're working with far exceeds those that do.  Half of
them are below average, and the average isn't very high.  

I'm sure you'll understand if popular opinion doesn't impress me.  

I get why you would do it your way.  In your estimation, given your
particular skills, you feel it would be easier to use the JSON API.  I
didn't want to leave unchallenged the "six of one, half a dozen of the
other" impression -- that bringing JSON into the mix is just a matter of
taste.  Simpler systems are better, I'm sure you'd agree.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Feature request: more robust handling of invalid UTF-16 data

2020-01-13 Thread Dennis Snell
I’d like to raise this issue again and give my support for what Maks Verver 
recommended in 
https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg110107.html


Independently I came to this bug while working on an issue in Simplenote’s 
Android app where our data was being corrupted when saved to sqlite inside the 
Android SDK. We received some invalid UTF-16 sequences and instead of rejecting 
them or decoding it properly sqlite is further mangling them and introducing 
more corruption.


Example:
We have a JSON document like this which we store in a table.


    {“content”: “\ud83c\udd70\ud83c(null)\udd71”,”tags":[]}


The JSON is well-formed but the sequence of UTF-16 code points is invalid. We 
have fixed our side of the equation which prevents creating this content, but 
we still receive from time to time the invalid sequence from older client 
libraries.


When sqlite reads this data two types of further corruption occur: reading 
beyond a code unit subsequence; and conflating high and low surrogates.


Reading beyond a code unit subsequence:


When the `TERM` was introduced[1] and updated[2] it appears to have been 
designed to assume that a string ends mid-surrogate but it does not attempt to 
address unpaired surrogates in the middle of an input text. In our case the 
`READ_UTF16BE` macro accepts the second `\ud83c` code unit and then consumes 
the following `\u0028` which is the separate and well-formed “(“. In turn this 
produces the more corrupted value of `\ud83c\udc28`, code point U+1F028, plus 
“null)” without the leading “(“.


Conflating high and low surrogates:


The `READ_UTF16__` macros both attempt to start processing surrogate pairs 
based on the `0xD800 <= c <= 0xE000` value of the input code unit. Because of 
this they will pick up on unpaired low surrogates, consume the next character, 
and then create a more corrupted Unicode string.


In our case, once we reach the `\udd71` the macro consumes the following quote, 
which in the JSON document closes the string, and puts them together as 
`\udd71\u0022` producing the invalid code point U+6C422. Moreover, because it 
consumed the string-ending quote it also corrupted the entire JSON document, as 
the new output resembles the following:


    {“content”: “\ud83c\udd70\ud83c\udc28ull)\ud971\udc22,”tags”:[]}


That is, we write this invalid Unicode sequence but valid JSON document into 
sqlite and read back an invalid Unicode sequence _and_ invalid JSON (see the 
missing quote before “tags”).


Supporting Unicode spec:


The Unicode specification[3] sections 3.2 and 3.9 speak to this situation and 
provides a specific comparable example:


    When a process interprets a code unit sequence which purports to be in a 
Unicode
    character encoding form, it shall treat ill-formed code unit sequences as 
an error
    condition and shall not interpret such sequences as characters.


    Furthermore, such a process must not treat any adjacent well-formed code 
unit
    sequences as being part of those ill-formed code unit sequences.


    For example, with the input UTF-8 code unit sequence <C2 41 42>, such a 
UTF-8
    conversion process must not return <U+FFFD> or <U+FFFD, U+FFFD>, because
    either of those outputs would be the result of interpreting a well-formed 
subsequence
    as being part of the ill-formed subsequence. The expected return value for 
such a
    process would instead be <U+FFFD, U+0041, U+0042>.


Supporting Maks’ suggestion to use the replacement character on error, 
section 23.8[4] provides the guidance:


    It [U+FFFD] can be substituted for any “unknown” character in another 
encoding that
    cannot be mapped in terms of known Unicode characters. It can also be used 
as one
    means of indicating a conversion error, when encountering an ill-formed 
sequence in
    a conversion between Unicode encoding forms.


Patching:


The `READ_UTF16__` macros thus should do not only what Maks proposed, which is 
to verify that the character following a surrogate half is also a surrogate 
half, but also to verify that we don’t start interpreting a surrogate sequence 
when encountering an unpaired low surrogate. I propose this change instead:


    #define READ_UTF16LE(zIn, TERM, c){                                   \
        c = (*zIn++);                                                     \
        c += ((*zIn++)<<8);                                               \
        if( c>=0xDC00 && c<0xE000 && TERM ){   /* unpaired low surrogate */ \
            c = 0xFFFD;                                                   \
        } else if( c>=0xD800 && c<0xDC00 && TERM ){  /* high surrogate */ \
            int c2 = (zIn[0] | (zIn[1] << 8));                            \
            if( c2>=0xDC00 && c2<0xE000 ){                                \
                zIn += 2;                                                 \
                c = (c2&0x03FF) + ((c&0x003F)<<10)                        \
                  + (((c&0x03C0)+0x0040)<<10);                            \
            } else {                          /* unpaired high surrogate */ \
                c = 0xFFFD;                                               \
            }                                                             \
        }                                                                 \
    }



This will solve both problem of reading past an ill-formed surrogate sequence 
and of interpreting an ill-formed surrogate sequence. I love sqlite and I 
really appreciated how the code is laid out which made it so easy to find this 
macro in the source and identify the problem.
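For what it's worth, Python's own UTF-16 decoder already follows the recommended substitution behaviour and can serve as a reference point (illustrative sketch; the byte string mirrors the example from this report):

```python
# UTF-16LE bytes: a valid pair U+D83C U+DD70 ("🅰"), then an unpaired
# high surrogate U+D83C immediately followed by a plain '(' (U+0028)
data = b"\x3c\xd8\x70\xdd\x3c\xd8\x28\x00"

# errors="replace" substitutes U+FFFD for the ill-formed subsequence
# and does NOT consume the adjacent well-formed '('
decoded = data.decode("utf-16-le", errors="replace")
print(decoded)  # '🅰\ufffd('
```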


Dennis Snell
Automattic, Inc.


[1]: https://sqlite.org/src/info/19064d7cea


[sqlite] lemon - namespace support

2020-01-13 Thread Mark Olesen
I've fairly recently been using lemon for building several parsers in 
C++ and found what I believe to be a *minimalist* means of avoiding 
symbol clashes without adding bloat, or affecting C code generation.

- New '-e' command line option to define the code extension. By default 
this is 'c', but with this option one can define something like '-ecxx' 
etc. for using a C++ compiler.

- New (optional) '%namespace' directive. This can be used to embed the 
'Parse*' routines into a C++ namespace. If the '%namespace' is not 
defined, there is no change in behaviour.
The namespace can be anonymous or contain multiple nested namespaces. 
For example,
 %namespace {}
 %namespace {ns1::ns2::ns3}

This makes it easy to generate lemon parsers for C++ without any 
potential symbol clashes, and it imposes no C++ interface on the user.
It does not fundamentally change how lemon works.

I hope that the changes are useful enough for broader interest and/or 
sufficiently encapsulated that they could be incorporated into the sources.

Here are references to the patches:

https://develop.openfoam.com/Development/openfoam/blob/develop/wmake/src/lemon.c.patch

https://develop.openfoam.com/Development/openfoam/blob/develop/wmake/etc/lempar.c.patch


Cheers,
/mark
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Keith Medcalf

On Monday, 13 January, 2020 06:36, Dominique Devienne  
wrote:

> Please remind me, is the encoding a "client-side" setting, or also a
> "server-side" (i.e. stored) setting?

"pragma encoding " sets the internal storage format for text encoding in a 
database (on the "inside SQLite" side of the API demarcation).  It is 
persistent (a database attribute) and cannot be changed once the database has 
been created.

The boundary between "internal" and "external" is the SQLite3 API.  Thus you 
send "external" text into SQLite using the sqlite3_bind_text* 
sqlite3_result_text* interfaces.  The call specifies the format you are using 
for the "external" representation.  When the text gets "internal" (to the other 
side of the API) it is stored (and converted if necessary to) the "pragma 
encoding" setting of the database.  You retrieve the "internal" values into 
your external program using sqlite3_column_text* and sqlite3_value_text* 
interfaces, which converts (if necessary) the data from the "internal" encoding 
to the external encoding you requested.

https://sqlite.org/pragma.html#pragma_encoding

The length() function is an external function (as in on the "external" side of 
the demarcation).  It retrieves its arguments using the "sqlite3_value" 
interfaces from the internal storage format into the requested external storage 
format, does its thing, and then returns an external result back into the 
"internal" part of SQLite using the sqlite3_result* interface.

Contrast this with a function such as cast(...) which is strictly internal.  It 
operates entirely internally to SQLite directly on the internal data and does 
not retrieve the "internal data" into an external format then send the 
converted external data back into SQLite.
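The internal/external distinction is easy to see from outside the API; a sketch in Python against a default (UTF-8) database, using an embedded NUL (table name illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table x (x)")
# Python binds the full 9-byte string, embedded NUL included
con.execute("insert into x values (?)", ("text\x00text",))
row = con.execute(
    "select typeof(x), length(x), length(cast(x as blob)) from x"
).fetchone()
print(row)  # ('text', 4, 9): length() on text stops at the NUL,
            # while the blob cast counts all 9 stored UTF-8 bytes
```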

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Capturing the changes in columns in a table

2020-01-13 Thread Jose Isaias Cabrera

Greetings!

Please observe the following,


create table t (n INTEGER PRIMARY KEY, a, b, c, d, e, idate);

insert into t (a, b, c, d, e, idate) values ('p001', 1, 2, 'n', 4, 
'2019-02-11');

insert into t (a, b, c, d, e, idate) values ('p002', 2, 2, 'n', 4, 
'2019-02-11');

insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'n', 4, 
'2019-02-11');

insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-11');

insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-11');

insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-12');

insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-12');

insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-12');

insert into t (a, b, c, d, e, idate) values ('p004', 4, 2, 'y', 4, 
'2019-02-12');

insert into t (a, b, c, d, e, idate) values ('p005', 5, 2, 'y', 4, 
'2019-02-12');

insert into t (a, b, c, d, e, idate) values ('p001', 10, 2, 'n', 4, 
'2019-02-13');

insert into t (a, b, c, d, e, idate) values ('p002', 2, 4, 'n', 4, 
'2019-02-13');

insert into t (a, b, c, d, e, idate) values ('p003', 3, 2, 'y', 4, 
'2019-02-13');

insert into t (a, b, c, d, e, idate) values ('p004', 5, 3, 'n', 4, 
'2019-02-13');

insert into t (a, b, c, d, e, idate) values ('p005', 5, 3, 'y', 8, 
'2019-02-13');


The SQL below provides info only for two dates (2019-02-11 and 2019-02-12):


select new.a,old.b,new.b, 'column b changed on ' || new.idate as info from t as 
new

  LEFT JOIN t as old ON

new.idate = '2019-02-12'

AND old.idate = '2019-02-11'

AND new.a = old.a

WHERE

  new.b != old.b

UNION ALL

select new.a,old.c,new.c, 'column c changed on ' || new.idate as info from t as 
new

  LEFT JOIN t as old ON

new.idate = '2019-02-12'

AND old.idate = '2019-02-11'

AND new.a = old.a

WHERE

  new.c != old.c

UNION ALL

select new.a,old.d,new.d, 'column d changed on ' || new.idate from t as new

  LEFT JOIN t as old ON

new.idate = '2019-02-12'

AND old.idate = '2019-02-11'

AND new.a = old.a

WHERE

  new.d != old.d

UNION ALL

select new.a,old.e,new.e, 'column e changed on ' || new.idate from t as new

  LEFT JOIN t as old ON

new.idate = '2019-02-12'

AND old.idate = '2019-02-11'

AND new.a = old.a

WHERE

  new.e != old.e

ORDER by new.a;


p001|1|10|column b changed on 2019-02-12
p002|2|4|column c changed on 2019-02-12
p003|n|y|column d changed on 2019-02-12
sqlite>

What I would like is to cover all of the dates in the same command, so that the 
output is this,

p001|1|10|column b changed on 2019-02-12
p002|2|4|column c changed on 2019-02-12
p003|n|y|column d changed on 2019-02-12
p004|4|5|column b changed on 2019-02-13
p004|2|3|column c changed on 2019-02-13
p004|y|n|column d changed on 2019-02-13
p005|2|3|column c changed on 2019-02-13
p005|4|8|column e changed on 2019-02-13

Yes, I know I can get all the dates and build the SQL as I did above, but is 
there a way to do this in one call?  Thanks for the help.

josé
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] System.Data.SQLite Verson 111

2020-01-13 Thread Urs Wagner

Hello
 
Can I download System.Data.SQLite Version 111 (the older version) somewhere?
I need to generate an edmx of an existing database.
 
Thanks Urs
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best way to store key,value pairs

2020-01-13 Thread Petite Abeille


> On Jan 13, 2020, at 19:37, Jens Alfke  wrote:
> 
>> What is the motivation behind this advice?  It's completely unnecessary.  
> 
> Thanks for your opinion, James! I disagree. 

Arnt Gulbrandsen, of Archiveopteryx fame, would disagree with you, Jens :)

https://archiveopteryx.org/schema

In any case, this is not a matter of opinion, but rather one of purpose and 
benchmarks.

On that note, Happy New Year! Welcome to the future.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread David Raymond
Well, I believe this is the relevant bit from the docs for binding:
https://www.sqlite.org/c3ref/bind_blob.html

"If a non-negative fourth parameter is provided to sqlite3_bind_text() or 
sqlite3_bind_text16() or sqlite3_bind_text64() then that parameter must be the 
byte offset where the NUL terminator would occur assuming the string were NUL 
terminated. If any NUL characters occur at byte offsets less than the value of 
the fourth parameter then the resulting string value will contain embedded 
NULs. The result of expressions involving strings with embedded NULs is 
undefined. "


-Original Message-
From: sqlite-users  On Behalf Of 
Barry Smith
Sent: Monday, January 13, 2020 1:54 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Unexplained table bloat

On the original topic...

How does one end up with a database in this state? I.e with a binary value that 
contains 0x00 bytes followed by other bytes but a type of TEXT?

If the definition of a text string in SQLite is that it ends at the first 0x00 
byte, then it seems that anything stored as a text string should adhere to that.

So a database with a TEXT value that contains characters after the first 0x00 
should be considered corrupt. Given that to retrieve the actual contents of the 
cell it must be cast to BLOB, why not force the storage of any string that 
contains 0x00 as a BLOB in the first place?

What am I missing here?

On 13 Jan 2020, at 6:02 am, Simon Slavin  wrote:
> 
> On 13 Jan 2020, at 9:26am, Dominique Devienne  wrote:
> 
>> Which implies length(text_val) is O(N), while
>> length(blob_val) is O(1),
>> something I never quite realized.
> 
> For this reason, and others discussed downthread, some languages which store 
> Unicode strings store the number of graphemes as well as its contents.  So 
> functions which care about the … let's call it "width" … just retrieve that 
> number rather than having to parse the string to figure out the length.
> 
> In a Unicode string 'length' can mean
> 
> 1) octet count (number of 8-bit bytes used to store the string)
> 2) number of code points (basic unicode unit)
> 3) number of code units (how code points get arranged in UTF8, UTF16, etc., 
> not as simple as it looks)
> 4) length in graphemes (space-using units)
> 5) length in glyphs (font-rendering units)
> 
> and probably others I've forgotten.  Not to mention that I simplified the 
> definitions of the above and may have got them wrong.
> 
> An application centred around rendering text (e.g. vector graphics drawing 
> apps) might have each piece of text stored with all five of those numbers, 
> just to save it from having to constantly recalculate them.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Barry Smith
On the original topic...

How does one end up with a database in this state? I.e with a binary value that 
contains 0x00 bytes followed by other bytes but a type of TEXT?

If the definition of a text string in SQLite is that it ends at the first 0x00 
byte, then it seems that anything stored as a text string should adhere to that.

So a database with a TEXT value that contains characters after the first 0x00 
should be considered corrupt. Given that to retrieve the actual contents of the 
cell it must be cast to BLOB, why not force the storage of any string that 
contains 0x00 as a BLOB in the first place?

What am I missing here?

On 13 Jan 2020, at 6:02 am, Simon Slavin  wrote:
> 
> On 13 Jan 2020, at 9:26am, Dominique Devienne  wrote:
> 
>> Which implies length(text_val) is O(N), while
>> length(blob_val) is O(1),
>> something I never quite realized.
> 
> For this reason, and others discussed downthread, some languages which store 
> Unicode strings store the number of graphemes as well as its contents.  So 
> functions which care about the … let's call it "width" … just retrieve that 
> number rather than having to parse the string to figure out the length.
> 
> In a Unicode string 'length' can mean
> 
> 1) octet count (number of 8-bit bytes used to store the string)
> 2) number of code points (basic unicode unit)
> 3) number of code units (how code points get arranged in UTF8, UTF16, etc., 
> not as simple as it looks)
> 4) length in graphemes (space-using units)
> 5) length in glyphs (font-rendering units)
> 
> and probably others I've forgotten.  Not to mention that I simplified the 
> definitions of the above and may have got them wrong.
> 
> An application centred around rendering text (e.g. vector graphics drawing 
> apps) might have each piece of text stored with all five of those numbers, 
> just to save it from having to constantly recalculate them.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best way to store key,value pairs

2020-01-13 Thread Jens Alfke


> On Jan 12, 2020, at 4:12 PM, James K. Lowden  wrote:
> 
> What is the motivation behind this advice?  It's completely unnecessary.  

Thanks for your opinion, James! I disagree. 

RFC822 headers are schemaless, and in a Usenet or email database they have 
rather high volume (probably 20+ per message) but go mostly unused. An NNTP 
server can't simply throw away the headers it doesn't need, but IMHO there are 
too many of them to do the usual SQL thing and add every header of every 
message to a table — that multiplies the number of inserts and deletes by an 
order of magnitude.

> For all the complexity JSON adds to the design, it adds exactly no power: 
> precisely the same effects can be achieved without it.  

Well sure, but you could say the same thing about indexes, couldn't you?

I disagree about complexity. Encoding headers as JSON is pretty simple if 
you've got a JSON encoder handy, simpler than executing a SQL INSERT. And the 
JSON API is very easy to use.
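As a concrete sketch of how little code this takes (illustrative; assumes an SQLite build with the JSON1 functions, which ship in all recent releases — table and header names are mine):

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table msg (id INTEGER PRIMARY KEY, headers TEXT)")

# all headers for one message land in a single column, one INSERT
headers = {"Subject": "hello", "X-Spam-Score": "0.1"}
con.execute("insert into msg (headers) values (?)", (json.dumps(headers),))

# individual headers stay queryable via json_extract
subj = con.execute(
    "select json_extract(headers, '$.Subject') from msg"
).fetchone()[0]
print(subj)  # hello
```

An expression index on `json_extract(headers, '$.Subject')` would make such lookups fast without normalizing the headers out into their own table.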

> I see no advantage to introducing JSON to a system with no external use for 
> it.  

Hm, a number of database companies (such as my employer, Couchbase) and their 
customers would disagree with you :)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Bug fixes only branch.

2020-01-13 Thread Syed Ahmad
We are at 3.14.2

Current version = 3.14.2 Date : 2016-09-12

https://www.sqlite.org/changes.html

how can i take latest stable branch which include only bug fixes . no new
features.

Is there any way?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] More job postings mention SQL than anything else

2020-01-13 Thread Simon Slavin
If you want to justify the hours you spend fiddling with SQLite you should feel 
a little happier today.



Graph from Burning Glass Analytics comparing demand for various buzzwords in 
job postings.  SQL is mentioned >10% more than any other computer language or 
technology.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Simon Slavin
On 13 Jan 2020, at 9:26am, Dominique Devienne  wrote:

> Which implies length(text_val) is O(N), while
> length(blob_val) is O(1),
> something I never quite realized.

For this reason, and others discussed downthread, some languages which store 
Unicode strings store the number of graphemes as well as its contents.  So 
functions which care about the … let's call it "width" … just retrieve that 
number rather than having to parse the string to figure out the length.

In a Unicode string 'length' can mean

1) octet count (number of 8-bit bytes used to store the string)
2) number of code points (basic unicode unit)
3) number of code units (how code points get arranged in UTF8, UTF16, etc., not 
as simple as it looks)
4) length in graphemes (space-using units)
5) length in glyphs (font-rendering units)

and probably others I've forgotten.  Not to mention that I simplified the 
definitions of the above and may have got them wrong.

An application centred around rendering text (e.g. vector graphics drawing 
apps) might have each piece of text stored with all five of those numbers, just 
to save it from having to constantly recalculate them.
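The first three of those notions are easy to demonstrate (a Python sketch; grapheme and glyph counts need a Unicode segmentation library and a font renderer respectively, so they are not shown):

```python
s = "e\u0301"  # 'e' plus a combining acute accent: one grapheme, two code points

code_points = len(s)                       # Python strings count code points
utf8_octets = len(s.encode("utf-8"))       # octet count in UTF-8
utf16_units = len(s.encode("utf-16-le")) // 2  # UTF-16 code units (2 bytes each)

print(code_points, utf8_octets, utf16_units)  # 2 3 2
```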
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Net provider not found

2020-01-13 Thread Urs Wagner


Hello
 
With the new version (112) I get this error. With 111 it works fine.
 
Der angeforderte .Net Framework-Datenprovider kann nicht gefunden werden. Er 
ist ggf. nicht installiert.
(In English: "The requested .NET Framework data provider cannot be found. It 
may not be installed.")
 
What is wrong? Until now it worked.
 
my packages.config is

(XML content stripped by the mailing-list software)


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Dominique Devienne
On Mon, Jan 13, 2020 at 1:10 PM Keith Medcalf  wrote:
> If the register object contains "text" and you cast it to a blob (remove the 
> text affinity) you are left with just the bag-o-bytes, and length() will 
> return the size of the bag encoded in the register.  If the data in the 
> register is other than type "text" then it must be converted to text first 
> (in the database encoding) and then the cast will remove the text affinity, 
> after which the value returned by the length() function will be the number of 
> bytes in the bag that holds that text representation:
>
> sqlite> pragma encoding='utf-16';
> sqlite> create table x(x);
> sqlite> insert into x values ('text' || char(0) || 'text');
> sqlite> select x, typeof(x), length(x), length(cast(x as blob)) from x;
> text|text|4|18

Please remind me, is the encoding a "client-side" setting, or also a
"server-side" (i.e. stored) setting?

I wasn't sure whether pragma encoding='utf-16' affected the stored
state as well, or whether it was always in UTF-8
and SQLite was doing conversion on the fly for the client requested
encoding. I thought of lengthof() as the size stored
in the value header itself, which I assumed was always in bytes.
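As an aside, PRAGMA encoding is a stored, whole-database setting: it can only be chosen before the database is created, after which all text is stored in that encoding, and SQLite converts on the fly for whichever API the client uses. A quick sketch through Python's sqlite3 module (illustration only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The encoding must be set before any content is written; afterwards it
# is fixed for the life of the database file.
con.execute("PRAGMA encoding = 'UTF-16'")
con.execute("CREATE TABLE x(x)")
con.execute("INSERT INTO x VALUES ('text')")

(enc,) = con.execute("PRAGMA encoding").fetchone()
(blob_len,) = con.execute("SELECT length(CAST(x AS BLOB)) FROM x").fetchone()
print(enc, blob_len)  # UTF-16le 8  (four characters, two bytes each when stored)
```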


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Richard Damon

On 1/13/20 5:24 AM, Dominique Devienne wrote:

> On Mon, Jan 13, 2020 at 11:07 AM Keith Medcalf  wrote:
>
>> On Monday, 13 January, 2020 02:27, Dominique Devienne wrote:
>>
>>> I'd vote for a lengthof(col) that's always O(1) for both text and blob
>>
>> So what should lengthof(something) return: the number of bytes in the
>> 'database encoding', or something else?
>
> Bytes of course. Of the data stored, i.e. excluding the header byte
> and encoded size (if any) from the file-format.
> Basically the same as length() *except* for text values, resulting in
> O(1) behavior. --DD
>
> PS: I keep forgetting length(text_val) returns the number of
> code-points in fact :)
> PPS: Surrogate pairs count as one or two code points? That's just
> bait, I don't really want to know :)))
Re the PPS: UTF-8 isn't allowed to contain surrogate pairs. Non-BMP 
characters, which would use surrogate pairs in UTF-16, are supposed to be 
encoded into UTF-8 directly from their fundamental 21-bit value. If the code 
doesn't validate the data well enough to catch that issue, then I suspect the 
character counting would count each half of a surrogate pair as a code point.


--
Richard Damon
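A short Python sketch of the surrogate-pair point (illustration only): a non-BMP character is a single code point, encodes straight from its 21-bit scalar value in UTF-8, but needs a surrogate pair (two code units) in UTF-16, and a strict encoder rejects a lone surrogate outright:

```python
s = "\U0001F600"  # GRINNING FACE, a non-BMP character

assert len(s) == 1                            # one code point
assert len(s.encode("utf-8")) == 4            # UTF-8 encodes the 21-bit value directly
assert len(s.encode("utf-16-le")) // 2 == 2   # UTF-16 needs a surrogate pair

# Encoding half of a surrogate pair on its own is invalid UTF-8:
try:
    "\ud83d".encode("utf-8")
except UnicodeEncodeError:
    print("lone surrogate rejected")
```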



Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Keith Medcalf

If the register object contains "text" and you cast it to a blob (remove the 
text affinity) you are left with just the bag-o-bytes, and length() will return 
the size of the bag encoded in the register.  If the data in the register is 
other than type "text" then it must be converted to text first (in the database 
encoding) and then the cast will remove the text affinity, after which the 
value returned by the length() function will be the number of bytes in the bag 
that holds that text representation:

SQLite version 3.31.0 2020-01-12 23:30:01
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> pragma encoding;
UTF-8
sqlite> create table x(x);
sqlite> insert into x values ('text' || char(0) || 'text');
sqlite> insert into x values (3.14159);
sqlite> select x, typeof(x), length(x), length(cast(x as blob)) from x;
text|text|4|9
3.14159|real|7|7
sqlite> .q

SQLite version 3.31.0 2020-01-12 23:30:01
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> pragma encoding='utf-16';
sqlite> pragma encoding;
UTF-16le
sqlite> create table x(x);
sqlite> insert into x values ('text' || char(0) || 'text');
sqlite> insert into x values (3.14159);
sqlite> select x, typeof(x), length(x), length(cast(x as blob)) from x;
text|text|4|18
3.14159|real|7|14
sqlite> .q

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.





Re: [sqlite] Unexplained table bloat

2020-01-13 Thread R Smith

On 2020/01/13 12:24 PM, Dominique Devienne wrote:
> Bytes of course. Of the data stored, i.e. excluding the header byte

I checked, and I was apparently correct about the casting. The following 
extract is from a DB I got from a forum member, containing Greek-to-Danish 
translations, to which I added length checks:


  -- SQLite version 3.30.1  [ Release: 2019-10-10 ]  on SQLitespeed version 2.1.3.11.

SELECT Glose_ID, Græsk, Dansk, length(Dansk) AS len, length(CAST(Dansk AS BLOB)) AS bytelen
  FROM Gloser
 WHERE 1 LIMIT 20;

  -- Glose_ID | Græsk              | Dansk                                                        | len | bytelen
  -- -------- | ------------------ | ------------------------------------------------------------ | --- | -------
  --        1 | απόδειξη η         | 1. bevis  2. kvittering                                      |  23 |      23
  --        2 | δεικτική αντωνυμία | påpegende stedord, demonstrativ pronomen                     |  40 |      41
  --        3 | δεικτικός-ή-ό      | påpegende                                                    |   9 |      10
  --        4 | καθόλου            | slet (ikke), overhovedet (ikke)                              |  31 |      31
  --        5 | κώδικας ο          | 1. kode 2. lov                                               |  14 |      14
  --        6 | πλην               | (mat) minus; (præp + G) undtagen                             |  32 |      33
  --        7 | προσδιορίζω        | præcisere, fastsætte, bestemme (nøjagtigt)                   |  42 |      45
  --        8 | προσδιορισμός ο    | fastsættelse, (nøjagtig) bestemmelse                         |  36 |      38
  --        9 | προσδιορισής ο     | ?? (- RH) foremntlig samme betydning som προσδιορισμός       |  54 |      67
  --       10 | φακός ο            | 1. linse  2. glas, brilleglas  3. lygte, lommelygte          |  51 |      51
  --       11 | βρώμικος-η-ο       | snavset, beskidt; (fig) lyssky, skummel                      |  39 |      39
  --       12 | δε μου λες...      | sig mig engang… (egl “du fortæller mig ikke”)                |  45 |      52
  --       13 | δημοσιογράφος ο/η  | journalist                                                   |  10 |      10
  --       14 | κέρμα το           | mønt, småmønt                                                |  13 |      16
  --       15 | κοπέλα η           | pige                                                         |   4 |       4
  --       16 | μαθήτρια η         | elev, skoleelev (kvindelig)                                  |  27 |      27
  --       17 | μαθητής ο          | elev, skoleelev (mandlig)                                    |  25 |      25
  --       18 | μπουκάλι το        | flaske                                                       |   6 |       6
  --       20 | ποτήρι το          | glas (Både glas og flaske er intetkøn som vin. Øl er hunkøn) |  60 |      64
  --       21 | ίσιος-α-ο          | lige                                                         |   4 |       4



Hope that format is not too messed up by my e-mailer.




Re: [sqlite] Unexplained table bloat

2020-01-13 Thread R Smith


On 2020/01/13 12:24 PM, Dominique Devienne wrote:
> On Mon, Jan 13, 2020 at 11:07 AM Keith Medcalf  wrote:
>
>> On Monday, 13 January, 2020 02:27, Dominique Devienne wrote:
>>
>>> I'd vote for a lengthof(col) that's always O(1) for both text and blob
>>
>> So what should lengthof(something) return: the number of bytes in the
>> 'database encoding', or something else?
>
> Bytes of course. Of the data stored, i.e. excluding the header byte
> and encoded size (if any) from the file-format.
> Basically the same as length() *except* for text values, resulting in
> O(1) behavior. --DD



Please have mercy - my beautiful text-extracty queries will no longer work!

I'm OK with a second, other function, perhaps called bytelength(), that 
returns what you suggest, but length() is mostly used in text extraction and 
parsing along with other functions like instr() and substr() etc., which all 
take character indexes based on positions of code-points and not of actual bytes.


Back in the day, when all text was 8-bit ASCII/Latin English characters, this 
was easy, but now it ain't.


That said, I thought that if you cast a string to BLOB and then query the 
length, like SELECT length(CAST(firstname AS BLOB)) AS bytelen, you 
essentially get what you asked for, but I've never used it, so am not sure.







Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Dominique Devienne
On Mon, Jan 13, 2020 at 11:07 AM Keith Medcalf  wrote:
> On Monday, 13 January, 2020 02:27, Dominique Devienne  
> wrote:
> >> I'd vote for a lengthof(col) that's always O(1) for both text and blob
>
> So what should lengthof(something) return the number of bytes in the 
> 'database encoding' or something else?

Bytes of course. Of the data stored, i.e. excluding the header byte
and encoded size (if any) from the file-format.
Basically the same as length() *except* for text values, resulting in
O(1) behavior. --DD

PS: I keep forgetting length(text_val) returns the number of
code-points in fact :)
PPS: Surrogate pairs count as one or two code points? That's just
bait, I don't really want to know :)))


Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Keith Medcalf

On Monday, 13 January, 2020 02:27, Dominique Devienne  
wrote:

>On Fri, Jan 10, 2020 at 7:03 PM Richard Hipp  wrote:

>> On 1/10/20, Dominique Devienne  wrote:

>>> There's no way at all, to know the length of a text column with
>>> embedded NULLs?

>> You can find the true length of a string in bytes from C-code using
>> the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
>> of a way to do that from SQL.

> That's what I thought. Which implies length(text_val) is O(N), while
> length(blob_val) is O(1), something I never quite realized. And this 
> despite both storing the length the same way at the file-format level. 
> That's kind of a gotcha, which might be worth documenting.

No, they are measured in different units.  

length(something) where something is NOT a text value returns the length in 
BYTES of that something.  For a BLOB this is the number of bytes according to 
the length data stored in the database.  For NULL, INTEGER, or REAL values, it 
is the number of BYTES in the ASCII text representation of that thing (which 
also happens to be the number of characters since each ASCII character takes 
one byte).

length(something) where something IS a text value returns the number of 
codepoints in the UTF-8 representation of that text and counting stops at the 
zero terminator (since the definition of "text" is a C string, all text values 
have an appended 0 terminator -- if there is an embedded 0 character, that 
terminates the counting).
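Both rules are easy to check from any binding; for instance through Python's sqlite3 module (illustration only, default UTF-8 database encoding):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # default database encoding: UTF-8

row = con.execute(
    "SELECT length('text' || char(0) || 'text'), "          # text: counting stops at NUL
    "length(CAST('text' || char(0) || 'text' AS BLOB)), "   # blob: all bytes counted
    "length(3.14159), "                                     # REAL: bytes of its text form
    "typeof(3.14159)"
).fetchone()

print(row)  # (4, 9, 7, 'real')
```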

> I'd vote for a lengthof(col) that's always O(1) for both text and blob
> values, although I'm not holding my breath for it, as I suspect it's 
> unlikely to be added, given its relative low value.

So what should lengthof(something) return: the number of bytes in the 'database 
encoding', or something else?

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.





Re: [sqlite] Unexplained table bloat

2020-01-13 Thread Dominique Devienne
On Fri, Jan 10, 2020 at 7:03 PM Richard Hipp  wrote:

> On 1/10/20, Dominique Devienne  wrote:
> > There's no way at all, to know the length of a text column with embedded 
> > NULLs?
>
> You can find the true length of a string in bytes from C-code using
> the sqlite3_column_bytes() interface.  But I cannot, off-hand, think
> of a way to do that from SQL.

That's what I thought. Which implies length(text_val) is O(N), while 
length(blob_val) is O(1), something I never quite realized. And this despite 
both storing the length the same way at the file-format level. That's kind of 
a gotcha, which might be worth documenting.

I'd vote for a lengthof(col) that's always O(1) for both text and blob 
values, although I'm not holding my breath for it, as I suspect it's unlikely 
to be added, given its relatively low value.

I don't disagree with Keith that text shouldn't contain embedded NULLs, and 
that C strings are by design and definition that way. I was thinking more of 
the C++ use case of having a std::string with embedded NULLs, which is 
perfectly OK and common enough, and using a C++ wrapper for SQLite (which 
typically uses overloading for binding, for example), which will insert a 
text value for that case using .c_str() + .length() (or .data() + .size(), 
doesn't matter, ends up the same), leading to the very issue that started 
this thread (just a guess).

The inability to correctly size a value in SQL (and thus a column, via a 
sum() group by) for text with embedded NULLs is quite unfortunate. And the 
fact that length(text_col) is also O(N) is similarly unfortunate. Thus the 
above idea of an O(1) lengthof(col), as a companion to typeof(col).

Thanks, --DD
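That guessed failure mode is reproducible from any binding that passes an explicit byte length when binding text. A sketch using Python's sqlite3 module, which binds strings the same pointer-plus-length way (illustration only):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(v TEXT)")

# Bind a string with an embedded NUL, as a C++ wrapper calling
# sqlite3_bind_text() with .data() + .size() on a std::string would.
con.execute("INSERT INTO t VALUES (?)", ("text\x00text",))

sql_len, byte_len = con.execute(
    "SELECT length(v), length(CAST(v AS BLOB)) FROM t"
).fetchone()
round_trip = con.execute("SELECT v FROM t").fetchone()[0]

# SQL's length() stops at the NUL, but the full value is stored and
# comes back intact (via sqlite3_column_bytes() under the hood).
print(sql_len, byte_len, len(round_trip))  # 4 9 9
```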