Re: [sqlite] Row values with IN

2019-04-23 Thread Luuk


On 23-4-2019 18:08, Simon Slavin wrote:

On 23 Apr 2019, at 4:14pm, Charles Leifer  wrote:


SELECT * FROM info WHERE (year,month,day) IN ((2019, 1, 1), (2019, 2, 1));

Perhaps the row-value doc could clarify the behavior of IN with row values?

This


((2019, 1, 1), (2019, 2, 1))


This works:

SELECT * FROM info WHERE (year,month,day) = (2019, 1, 1) OR (year,month,day)
= (2019, 2, 1);

but it's not using IN, and more characters to type


is not a list of scalers, it's a list of lists of scalers.  So the 
documentation does explain it.  However, I think that the documentation doesn't 
sufficiently explain the problem.  Or maybe the use of IN or lists deserves its 
own page rather than being buried in the extremely long page on expressions.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me fix the SQLite Git mirror

2019-04-23 Thread Jens Alfke


> On Apr 23, 2019, at 9:17 AM, Wout Mertens  wrote:
> 
> Note that many git clients (https://git-scm.com/downloads/guis/ 
> ) do allow
> you to see the entire DAG, just not GitHub. If you use a Mac, a
> particularly great one is https://gitup.co 

My preference is Fork > (Mac/Win, 
free), which has a slightly more traditional DAG view, i.e. a table view with 
the graph displayed in the left column. (GitUp has some great features, but 
most of the time I want to look at more than one commit message at a time.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me fix the SQLite Git mirror

2019-04-23 Thread Wout Mertens
On Tue, Apr 23, 2019 at 1:22 PM Richard Hipp  wrote:

> The inability to see the entire DAG on a single screen in GitHub is a
> persistent source of annoyance
> to users like me who are accustomed to Fossil.
>

Note that many git clients (https://git-scm.com/downloads/guis/) do allow
you to see the entire DAG, just not GitHub. If you use a Mac, a
particularly great one is https://gitup.co

Wout.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best way to ALTER COLUMN ADD multiple tables if the columns don't already exist

2019-04-23 Thread Jens Alfke


> On Apr 22, 2019, at 6:37 PM, Tommy Lane  wrote:
> 
>   How  would I go about updating an existing table to also have a
> parent integer and child integer column.

Use `pragma user_version` to get/set a schema version number in your database. 
It defaults to zero.
Associate each schema change with a user_version value. So call this one `1`.

When you open a db, check its user_version against the version# of each 
supported schema, in order. If it’s one less, apply that schema change, 
increment the user_version, and check the next one. That way all applicable 
schema updates will be applied in order.

In the case of this specific change, you’d use an ALTER TABLE ADD COLUMN 
statement to add the `parent` and `child` columns. All rows will now have the 
default values of those columns; if that’s not appropriate, you’ll need to run 
some UPDATE statements to initialize them to the proper values.

(Nitpick: You don’t need both `parent` and `child`; they’re redundant. In a 
properly normalized schema, you’d only have one; you can go the other direction 
by querying. My quick 2¢ opinion is to keep `child`. That way you can find 
current versions of entries with `child is null`.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Simon Slavin
On 23 Apr 2019, at 4:14pm, Charles Leifer  wrote:

> SELECT * FROM info WHERE (year,month,day) IN ((2019, 1, 1), (2019, 2, 1));
> 
> Perhaps the row-value doc could clarify the behavior of IN with row values?

This

> ((2019, 1, 1), (2019, 2, 1))

is not a list of scalers, it's a list of lists of scalers.  So the 
documentation does explain it.  However, I think that the documentation doesn't 
sufficiently explain the problem.  Or maybe the use of IN or lists deserves its 
own page rather than being buried in the extremely long page on expressions.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE_MAX_MMAP_SIZE 2GB default

2019-04-23 Thread Carl Edquist

Thanks Jens!

On Mon, 22 Apr 2019, Jens Alfke wrote:

But yeah, I agree with you that it seems odd to have a compiled-in 
restriction on the maximum memory-map size.


I looked a bit into the history, and it appears the just-under-2GB limit 
was specifically put there (by drh) so that the value would fit into a 
signed 32-bit integer:


https://www.sqlite.org/src/info/460752b857532016

Maybe this was in order to fit into a ssize_t on 32-bit platforms?  (Just 
a guess as the size_t type is used for mmap.)


If that's the reason, would drh & the sqlite team consider changing the 
default SQLITE_MAX_MMAP_SIZE definition from 0x7fff to SSIZE_MAX 
(defined in limits.h) ?



...


Somewhat of an aside:

Most current OSs have a universal buffer cache, wherein filesystem 
caches and VM pages use the same RAM buffers. A page-fault and a file 
read will incur similar amounts of work. The big benefit is that the 
memory-mapped pages can be evicted from RAM when needed for other stuff, 
whereas a malloc-ed page cache is considered dirty and has to be swapped 
out before the RAM page can be reused.


Right, i think we're agreeing here.  I just mean that if a db file is much 
larger than the available ram, and all of it is used for a query, then 
whether bytes come in via mmap+memcpy() or regular file read(), the kernel 
will page-in missing pages into its page cache, and (at its discretion) it 
will eventually have to evict those pages to make room for others.  (I 
didn't mean to make reference to disk swap or sqlite's user-space 
pagecache.)  The point is that the kernel page loading/evicting part is 
the same, but the mmap also saves the overhead for the seek/read system 
calls.


And i say this from my personal experience with sqlite -- when doing a 
long running join on a db that does not fit into ram, with the timer on, i 
observed that in the regular non-mmap mode, the system time for the query 
would be about equal to the user time.  But when i mmap'ed the whole file, 
the system time for the query was < 1% of the user time, and the user time 
was also less than it had been in non-mmap mode.  I can only guess the 
huge difference in the system time was mainly the kernel handling all the 
seek/read system calls, vs just memcpy when the whole db file is mmap'ed.


The bottom line in any case is i saw a substantial speedup for long 
queries when mmap'ing a large db, even when it did not fit into ram.



Thanks..!

Carl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Charles Leifer
Thanks, makes sense. I think what confused me is that in the doc for row
values, it states:

Two row values of the same size can be compared using operators <, <=, >,
>=, =, <>, IS, IS NOT, IN, NOT IN, BETWEEN, or CASE.

The row value doc example shows how to use BETWEEN, for instance:

SELECT * FROM info WHERE (year,month,day) BETWEEN (2015,9,12) AND
(2016,9,12);

Based on the above doc, I would have also thought it would be possible to
write something like:

SELECT * FROM info WHERE (year,month,day) IN ((2019, 1, 1), (2019, 2, 1));

Perhaps the row-value doc could clarify the behavior of IN with row values?

SELECT * FROM info WHERE (year,month,day) IN (VALUES (2019, 1, 1), (2019,
2, 1));

On Tue, Apr 23, 2019 at 10:03 AM Simon Slavin  wrote:

> On 23 Apr 2019, at 3:52pm, Charles Leifer  wrote:
>
> > My question, though, is why is the VALUES bit needed for the WHERE
> clause in this case?
>
> 
>
> " If the right operand of an IN or NOT IN operator is a list of values,
> each of those values must be scalars and the left expression must also be a
> scalar. "
>
> So to use lists, you can't use pairs of values, you must use scalars.
> Further up the same paragraph
>
> " When the right operand of an IN or NOT IN operator is a subquery, the
> subquery must have the same number of columns as there are columns in the
> row value of the left operand. The subquery on the right of an IN or NOT IN
> operator must be a scalar subquery if the left expression is not a row
> value expression. "
>
> So what seems to be happening is that your use of VALUES is turning the
> comparator into a subquery rather than a list of values.  And SQLite is
> testing to see whether your search term satisfies the subquery.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Simon Slavin
On 23 Apr 2019, at 3:52pm, Charles Leifer  wrote:

> My question, though, is why is the VALUES bit needed for the WHERE clause in 
> this case?



" If the right operand of an IN or NOT IN operator is a list of values, each of 
those values must be scalars and the left expression must also be a scalar. "

So to use lists, you can't use pairs of values, you must use scalars.  Further 
up the same paragraph

" When the right operand of an IN or NOT IN operator is a subquery, the 
subquery must have the same number of columns as there are columns in the row 
value of the left operand. The subquery on the right of an IN or NOT IN 
operator must be a scalar subquery if the left expression is not a row value 
expression. "

So what seems to be happening is that your use of VALUES is turning the 
comparator into a subquery rather than a list of values.  And SQLite is testing 
to see whether your search term satisfies the subquery.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Charles Leifer
This is a question about SQL language support and the handling of
row-values, and I thought my question was quite clear. I appreciate your
offers to help, but they do not have anything to do with the question I am
asking.

As I wrote:

My question, though, is why is the VALUES bit needed for the WHERE clause
in this case? Reading the docs on row values, it seemed to me that
comparing two row values should work fine:

Two row values of the same size can be compared using operators <, <=, >,
>=, =, <>, IS, IS NOT, IN, NOT IN, BETWEEN, or CASE.

Is there a reason I need to specify VALUES instead of simply passing in a
set of row tuples? It seems a little bit inconsistent to me. Thanks!

On Tue, Apr 23, 2019 at 9:49 AM Simon Slavin  wrote:

> On 23 Apr 2019, at 3:42pm, Charles Leifer  wrote:
>
> > Thanks Simon -- this is just a minimal example. The WHERE clause is
> needed because, unless I would provide a CASE WHEN for every single (key,
> value), then the UPDATE would set the "extra" value of any unmatched key,
> value to NULL.
>
> You can use ELSE:
>
> UPDATE "bu" SET "extra" = CASE "key"||"value"
> WHEN 'k1'||1 THEN 100
> WHEN 'k2'||2 THEN -200
> WHEN 'k3'||3 THEN 30
> ELSE "extra"
> END;
>
> But I still think that using multiple UPDATE statements will still be far
> faster.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Simon Slavin
On 23 Apr 2019, at 3:42pm, Charles Leifer  wrote:

> Thanks Simon -- this is just a minimal example. The WHERE clause is needed 
> because, unless I would provide a CASE WHEN for every single (key, value), 
> then the UPDATE would set the "extra" value of any unmatched key, value to 
> NULL.

You can use ELSE:

UPDATE "bu" SET "extra" = CASE "key"||"value"
WHEN 'k1'||1 THEN 100
WHEN 'k2'||2 THEN -200
WHEN 'k3'||3 THEN 30
ELSE "extra"
END;

But I still think that using multiple UPDATE statements will still be far 
faster.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Charles Leifer
Thanks Simon -- this is just a minimal example. The WHERE clause is needed
because, unless I would provide a CASE WHEN for every single (key, value),
then the UPDATE would set the "extra" value of any unmatched key, value to
NULL.

Please remember this is just an example. I'd like to generalize this kind
of approach. It works fine for scalar primary keys, but is having issues
when the key is composed of multiple columns.

On Tue, Apr 23, 2019 at 9:35 AM Simon Slavin  wrote:

> On 23 Apr 2019, at 3:11pm, Charles Leifer  wrote:
>
> > UPDATE "bu" SET "extra" = CASE ("key", "value")
> >  WHEN ('k1', 1) THEN 100
> >  WHEN ('k2', 2) THEN -200
> >  WHEN ('k3', 3) THEN 30
> > END
> > WHERE ("key", "value") IN (('k1', 1), ('k2', 2), ('k3', 3));
>
> Your WHERE clause it not needed, and SQLite is not clever enough to
> realise it can be used to find specific rows in the table.  (At least
> that's what I think, I may be wrong.).
>
> I'm not certain that SQLite understands your use of the bracketed terms.
> This
>
> UPDATE "bu" SET "extra" = CASE "key"||"value"
>  WHEN 'k1'||1 THEN 100
>  WHEN 'k2'||2 THEN -200
>  WHEN 'k3'||3 THEN 30
> END;
>
> might work because || is the 'append' operator.  Of course this assumes
> that there are no ambiguities like 'k451' which might be key 'k4' or k45.
> It might be better to use something like "key"|'x'|"value" .
>
> But in fact this would be far faster
>
> BEGIN;
> UPDATE "bu" SET "extra" = 100 WHERE "key"='k1' AND "value"=1;
> UPDATE "bu" SET "extra" = -200 WHERE "key"='k2' AND "value"=2;
> UPDATE "bu" SET "extra" = 30 WHERE "key"='k3' AND "value"=3;
> END;
>
> because your WHERE clause matches a UNIQUE key, so SQLite could go
> straight to the correct row rather than iterating through the entire
> table.  So instead of one command that has to read every row of the table
> you have three commands which each execute almost instantly.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Row values with IN

2019-04-23 Thread Simon Slavin
On 23 Apr 2019, at 3:11pm, Charles Leifer  wrote:

> UPDATE "bu" SET "extra" = CASE ("key", "value")
>  WHEN ('k1', 1) THEN 100
>  WHEN ('k2', 2) THEN -200
>  WHEN ('k3', 3) THEN 30
> END
> WHERE ("key", "value") IN (('k1', 1), ('k2', 2), ('k3', 3));

Your WHERE clause it not needed, and SQLite is not clever enough to realise it 
can be used to find specific rows in the table.  (At least that's what I think, 
I may be wrong.).

I'm not certain that SQLite understands your use of the bracketed terms.  This

UPDATE "bu" SET "extra" = CASE "key"||"value"
 WHEN 'k1'||1 THEN 100
 WHEN 'k2'||2 THEN -200
 WHEN 'k3'||3 THEN 30
END;

might work because || is the 'append' operator.  Of course this assumes that 
there are no ambiguities like 'k451' which might be key 'k4' or k45.  It might 
be better to use something like "key"|'x'|"value" .

But in fact this would be far faster

BEGIN;
UPDATE "bu" SET "extra" = 100 WHERE "key"='k1' AND "value"=1;
UPDATE "bu" SET "extra" = -200 WHERE "key"='k2' AND "value"=2;
UPDATE "bu" SET "extra" = 30 WHERE "key"='k3' AND "value"=3;
END;

because your WHERE clause matches a UNIQUE key, so SQLite could go straight to 
the correct row rather than iterating through the entire table.  So instead of 
one command that has to read every row of the table you have three commands 
which each execute almost instantly.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Row values with IN

2019-04-23 Thread Charles Leifer
Let's say I have a simple table with a composite primary key (key, value)
and an extra field:

CREATE TABLE IF NOT EXISTS "bu" (
  "key" TEXT NOT NULL,
  "value" INTEGER NOT NULL,
  "extra" INTEGER NOT NULL,
  PRIMARY KEY ("key", "value"));

I'll put 3 rows in the table:

INSERT INTO "bu" ("key", "value", "extra")
VALUES ('k1', 1, 1), ('k2', 2, 2), ('k3', 3, 3);

Now -- I want to do a "bulk update":

UPDATE "bu" SET "extra" = CASE ("key", "value")
  WHEN ('k1', 1) THEN 100
  WHEN ('k2', 2) THEN -200
  WHEN ('k3', 3) THEN 30
END
WHERE ("key", "value") IN (('k1', 1), ('k2', 2), ('k3', 3));

I run into a "row value misused" error triggered by the WHERE clause, which
is comparing the (key, value) tuple to a set of row-values.

I've found that I can get this to work by using explicitly using VALUES:

UPDATE "bu" SET "extra" = CASE ("key", "value") ... END
WHERE ("key", "value") IN (VALUES ('k1', 1), ('k2', 2), ('k3', 3));

My question, though, is why is the VALUES bit needed for the WHERE clause
in this case? Reading the docs on row values, it seemed to me that
comparing two row values should work fine:

Two row values of the same size can be compared using operators <, <=, >,
>=, =, <>, IS, IS NOT, IN, NOT IN, BETWEEN, or CASE.

Is there a reason I need to specify VALUES instead of simply passing in a
set of row tuples? It seems a little bit inconsistent to me. Thanks!

Charlie
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Please help me fix the SQLite Git mirror

2019-04-23 Thread Richard Hipp
On 4/23/19, Rowan Worth  wrote:
>
> How can you justify the claim that history was unchanged on trunk between
> time (1) and time (2)?

Short answer:  I look at the entire DAG, not individual branches.

Long answer:

From prior discussion and feedback on this thread, I've come to
realize that there is a philosophical difference between Fossil and
Git.  Both systems use a Directed Acyclic Graph of check-ins (the DAG)
as their underlying data structure.  But in Git, the fundamental unit
of management is a branch, which Git defines as a linear sequence of
check-ins from a leaf back to the root, whereas in Fossil, the
fundamental unit of management is the entire DAG.

Git-ers think in terms of branches.  And so if you move a node from
one branch to another, that changes history.  But Fossil-ers think in
terms of the entire DAG.  Relabeling a node to be on a different
branch is just a notational convenience and does not change the DAG in
any way, and is hence not a history change.

Additional observations that support of this dichotomy:

(1) The default synchronization unit in Git is a single branch.  You
can sync the entire DAG using the --mirror option, but that is rarely
done in practice it seems.  In Fossil, the default and only
synchronization option is the entire DAG.

(2) The primary history display in Git/GitHub shows only a single
branch (ex: https://github.com/sqlite/sqlite/commits/master) whereas
the primary history display in Fossil shows the entire DAG (ex:
https://www.sqlite.org/src/timeline). The inability to see the entire
DAG on a single screen in GitHub is a persistent source of annoyance
to users like me who are accustomed to Fossil.

(3) Git puts special emphasis on rebase, which is used to transfer or
replicate a span of check-ins from one branch into another.  Having
all the important check-ins on a single branch is important if your
focus is on that one branch.  Fossil, in contrast, keeps the entire
DAG in view all at once, and so is no issue with keeping span of
check-ins remain on a separate arm of the DAG and merely merge in the
changes.

The Git approach makes sense in a highly distributed project like the
Linux Kernel, where there are many thousands of developers each
working on their own branches and where it is desirable to prune the
DAG down to a more manageable size by omitting extraneous arms.  The
Fossil approach works better for cathedral-style development where
there is a small team of contributors, all of which know each other
and work together daily, and where developers want to keep track of
everything that is going on, for improved situational awareness and
project coherence.

SQLite Git Mirror Update:

I made enhancements last night so that any future branch relabelings
that occur in SQLite should automatically be mirrored into the Git
repository on GitHub.  (Aside: See the interesting 5-way join used to
determine which "git update-ref" commands are needed on each
incremental export here:
https://www.fossil-scm.org/fossil/artifact/b0ace47d4c6a?ln=1492-1505)
There are definite management advantages to relabeling branches, and
so I will not preclude such actions in SQLite moving forward, though I
will try to keep relabelings to a minimum. Nevertheless, users of the
Git mirror should keep in mind that I think in terms of the entire
DAG, not individual branches, and so if you are tracking SQLite
development in a Git clone, you should to take steps to ensure that
you do not find yourself stalled on a side-tracked branch.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users