Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Gabor Grothendieck
The difference between these two is what happens when a row
of A has no matches in B.

select * from A left join B on A.Time = B.Time
select * from A left join B where A.Time = B.Time

In the first one the condition is carried out during the join
so if a row of A has no matches in B then the B part of the result
is filled out with NULL values.

In the second one A is joined with B and then the "where" is applied
afterwards so if there is no match to a row of A then that row of
A  will not appear at all in the result.

Here is an example in R,  Note that there was no match to the
third row of A in B so in the first case that row of A appears in the
result with
NULLs in the B column positions.  In the second case the third row of A
is dropped from the result.

> library(sqldf)
>
> # BOD comes with R
> A <- BOD[1:3, ]
> B <- BOD[1:2, ]
> A
  Time demand
118.3
22   10.3
33   19.0
> B
  Time demand
118.3
22   10.3
>
> sqldf("select * from A left join B on A.Time = B.Time")
  Time demand Time..3 demand..4
118.3   1   8.3
22   10.3   2  10.3
33   19.0  NANA
> sqldf("select * from A left join B where A.Time = B.Time")
  Time demand Time..3 demand..4
118.3   1   8.3
22   10.3   2  10.3


On Sun, Oct 27, 2019 at 6:09 PM Benjamin Asher  wrote:
>
> Hi there! My colleagues and I are trying to understand the role of ON 
> constraints vs WHERE clauses in JOINs. It seems both of the following work, 
> but I’m not really sure why:
>
> Query A:
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;
>
>
> Query B:
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant’;
>
> Is there a difference between the two (function and performance)? Is there an 
> advantage to putting WHERE-type filtering in the ON constraint vs leaving it 
> in the WHERE clause for LEFT JOINs?
>
> Thanks!
>
> Ben
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Benjamin Asher
Ah! I see. Thanks for that walk through of OUTER JOIN.

Ben

> On Oct 27, 2019, at 4:18 PM, Keith Medcalf  wrote:
> 
> 
> On Sunday, 27 October, 2019 16:09, Benjamin Asher  
> wrote:
> 
>> Is there an advantage to putting WHERE-type filtering in the 
>> ON constraint vs leaving it in the WHERE clause for LEFT JOINs?
> 
> The ON clause is merely syntactic sugar for a parenthesized AND clause 
> appended to the parenthesized WHERE clause EXCEPT in the case of an OUTER 
> JOIN.
> 
> That is when you emit:
> 
> SELECT 
>  FROM 
>  JOIN 
>ON 
> WHERE 
> 
> This is really nothing more than:
> 
> SELECT 
>  FROM , 
> WHERE ()
>   AND ()
> 
> This applies no matter how many  there are, nor how may ON 
>  there are.  The ON clause does not even have to mention tables 
> that have already appeared so long as they eventually appear so the WHERE 
> clause can be processed.  JOIN or INNER JOIN is merely an alternate spelling 
> of a comma, and ON merely puts its argument conditional expression in 
> parenthesis and tacks it onto the end of the WHERE conditions with an AND 
> (the where condition clause itself being parenthesized).
> 
> In the case of an OUTER JOIN the ON condition specifies the selection 
> constraint for descent into the immediately preceding table (that is, the ON 
> clause binds to the table that is the target of the join).  If the selection 
> criteria in that descent condition cannot be met, that table tuple is 
> replaced with a tuple of all NULL.
> 
> Once upon a time there was no ON clause, and one would specify outer 
> constraints with a *.  dibble *= dabble means that you want all the values of 
> dibble even if there is no matching dabble, and the dabble row is therefore 
> comprised of all nulls (this is a left outer join).  Similarly dibble =* 
> dabble meant that one wants all the dabble even if there is no matching 
> dibble, in which case the tuple from which the dibble came will be all nulls 
> (right outer join).  There was also a dibble *=* dabble which meant that you 
> wanted all the results where dibble matched dabble, plus the ones where 
> dibble had no match and the tuple from when dabble came was therefore all 
> nulls, and the ones where dabble had no match in which case the tuple from 
> which dibble came was all nulls (full outer join).
> 
> The JOIN .. ON semantics were invented because some people had difficulty 
> composing appropriate WHERE clauses without it.
> 
> So for OUTER JOINS (of which a LEFT JOIN is but an instance of the entire 
> class) whether a condition appears in the ON clause or the WHERE clause is 
> material to how the query is performed and the results obtained.  For an 
> INNER JOIN, it does not matter whether the condition is specified in an ON 
> clause or in the WHERE clause.
> 
> In fact, in the case of INNER JOIN you do not need either keyword at all:
> 
> SELECT a, b
>  FROM x, y
>ON x.a == y.c
> WHERE y.q == 5
> 
> is the same as:
> 
> SELECT a, b
>  FROM x, y
>ON y.q == 5
> WHERE x.a == y.c
> 
> is the same as
> 
> SELECT a, b
>  FROM X JOIN Y
> WHERE x.a == y.c
>   AND y.q == 5
> 
> which is really just:
> 
> SELECT a, b
>  FROM x, y
> WHERE x.a == y.c
>   AND y.q == 5
> 
> however, for outer joins:
> 
> SELECT a, b
>  FROM x LEFT JOIN y
>ON x.a == y.c
> WHERE y.q == 5
> 
> is the same as:
> 
> SELECT a, b
>  FROM x, y
> WHERE x.a == y.c
>   AND y.q == 5
> 
> (that is, because the WHERE clause requires that y.q not be NULL, the OUTER 
> JOIN is meaningless and merely results is more processing CPU and memory 
> usage than is necessary since the OUTER JOIN is really just an inner join).
> 
> SELECT a, b
>  FROM x LEFT JOIN y
>ON x.a == x.c and y.q == 5
> 
> gives a completely different set of results.
> 
> -- 
> The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
> lot about anticipated traffic volume.
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Keith Medcalf

On Sunday, 27 October, 2019 16:09, Benjamin Asher  wrote:

> Is there an advantage to putting WHERE-type filtering in the 
> ON constraint vs leaving it in the WHERE clause for LEFT JOINs?

The ON clause is merely syntactic sugar for a parenthesized AND clause appended 
to the parenthesized WHERE clause EXCEPT in the case of an OUTER JOIN.

That is when you emit:

SELECT 
  FROM 
  JOIN 
ON 
 WHERE 

This is really nothing more than:

SELECT 
  FROM , 
 WHERE ()
   AND ()

This applies no matter how many  there are, nor how may ON 
 there are.  The ON clause does not even have to mention tables 
that have already appeared so long as they eventually appear so the WHERE 
clause can be processed.  JOIN or INNER JOIN is merely an alternate spelling of 
a comma, and ON merely puts its argument conditional expression in parenthesis 
and tacks it onto the end of the WHERE conditions with an AND (the where 
condition clause itself being parenthesized).

In the case of an OUTER JOIN the ON condition specifies the selection 
constraint for descent into the immediately preceding table (that is, the ON 
clause binds to the table that is the target of the join).  If the selection 
criteria in that descent condition cannot be met, that table tuple is replaced 
with a tuple of all NULL.

Once upon a time there was no ON clause, and one would specify outer 
constraints with a *.  dibble *= dabble means that you want all the values of 
dibble even if there is no matching dabble, and the dabble row is therefore 
comprised of all nulls (this is a left outer join).  Similarly dibble =* dabble 
meant that one wants all the dabble even if there is no matching dibble, in 
which case the tuple from which the dibble came will be all nulls (right outer 
join).  There was also a dibble *=* dabble which meant that you wanted all the 
results where dibble matched dabble, plus the ones where dibble had no match 
and the tuple from when dabble came was therefore all nulls, and the ones where 
dabble had no match in which case the tuple from which dibble came was all 
nulls (full outer join).

The JOIN .. ON semantics were invented because some people had difficulty 
composing appropriate WHERE clauses without it.

So for OUTER JOINS (of which a LEFT JOIN is but an instance of the entire 
class) whether a condition appears in the ON clause or the WHERE clause is 
material to how the query is performed and the results obtained.  For an INNER 
JOIN, it does not matter whether the condition is specified in an ON clause or 
in the WHERE clause.

In fact, in the case of INNER JOIN you do not need either keyword at all:

SELECT a, b
  FROM x, y
ON x.a == y.c
 WHERE y.q == 5

is the same as:

SELECT a, b
  FROM x, y
ON y.q == 5
 WHERE x.a == y.c

is the same as

SELECT a, b
  FROM X JOIN Y
 WHERE x.a == y.c
   AND y.q == 5

which is really just:

SELECT a, b
  FROM x, y
 WHERE x.a == y.c
   AND y.q == 5

however, for outer joins:

SELECT a, b
  FROM x LEFT JOIN y
ON x.a == y.c
 WHERE y.q == 5

is the same as:

SELECT a, b
  FROM x, y
 WHERE x.a == y.c
   AND y.q == 5

(that is, because the WHERE clause requires that y.q not be NULL, the OUTER 
JOIN is meaningless and merely results is more processing CPU and memory usage 
than is necessary since the OUTER JOIN is really just an inner join).

SELECT a, b
  FROM x LEFT JOIN y
ON x.a == x.c and y.q == 5

gives a completely different set of results.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Ben Asher
This is exactly the kind of advice I was looking for.

Thanks again!

Ben

> On Oct 27, 2019, at 4:04 PM, Simon Slavin  wrote:
> 
> On 27 Oct 2019, at 10:45pm, Benjamin Asher  wrote:
> 
>> Query A
>> 
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE 
>> tab1.x='constant’;
>> 
>> Query B
>> 
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant'
> 
> Your use of JOIN here is extremely usual.  Depending on your data, you either 
> don't need it, or it does nothing.
> 
> Suppose you have many rows in tab1 where x='constant' and also many rows in 
> tab2 where x='constant'.  What are you expecting the engine to do with them ? 
>  Should it match one tab1 row with one tab2 row ?  Or should it generate a 
> set of every possible combination ?  This tells you what your ON clause 
> should be.
> 
> Generally, a constant belongs in a WHERE clause.  To answer your original 
> question
> 
>> Is there an advantage to putting WHERE-type filtering in the ON constraint 
>> vs leaving it in the WHERE clause for LEFT JOINs?
> 
> SQLite reads the entire statement and does things to figure out the best way 
> to execute it given the indexes available.  The kind of deconstruction you 
> did to the query in your head, happens inside the SQL engine.  So we cannot 
> give you an answer "always put it in ON" or "always put it in WHERE".  
> Instead we say "Don't make your query look weird just to save a millisecond.  
> Have your query make sense, so someone who reads it understands what you are 
> doing.".
> 
> If you want to worry about speed or efficiency, worry about making useful 
> indexes for your tables instead of exactly how to phrase your query.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Simon Slavin
On 27 Oct 2019, at 10:45pm, Benjamin Asher  wrote:

> Query A
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE 
> tab1.x='constant’;
> 
> Query B
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant'

Your use of JOIN here is extremely usual.  Depending on your data, you either 
don't need it, or it does nothing.

Suppose you have many rows in tab1 where x='constant' and also many rows in 
tab2 where x='constant'.  What are you expecting the engine to do with them ?  
Should it match one tab1 row with one tab2 row ?  Or should it generate a set 
of every possible combination ?  This tells you what your ON clause should be.

Generally, a constant belongs in a WHERE clause.  To answer your original 
question

> Is there an advantage to putting WHERE-type filtering in the ON constraint vs 
> leaving it in the WHERE clause for LEFT JOINs?

SQLite reads the entire statement and does things to figure out the best way to 
execute it given the indexes available.  The kind of deconstruction you did to 
the query in your head, happens inside the SQL engine.  So we cannot give you 
an answer "always put it in ON" or "always put it in WHERE".  Instead we say 
"Don't make your query look weird just to save a millisecond.  Have your query 
make sense, so someone who reads it understands what you are doing.".

If you want to worry about speed or efficiency, worry about making useful 
indexes for your tables instead of exactly how to phrase your query.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Benjamin Asher
Okay playing with this some more: it seems like since everything is equal to 
the same constant, it doesn’t really matter? That said, it feels like poor form 
/ not good practice (join can become not useful if you don’t have a constraint 
joining particular columns). If I want to make it easier on myself to adjust 
this query in the future, it’s better practice to move constants to the WHERE 
clause and ensure there are column-based join constraints.

Does that seem right? My colleague presented me with a related query to debug, 
and now I’m questioning everything.

Ben

> On Oct 27, 2019, at 3:45 PM, Benjamin Asher  wrote:
> 
> Oh you’re right. I realize now I messed up the example. Here are the updated 
> queries:
> 
> Query A
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE 
> tab1.x='constant’;
> 
> Query B
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant
> 
> These return the same results in your sample setup. I think Simon pointed out 
> what I’m also thinking is the problem: could Query A return rows that aren’t 
> matched up properly because of a lack of a constraint joining the 2 tables?
> 
> Thanks for putting together that more complete example :).
> 
> Ben
> 
>> On Oct 27, 2019, at 3:35 PM, Richard Hipp  wrote:
>> 
>> On 10/27/19, Benjamin Asher  wrote:
>>> It seems both of the following work,
>>> but I’m not really sure why:
>> 
>> I get different answers for the two queries when I try them:
>> 
>> CREATE TABLE tab1(x,y);
>> CREATE TABLE tab2(x,y);
>> INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
>> INSERT INTO tab2 SELECT * FROM tab1;
>> 
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';
>> 
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';
>> 
>> 
>> -- 
>> D. Richard Hipp
>> d...@sqlite.org
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Benjamin Asher
Oh you’re right. I realize now I messed up the example. Here are the updated 
queries:

Query A

SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE tab1.x='constant’;

Query B

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant

These return the same results in your sample setup. I think Simon pointed out 
what I’m also thinking is the problem: could Query A return rows that aren’t 
matched up properly because of a lack of a constraint joining the 2 tables?

Thanks for putting together that more complete example :).

Ben

> On Oct 27, 2019, at 3:35 PM, Richard Hipp  wrote:
> 
> On 10/27/19, Benjamin Asher  wrote:
>> It seems both of the following work,
>> but I’m not really sure why:
> 
> I get different answers for the two queries when I try them:
> 
> CREATE TABLE tab1(x,y);
> CREATE TABLE tab2(x,y);
> INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
> INSERT INTO tab2 SELECT * FROM tab1;
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';
> 
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';
> 
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Simon Slavin
On 27 Oct 2019, at 10:09pm, Benjamin Asher  wrote:

> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;

You're doing a JOIN here.  How should the engine know which row of tab2 
corresponds to which row of tab1 ?

Your query is syntactically correct, but it doesn't appear to do anything 
useful.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Richard Hipp
On 10/27/19, Benjamin Asher  wrote:
> It seems both of the following work,
> but I’m not really sure why:

I get different answers for the two queries when I try them:

CREATE TABLE tab1(x,y);
CREATE TABLE tab2(x,y);
INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
INSERT INTO tab2 SELECT * FROM tab1;

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] LEFT JOIN: ON vs. WHERE

2019-10-27 Thread Benjamin Asher
Hi there! My colleagues and I are trying to understand the role of ON 
constraints vs WHERE clauses in JOINs. It seems both of the following work, but 
I’m not really sure why:

Query A:

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;


Query B:

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant’;

Is there a difference between the two (function and performance)? Is there an 
advantage to putting WHERE-type filtering in the ON constraint vs leaving it in 
the WHERE clause for LEFT JOINs?

Thanks!

Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread D Burgess
Number one on my wishlist.
UNSIGNED
Wanted for ordering more than anything else
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] System.Data.SQLite version 1.0.112.0 released

2019-10-27 Thread Joe Mistachkin

System.Data.SQLite version 1.0.112.0 (with SQLite 3.30.1) is now available
on the System.Data.SQLite website:

 https://system.data.sqlite.org/

Further information about this release can be seen at:

 https://system.data.sqlite.org/index.html/doc/trunk/www/news.wiki

Please post on the SQLite mailing list (sqlite-users at
mailinglists.sqlite.org) if you encounter any problems with this release.

--
Joe Mistachkin

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Thomas Kurz
> SQLite has had geospatial support for years via the r-tree extension, and 
> more recently GeoJSON.

But not compatible to standards from Open Geospatial Consortium, as far as I 
know. Which requires additional conversions, and considering that geodata 
usually have sizes of a GB or more, this is not an option at all.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Keith Medcalf

On Sunday, 27 October, 2019 13:30, Richard Damon  
wrote:

>On 10/27/19 10:24 AM, Simon Slavin wrote:

>> On 27 Oct 2019, at 9:12am, Thomas Kurz  wrote:

>>> the discussion standard deviation has shown that at least STDEV and
>>> POWER would be very helpful if they part of SQLite core.

>> These are presentation issues.  Not database issues.   The results of
>> such calculations are unlikely to be used to decide on the continuation
>> of a SQL statement.

> Doing a group by summary query to get the mean and stdev of a sample in
> the database, and then using those to construct a where clause to
> get/exclude data points that are outliers would seem to be a very
> reasonable sort of query that is really a database issue. This could
> normally be done as two distinct queries (maybe within a read
> transaction so we get consistent data) and the stdev and mean built by
> building a count of records, and the sum of the value and value**2, and
> computing the mean and stdev from that in code, as was shown earlier in
> the thread, but having STDEV as a built-in summary function could be
> useful.

Defining an aggregate function that computes the standard deviation (whether of 
the sample or the population) depending of the values fed into the aggregate is 
rather trivial to do, if you need to do that.  That there are some programming 
languages and platforms that do not permit you to do this "easily" speaks more 
to the deficiencies of those programming languages or platforms than it does to 
SQLite3.  You still have to decide whether you want to do it with inaccurate 
arithmetic methods that may be compatible with how window functions work, or 
using more accurate numerical methods that are not compatible with how window 
functions work, or perhaps by implementing something that is both at the same 
time (and uses more memory and cpu (complexity) in order to allow utilization 
of better numerical methods yet still maintain window function compatibity).

Or you could simply execute a query to get the samples you want and do the 
calculation in your code, then pass that "answer" back as a parameter to your 
subsequent SQL that needs that value.  The need to write "one long monster SQL 
statement to do everything all at once" is usually adopted because one is used 
to using client/server systems where such things take excessive amounts of time 
due to client/server turnarounds.

The point is that how you accomplish this is up to you, and however you decide 
to do it is up to you and your requirements.  If SQLite3 had a built-in stdev 
aggregate/window function, you can be sure that someone would determine that 
the built-in was insufficient or overkill to their requirement and would end up 
overriding that function with one which they, in their sole discretion, thought 
was a better fit to their requirement.  This would end up bloating the size of 
the engine for no gain -- or adding yet another configuration parameter to 
exclude the extraneous built-in.  Multiply this by the thousands of possible 
functions and aggregates and you get an idea of the complexity involved.

As an example, I always compile in a proper version of the ROUND() function 
because the builtin function does elementary school 4/5 rounding and not 
half-even rounding.  Even the builtin SUM and TOTAL are rather simplistic 
implementations that can be wildly inaccurate in certain circumstances.  
However, it appears that most "coders" only graduated elementary school and 
would be confused by proper rounding, just as they do not know how binary 
digital computers work and are thus confused by floating-point numbers not 
being the same as they learned in elementary school.

I also always override a pile of the builtin datetime functions to overcome 
some often encountered shortcomings there as well (though that one does require 
a minor modification to the as-distributed code to add some more information 
that would otherwise be lost to the internal DateTime struct, and to the 
parsing routines in order to maintain that extra data).

This can be done because SQLite3 has chosen extensibility over "Do as I have 
Ordained, and If You Do Not Like That Which I Have Ordained, then Too Bad For 
You" route used by most other RDBMS engines.  And even there, on the few other 
RDBMS where such overrides of internal functionality or addition of "extra" 
functionality is available, it is almost always far more complicated (almost, 
becase there are a few exceptions), and often with poorer results overall (such 
as added complication simply for complication sake or slower performance or, in 
many cases, both).

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Simon Slavin
On 27 Oct 2019, at 7:43pm, Richard Damon  wrote:

> My one thought about fuller ALTER TABLE support would be that since
> SQLite stores the schema as the simple CREATE TABLE command, that ALTER
> TABLE needs to know how to edit that string to match the changes, where
> simple adds are much easier than other forms of editing.

There are two ways to do it.  One is to try to edit the original CREATE TABLE 
command.  The other is to make the change, then write a new CREATE TABLE 
command from the internal record SQLite maintains of the columns of that table.

The second method would be easier to make crash-free, and far easier to debug.  
It would lose formatting and comments from the original command.  But I'm cool 
with that.

The other problem is deleted data.  Suppose you had a table with five columns 
and someone dropped the third column.  First, you need to check that your 
entire schema, including triggers, to make sure nothing refers to that column.  
I don't know how difficult that would be given how SQLite works.

Then you need to deal with the values that had been stored in that column.  
Again, there are two ways to handle it.  One is to rewrite each row of the 
table without the third value, which would be slow and incorporate a ton of 
I/O.  The other is to just note that column 3 must be ignored.  (Change its 
name and give it affinity IGNORE_ME ?)  That would be extremely fast.  If the 
programmer cares about filespace they can do their own VACUUM at a convenient 
time.

Other ALTER TABLE commands have their own problems.  I don't see anything that 
can't be done, but some of them might be difficult.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Richard Damon
On 10/26/19 8:07 PM, Thomas Kurz wrote:
>> Feel free to make suggestions. Which missing feature or features causes 
> you the most bother?
>
> Thanks, Dan.
>
> To me, the most puzzling thing is the lack of full ALTER TABLE support (DROP 
> COLUMN, MODIFY COLUMN, ADD CONSTRAINT, DROP CONSTRAINT). Modifying tables is 
> some kind of science in SQLite, and thus, very error-prone. I'd be willing to 
> donate for that but as a private user I cannot affort 85 k$ ;-)
>
My one thought about fuller ALTER TABLE support would be that since
SQLite stores the schema as the simple CREATE TABLE command, that ALTER
TABLE needs to know how to edit that string to match the changes, where
simple adds are much easier than other forms of editing.

Also, due to the way SQLite handles rows, adding columns will
automatically assume default values for all existing rows. Anything much
more complicated  could easily require that SQLite update the whole
table row by row, and that likely would require making sure that nothing
else was looking at the database, as it might not be possible to keep
the view consistent.

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Richard Damon
On 10/27/19 10:24 AM, Simon Slavin wrote:
> On 27 Oct 2019, at 9:12am, Thomas Kurz  wrote:
>
>> the discussion standard deviation has shown that at least STDEV and POWER 
>> would be very helpful if they part of SQLite core.
> These are presentation issues.  Not database issues.   The results of such 
> calculations are unlikely to be used to decide on the continuation of a SQL 
> statement.

Doing a group by summary query to get the mean and stdev of a sample in
the database, and then using those to construct a where clause to
get/exclude data points that are outliers would seem to be a very
reasonable sort of query that is really a database issue. This could
normally be done as two distinct queries (maybe within a read
transaction so we get consistent data) and the stdev and mean built by
building a count of records, and the sum of the value and value**2, and
computing the mean and stdev from that in code, as was shown earlier in
the thread, but having STDEV as a built-in summary function could be useful.

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Stream loading SQL script

2019-10-27 Thread x
Thanks Keith. I thought it was a documented function and couldn’t find anything 
in the documentation.








From: sqlite-users  on behalf of 
Keith Medcalf 
Sent: Sunday, October 27, 2019 4:46:17 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Stream loading SQL script


On Sunday, 27 October, 2019 07:40, x  wrote:

>Where is this function Keith? I can find any information on it?

The SQLite3 command line shell (shell.c) reads input from stdin or other file 
and processes the commands one at a time by either calling the appropriate 
sqlite3 functions and displaying the output, or handling them internally.  
process_input is the function within the shell.c code which reads the input and 
decides when sufficient input has been collected to constitute a complete 
command for processing.  It is internal to shell.c and not part of any external 
documented API so the only documentation is to read the code.  If you want to 
know how the command line shell converts between a stream of input characters 
and recognizes when a complete command line has accumulated, this is the 
internal function that does that.

>
>From: sqlite-users  on
>behalf of František Kučera 
>Sent: Saturday, October 26, 2019 4:49:26 PM
>To: sqlite-users@mailinglists.sqlite.org us...@mailinglists.sqlite.org>
>Subject: Re: [sqlite] Stream loading SQL script
>
>Dne 25. 10. 19 v 21:41 Keith Medcalf napsal(a):
>> The sqlite3 command line shell already does this.  see function
>process_input
>
>Thanks, it helped.
>
>I see that it checks whether the input contains a semicolon and only
>then it calls sqlite3_complete(). So I implemented it in a similar way
>in C++.
>
>Franta

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread František Kučera
Dne 27. 10. 19 v 15:24 Simon Slavin napsal(a):
> And I agree with you on RIGHT JOIN: it may duplicate what can be done with 
> LEFT JOIN but many SQL facilities are duplicates.  It's in SQL92 and people 
> expect to see it.

+1 This would increase portability of SQL scripts – it would be easier
to have a single set SQL queries that run on both SQLite and e.g.
PostgreSQL.

Franta

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap? - modular design

2019-10-27 Thread František Kučera
Dne 27. 10. 19 v 10:12 Thomas Kurz napsal(a):
> at least some basic math would be very helpful as well. I don't want to 
> suggest a complete math support, that would really be far away from liteness

Generally speaking, it might be difficult to say where is the line
between „light“ and „heavy“. But this issue might be resolved by modular
design (and SQLite already support modules). There might be an extension
full of mathematical functions. It might be „too heavy“ for someone, but
it would not be a problem, because you would be able to use just the
light core without this module and enjoy less complexity.

Franta


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Stream loading SQL script

2019-10-27 Thread Keith Medcalf

On Sunday, 27 October, 2019 07:40, x  wrote:

>Where is this function Keith? I can find any information on it?

The SQLite3 command line shell (shell.c) reads input from stdin or other file 
and processes the commands one at a time by either calling the appropriate 
sqlite3 functions and displaying the output, or handling them internally.  
process_input is the function within the shell.c code which reads the input and 
decides when sufficient input has been collected to constitute a complete 
command for processing.  It is internal to shell.c and not part of any external 
documented API so the only documentation is to read the code.  If you want to 
know how the command line shell converts between a stream of input characters 
and recognizes when a complete command line has accumulated, this is the 
internal function that does that.

>
>From: sqlite-users  on
>behalf of František Kučera 
>Sent: Saturday, October 26, 2019 4:49:26 PM
>To: sqlite-users@mailinglists.sqlite.org us...@mailinglists.sqlite.org>
>Subject: Re: [sqlite] Stream loading SQL script
>
>Dne 25. 10. 19 v 21:41 Keith Medcalf napsal(a):
>> The sqlite3 command line shell already does this.  see function
>process_input
>
>Thanks, it helped.
>
>I see that it checks whether the input contains a semicolon and only
>then it calls sqlite3_complete(). So I implemented it in a similar way
>in C++.
>
>Franta

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Jean-Christophe Deschamps

Hi Simon,

Especially since it can't do something more fundamental than STDEV: 
return all surnames starting with the Unicode character 'Å'.


Reconsider as this works fine.

JcD 


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Simon Slavin
On 27 Oct 2019, at 9:12am, Thomas Kurz  wrote:

> the discussion standard deviation has shown that at least STDEV and POWER 
> would be very helpful if they part of SQLite core.

These are presentation issues.  Not database issues.   The results of such 
calculations are unlikely to be used to decide on the continuation of a SQL 
statement.  Yes, "WHERE a < (b POWER c)" exists, but how often would you need 
it in real life ?

SQLite is a database system.  We shouldn't start to extend it to a presentation 
layer.  Especially since it can't do something more fundamental than STDEV: 
return all surnames starting with the Unicode character 'Å'.

I'm not entirely against adding facilities to SQLite.  But I feel that they 
should be database things, not presentation things.  For example, I think 
supporting more ALTER TABLE would be worthwhile.  And I agree with you on RIGHT 
JOIN: it may duplicate what can be done with LEFT JOIN but many SQL facilities 
are duplicates.  It's in SQL92 and people expect to see it.  I am sometimes 
also in favour of adding a BOOLEAN type, even though it does not appear in 
SQL92.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Stream loading SQL script

2019-10-27 Thread x
Where is this function Keith? I can find any information on it?








From: sqlite-users  on behalf of 
František Kučera 
Sent: Saturday, October 26, 2019 4:49:26 PM
To: sqlite-users@mailinglists.sqlite.org 
Subject: Re: [sqlite] Stream loading SQL script

Dne 25. 10. 19 v 21:41 Keith Medcalf napsal(a):
> The sqlite3 command line shell already does this.  see function process_input

Thanks, it helped.

I see that it checks whether the input contains a semicolon and only
then it calls sqlite3_complete(). So I implemented it in a similar way
in C++.

Franta


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Thomas Kurz
> What do you mean by "SQL-basics"?

I forgot to mention that at least some basic math would be very helpful as 
well. I don't want to suggest a complete math support, that would really be far 
away from liteness, but the discussion standard deviation has shown that at 
least STDEV and POWER would be very helpful if they part of SQLite core.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Thomas Kurz
> Omitting RIGHT JOIN is good, that's a misfeature and LEFT JOIN does 
> everything 
useful it does.

With all dear respect, but I don't think that it is up to you to define what a 
"feature" and a "misfeature" is. iirc, RIGHT JOIN is declared in SQL92, it is 
part of the SQL standard, and therefore it is one of the "SQL basics" I 
mentioned. And it's not a big thing either.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Darren Duncan

On 2019-10-26 4:38 p.m., Thomas Kurz wrote:

The features you name don't take away from the "liteness", they are all quite

small and useful.

Yes of course they are useful, I wouldn't deny that. But they are prioritized 
over SQL-basics, that's what I'm confused about.


What do you mean by "SQL-basics"?

If you mean the list from your next post, there's very little basic about those 
and many are quite complicated.


I agree with adding more ALTER TABLE options but that's about it.

Omitting RIGHT JOIN is good, that's a misfeature and LEFT JOIN does everything 
useful it does.


Omitting SHOW TABLES or similar MySQL-only things is good, those are 
misfeatures, and querying INFORMATION_SCHEMA does everything better and in a 
more standard and composable way.


Anything to do with temporal and spatial is actually quite complicated, both 
data types and constraints, and omitting those is quite reasonable.


-- Darren Duncan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Roadmap?

2019-10-27 Thread Warren Young
On Oct 26, 2019, at 6:28 PM, J. King  wrote:
> 
> a good designer will choose a good schema from the start and thus rarely need 
> to change it

When you add new features to a long-lived application, you may occasionally 
have to ALTER the table(s) the app uses to allow the new feature.  My company’s 
primary product has seen roughly two such DB alterations per year over its 
history, on average.

(The alterations are probably slowing down in truth, as we asymptotically 
approach perfection.  Hah!)

SQLite has you covered when the alteration is just to add a new column or 
rename an existing one, but that’s not every case I’ve run into over my time 
writing DB-based software:

1. We added some DB columns in version N of our software to support new 
features required by clients, but then something like 5 years later, that 
technology dropped out of current use, so we dropped the feature and thus 
dropped those supporting DB columns.

2. We extended an existing feature of the software to allow new functionality, 
requiring DB table additions, then several major versions later we replaced 
that sidecar’d feature with a wholly new tech stack, including its own DB 
tables, so we dropped those intermediate-form DB columns after migrating the 
data.

3. An existing DB table had to have a column added to make the PRIMARY KEY 
unique again after we added a feature that would have allowed that column to 
hold duplicate data; the second column disambiguated the cases.  SQLite lets 
you add the new column, but not change a table’s primary key without creating a 
new table and copying the data.

SQLite avoids some of the need for this with its dynamic typing and its 
unwillingness to enforce length limits.  We have several historical cases where 
we were using another DBMS and needed ALTER TABLE calls not allowed under 
SQLite to change column types or to widen fields.

For example, we had a table that started out using an 8-bit data type to hold a 
TV channel number, since the highest channel number at the time was 125, so 8 
bits was actually one more than we really required.  Then digital TV happened.  
Then IPTV happened.  Then OTT happened.  Now we just have a string column and a 
bunch of logic to figure out whether the string is an analog TV channel number, 
a digital TV channel number, an IP address, a URL…  If we’d been using SQLite 
from the start, we could have just kept using that “TINYINT” column to hold all 
of this, but ick.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users