subject:"\[sqlite\] Sqlite incompatibility with Postgres"

[sqlite] Sqlite incompatibility with Postgres

2016-05-22 Thread dandl

> > Every aggregation function is at least second order: a function that
> > applies a function to the set. So for MIN the function is 'less than',
> > for SUM() the function is 'plus' and so on. In Andl aggregation
> > functions are provided by fold(), which takes a function as an
> > argument.
> 
> I want you to know that you hijacked my Saturday.  I was bothered about
what
> "first order" and "second order" mean, suspecting that we meant different
> things.  After an afternoon with the Oracle of All Knowledge, I think we
were
> talking about different things, and you had a better handle on your end
than
> I did on mine.
> 
> I was concerned that we were treading in territory outside first-order
> predicate logic.  On review, as Wikipedia explains, HOL deals in another
> beast, namely the quantification of sets of sets.
> 
> You were talking about something much simpler, second-order *functions*.
> The input is still a value -- an individual member of a set -- plus
another
> function.  As you say, there are many such in SQL.  In keeping with the
> language's purpose, the primitive components are not exposed, so it's not
> possible to reconstruct min as FOLD(MIN,X). We can do similar things with
> subqueries, e.g.

Yes, agreed. I was indeed talking about second order functions (a function
that takes a function as an argument). This is well down the scale form the
full 'set of sets' and lambda calculus, but extremely useful. Andl has it,
in this one limited form.

> 
>   select sum(N) from (select count(*) as N from T group by a) as A
> 
> One can imagine that restated as
> 
>   select F(sum, count, t) from T
> 
> where F is defined as taking two functions and a value.  I guess that
would
> make F a third-order function.
> 
> APL is instructive in this regard.  What we usually call operators --  + -
x
> ?  -- are termed *functions* in APL, in keeping with their mathematical
> definition.  A function that takes a function is called an operator.  One
> such is "/", the reduction operator; SUM(t) could be expressed as

Yes, both mentally and in writing I naturally think in terms of functions. I
use 'operator' only to comply with TTM, but I think it adds a layer of
confusion.

> 
>   +/t

Just so. APL is a good source of ideas for second order functions.

> 
> > > 2.  Limit, as currently implemented, lies outside the theory because
> > > it doesn't operate on sets.
> >
> > I'll put that one on hold pending a suitable reference or detailed
> > mathematical treatment.
> 
> I think I can accept "first(N)" could be a set function, and if SQL dealt
in
> sets, LIMIT would be a deterministic function.  But SQL deals in bags, and
> with a couple of odd exceptions -- random(), now() -- all its functions
are
> determistic.  LIMIT is not a deterministic function.  I'm not sure what
> happens to first order predicate logic in the face of nondeterminism, but
I'm
> sure it's not good.

I was never arguing about the status of LIMIT wrt SQL as a non-relational
'bag' language. I was only ever defending LIMIT as a well-defined relational
operator, which implies SQL with DISTINCT in force.

> 
> > Sorry. Your nth() is a second order function
> 
> OK.
> 
> > The (single-pass) implementation would maintain a temporary table of
> > rows that are 'minimum so far seen', to a maximum of N. It would be an
> > implementers decision what to do with a row equal to one in that table
> > once N has been reached: add it or ignore it?
> 
> nth() acts on a single column; it keeps the set of N smallest values, as
you
> say.  The answer to your question is "ignore it" because a value equal to
one
> in the set is already a member.  Given the input
> 
> 
>   C {1, 1, 2, 2, 2, 3}
> 
>   min(C) = 1
>   nth(C, 1) = {1}
>   nth(C, 2) = {1, 2}
> 
> I'm not claiming any deep insight, only that nth() would be handy and can
be
> defined mathematically (even if I can't do it).

I don't like NTH() as a name, it's misleading. What is being discussed is a
LOWEST(attribute, N) function with a second argument that is how many. There
is also HIGHEST(attribute, N). It is easily implemented in Andl using FOLD()
and TAKE() [Andl name for LIMIT], but I don't see it as a primitive.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-21 Thread dandl

> Actually, MIN still is fundamentally a first-order itself.  The dyadic
> function call "x min y" returns either x or y depending on how they
compare.
> The list form is then repeated application of the binary min().  This is
> directly comparable to your example of list plus/sum which is repetition
of
> the dyadic "x + y".  

1. An ordered data type is one that implements "less than". Other
comparisons (LE,GT,GE) are then implemented generically (all data types
implement EQ).
2. Yes, MIN(x,y) or x MIN y is a first-order function as described that is
well-defined on any ordered data type, and can also be implemented
generically using LT.
3. [and BTW MAX(x,y) can also be implemented generically on any ordered
type, using LT.]
4. The aggregation function MIN(X) of SQL is a second order function,
equivalent to FOLD(MIN,X). That is, the function repeatedly applies the MIN
function to pairs of values and returns a single value.
5. All aggregation functions (whether defined in SQL or elsewhere) rely on
second order functions of the form FOLD(F,X) where F is the function to be
repeatedly applied. Andl implements them as such, with complete genericity.
6. [COUNT is also a second order function, equivalent to FOLD(+,1).]

> List MIN is NOT a repeated application of "x less than y". --

Agreed, see above.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-21 Thread James K. Lowden

On Fri, 20 May 2016 14:17:25 +1000
"dandl"  wrote:

> Every aggregation function is at least second order: a function that
> applies a function to the set. So for MIN the function is 'less
> than', for SUM() the function is 'plus' and so on. In Andl
> aggregation functions are provided by fold(), which takes a function
> as an argument.

I want you to know that you hijacked my Saturday.  I was bothered about
what "first order" and "second order" mean, suspecting that we meant
different things.  After an afternoon with the Oracle of All Knowledge,
I think we were talking about different things, and you had a better
handle on your end than I did on mine.  

I was concerned that we were treading in territory outside first-order
predicate logic.  On review, as Wikipedia explains, HOL deals in
another beast, namely the quantification of sets of sets.  

You were talking about something much simpler, second-order *functions*.
The input is still a value -- an individual member of a set -- plus
another function.  As you say, there are many such in SQL.  In keeping
with the language's purpose, the primitive components are not exposed,
so it's not possible to reconstruct min as FOLD(MIN,X). We can do
similar things with subqueries, e.g.

select sum(N) from (select count(*) as N from T group by a) as A

One can imagine that restated as 

select F(sum, count, t) from T

where F is defined as taking two functions and a value.  I guess that
would make F a third-order function. 

APL is instructive in this regard.  What we usually call
operators --  + - x ?  -- are termed *functions* in APL, in keeping
with their mathematical definition.  A function that takes a function
is called an operator.  One such is "/", the reduction operator; SUM(t)
could be expressed as 

+/t

> > 2.  Limit, as currently implemented, lies outside the theory
> > because it doesn't operate on sets.
> 
> I'll put that one on hold pending a suitable reference or detailed
> mathematical treatment.

I think I can accept "first(N)" could be a set function, and if SQL
dealt in sets, LIMIT would be a deterministic function.  But SQL deals
in bags, and with a couple of odd exceptions -- random(), now() -- all
its functions are determistic.  LIMIT is not a deterministic
function.  I'm not sure what happens to first order predicate logic in
the face of nondeterminism, but I'm sure it's not good.  

> Sorry. Your nth() is a second order function 

OK. 

> The (single-pass) implementation would maintain a temporary table of
> rows that are 'minimum so far seen', to a maximum of N. It would be an
> implementers decision what to do with a row equal to one in that
> table once N has been reached: add it or ignore it?

nth() acts on a single column; it keeps the set of N smallest values, as
you say.  The answer to your question is "ignore it" because a value
equal to one in the set is already a member.  Given the input

C {1, 1, 2, 2, 2, 3}

min(C) = 1
nth(C, 1) = {1}
nth(C, 2) = {1, 2}

I'm not claiming any deep insight, only that nth() would be handy and
can be defined mathematically (even if I can't do it).  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-20 Thread dandl

> That's an interesting perspective.  If you're dealing with genuine sets,
and
> you define your language in terms of second-order operations, then
something
> like LIMIT could be included.  Would have to be, I guess.

Every aggregation function is at least second order: a function that applies
a function to the set. So for MIN the function is 'less than', for SUM() the
function is 'plus' and so on. In Andl aggregation functions are provided by
fold(), which takes a function as an argument.

> But that's not what SQL is, or what LIMIT is.

Debatable. Many dialects of SQL provide aggregation, which is second order.

> You were rather dismissive of my nth() function, but that approximates
what
> LIMIT does (approximation is all that's possible) with a first-order
> operation.

Sorry. Your nth() is a second order function that can be implemented in Andl
or any language that supports generic aggregation. I would argue that it
suffers from the same problem, depending on definition and implementation.

The (single-pass) implementation would maintain a temporary table of rows
that are 'minimum so far seen', to a maximum of N. It would be an
implementers decision what to do with a row equal to one in that table once
N has been reached: add it or ignore it?

> BTW, I still think you're agreeing with me.  I'm insisting on using the
> "values of the tuple", implicitly restricted to first-order operations.
> Cardinality, as you say, as a second order *function*, hardly a "value".
But
> at least I understand your argument now.
> 
> > To that you can successively add negation, recursion, higher order
> > functions and fixpoint/while. Each of those allows operations that
> > others do not, but there is disagreement about which should be
> > considered 'relational'.
> 
> OK, I see.  It's fitting that the debate is about the definition of the
set
> of relational operators.
> 
> I'm conservative in that regard.  I'm wary of the complexity that higher-
> order operations bring to the language.  Each higher level brings (I
suspect)
> more complexity than the prior, while solving fewer new problems.

True, but for SQL at least the complexity arises from faults in the
language. The aim of Andl is to do more in the database language layer and
less in the application, with a language that makes higher order operations
much easier. For that I think you really need aggregation functions, ordered
functions and while (recursion).

> I think recursion is a good extension, and a good example.  It permits the
> expression of hierarchies.  It's indispensable ... for maybe 1% of
queries.

When you need it you really need it. The main driver is graphs that have
been expressed as relations with self-joins, but it's also needed to
implement an algorithm that is intrinsically an iterated computation eg
Mandelbrot, Sudoku solver.

> I guess you could convince me it makes SQL Turing Complete, but that's a
very
> dense thicket.  Recursive structures are useful.  If they could be
> manipulated without making the language Turing Compiete, I'd consider that
a
> plus.

[A side-note: there are two models of computation: the other is the lambda
calculus; the two are of equivalent power but quite different in
construction. Datalog with negation has the same computational power, deals
better with some kinds of data but not necessarily relations.]

Many writers say the same: a query language should use the lowest language
level possible, but for Andl to reach its goals it must make that power
available and accessible for when needed.

> > Thank you for the reference -- I didn't have that one. I'm familiar
> > with the material.
> 
> You're welcome, and it shows.
> 
> I think we've managed to hash out some agreement:
> 
> 1.  Second order functions are "relational", or can be, depending on one's
> definition.  We have support for them already in SQL.
> 
> 2.  Limit, as currently implemented, lies outside the theory because it
> doesn't operate on sets.

I'll put that one on hold pending a suitable reference or detailed
mathematical treatment.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-19 Thread Darren Duncan

On 2016-05-19 9:17 PM, dandl wrote:
> Every aggregation function is at least second order: a function that applies
> a function to the set. So for MIN the function is 'less than', for SUM() the
> function is 'plus' and so on. In Andl aggregation functions are provided by
> fold(), which takes a function as an argument.

Actually, MIN still is fundamentally a first-order itself.  The dyadic function 
call "x min y" returns either x or y depending on how they compare.  The list 
form is then repeated application of the binary min().  This is directly 
comparable to your example of list plus/sum which is repetition of the dyadic 
"x 
+ y".  List MIN is NOT a repeated application of "x less than y". -- Darren 
Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-19 Thread James K. Lowden

On Thu, 19 May 2016 10:29:48 +1000
"dandl"  wrote:

> > Restriction is applied to the values of the tuple.  The number of
> > tuples is not a value of the tuple.
> 
> No, I can't agree. Restriction is a membership test, a function on
> members: should this tuple be included in the result set or not?
> Cardinality of a set is a second order function on the members of the
> set, obtainable simply by examining all the tuples in the set at the
> same time. There is no a priori reason not to use cardinality in a
> membership functions.

That's an interesting perspective.  If you're dealing with genuine
sets, and you define your language in terms of second-order operations,
then something like LIMIT could be included.  Would have to be, I
guess.  

But that's not what SQL is, or what LIMIT is. 

You were rather dismissive of my nth() function, but that approximates
what LIMIT does (approximation is all that's possible) with a 
first-order operation.  

BTW, I still think you're agreeing with me.  I'm insisting on using the
"values of the tuple", implicitly restricted to first-order
operations.  Cardinality, as you say, as a second order *function*,
hardly a "value".  But at least I understand your argument now.  

> To that you can successively add negation, recursion, higher order
> functions and fixpoint/while. Each of those allows operations that
> others do not, but there is disagreement about which should be
> considered 'relational'. 

OK, I see.  It's fitting that the debate is about the definition of the
set of relational operators.  

I'm conservative in that regard.  I'm wary of the complexity that
higher-order operations bring to the language.  Each higher level
brings (I suspect) more complexity than the prior, while solving fewer
new problems.  

I think recursion is a good extension, and a good example.  It permits
the expression of hierarchies.  It's indispensable ... for maybe 1% of
queries.  

I guess you could convince me it makes SQL Turing Complete, but that's
a very dense thicket.  Recursive structures are useful.  If they could
be manipulated without making the language Turing Compiete, I'd
consider that a plus.  

> Thank you for the reference -- I didn't have that one. I'm familiar
> with the material.

You're welcome, and it shows.  

I think we've managed to hash out some agreement:

1.  Second order functions are "relational", or can be, depending on
one's definition.  We have support for them already in SQL.  

2.  Limit, as currently implemented, lies outside the theory because it
doesn't operate on sets.  

Regards, 

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-19 Thread dandl

> Restriction is applied to the values of the tuple.  The number of tuples
is
> not a value of the tuple.

No, I can't agree. Restriction is a membership test, a function on members:
should this tuple be included in the result set or not? Cardinality of a set
is a second order function on the members of the set, obtainable simply by
examining all the tuples in the set at the same time. There is no a priori
reason not to use cardinality in a membership functions.

> Neither of us is stupid, David.  I've boiled this down to something very
> simple.  If you look at it algebraically, I think you'll come to the same
> conclusion I have.

AS far as I'm concerned I have already done so, but it seems we reach
different conclusions. To settle that we either need new facts or a higher
authority.

> I wouldn't persist except that you're worth convincing.  Andl holds
promise,
> and seeks higher ground than SQL holds.  Insofar as possible, if I can I
want
> to help you get it right.

Much appreciated. Really.

> > For this query: calculate the average of that set of numbers after
> > excluding the 2 largest and 2 smallest values. Again, a pure set
> > operation.
> >
> > A reasonable solution would be to use two subqueries with ORDER BY
> > ASC/DESC and LIMIT 2, followed by an aggregation.
> 
> Sadly, no.  If we're talking about a "pure set operation", and the set is
{1,
> 1, 2}, the "two smallest" is {1, 2} but LIMIT 2 would yield {1, 1}.

No, that isn't a set, it's a multiset. A set has no duplicates.

> Here again, my putative nth() function *does* give the right answer,
simply
> because it's a function of the values, and not of the number of values.
> 
> > > There's no debate about the relational operators.
> 
> By which I meant: there's no debate about what they do.
> 
> > You might be surprised to learn that there is considerable academic
> > uncertainty as to exactly which operators should be included.
> 
> There's no uncertainty.  Some operators are defined in terms of others.
No
> suprise: even under De Morgan you don't need OR if you have NOT and AND.
The
> redundancy makes the language more expressive.

Not what I meant: that only covers the conjunctive queries. To that you can
successively add negation, recursion, higher order functions and
fixpoint/while. Each of those allows operations that others do not, but
there is disagreement about which should be considered 'relational'. De
Morgan won't help you there.

> LIMIT doesn't belong in this part of the discussion, btw, because it is
not
> defined relationally.

Yes it is. But it does require a second order function.

> > For example, is CTE RECURSIVE relational, or not?
> 
> http://wiki.epfl.ch/provenance2011/documents/foundations%20of%20databases-
> abiteboul-1995.pdf
> 
> Cf. Chapter 14.  Adding recursion changes the language.  It adds power; if
> memory serves permits answering second-order queries.

Thank you for the reference -- I didn't have that one. I'm familiar with the
material.

No, recursion (similar to fixpoint/while) makes the language Turing
Complete. Second order functions are not enough.

> > What about LEAD and LAG in the windowing functions? What about string
> > concatenation as an aggregation operator?
> 
> AFAIK there's no debate about those, either.  They can be expressed in
terms
> of simpler operations, and exist for convenience, such as it is.

This is the same debate as for LIMIT, because they rely on ordering. Indeed
it's possible to construct one from the other with something like this:

SELECT * FROM ( 
  SELECT *, ROW_NUMBER() OVER (ORDER BY name) as row FROM sys.databases 
 ) a WHERE row > 5 and row <= 10

You can't do SQL windowing without some kind of ordering comparison, but the
use of ordering in making a selection does not automatically make the query
non-relationally. The result is still just a set of tuples, no matter how
you choose them.

BTW this is one very useful extension for Sqlite, which is otherwise full
book on the relational hierarchy.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread dandl

> > Consider this set of integers: 1,3,5,7,42,99,83,11,83,83
> >
> > In this case, there is no subset S1 of size 3 that satisfies your
> criterion.  In an SQL query, the set returned by LIMIT 3 would not be
defined
> uniquely.
> 
> What you've both said is essentially the point I was trying to make.
> 
> 1.  If you want a deterministic portable result for all valid invocations
of
> LIMIT, you need to either constrain it to use with a totally ordered set
(it
> would be an error to use it on something with duplicates) in order to
> guarantee the number of rows specified in the LIMIT argument, or you need
to
> possibly return a different number of rows than the LIMIT argument.
> 
> 2.  Otherwise, if exactly the number of specified rows must be returned
> without other restrictions, then the result is possibly indeterminate.

I agree, with one tiny tweak. The SQL standard already notes that certain
queries of this kind are "implementation-dependent". Here is an example.

"If the  does not contain an , or contains
an  that
does not specify the order of the rows completely, then the rows of the
table have an order that is defined
only to the extent that the  specifies an order and is
otherwise implementation-dependent."

So in option 2 the result should be considered "implementation-dependent"
and might be deterministic (based on information that is not part of the
query) or not.

> The options with point 1 are not only deterministic but fully relational.

Absolutely.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread James K. Lowden

On Wed, 18 May 2016 20:29:26 +1000
"dandl"  wrote:

> > 2.  Otherwise, if exactly the number of specified rows must be
> > returned without other restrictions, then the result is possibly
> > indeterminate.
> 
> I agree, with one tiny tweak. The SQL standard already notes that
> certain queries of this kind are "implementation-dependent". Here is
> an example.
> 
> "If the  does not contain an , or
> contains an  that [...]

Anything implementation-dependent depends on the implementation.  Ergo,
it is not mathematicaly defined.  It has no theoretical basis.  

Darren Duncan  wrote:

> The options with point 1 are not only deterministic but fully
> relational.

Darren, I agree one could define LIMIT to be deterministic.  But, as
you know, deterministic does not imply relational.  

The point I've made consistently is that the input to LIMIT is not a
value in the database.  As such, it couldn't be a relational operator,
and it's no surprise it appears nowhere in the literature.  Crowning it
"fully relational" is stuff and nonsense.  

I personally don't see any value in making it deterministic.  The best
use of LIMIT is to control command-line output while inspecting the
data.  Pretty much every other use invites error.  Deterministic error
doesn't look like much of an improvement to me.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread James K. Lowden

On Wed, 18 May 2016 10:41:21 +1000
"dandl"  wrote:

> > You lost me at "subset S of N tuples".  Which relational operator
> > takes N as an argument?
> 
> Restriction determines whether a tuple should be included or not; you
> also need cardinality and less than (for comparing members).

Restriction is applied to the values of the tuple.  The number of
tuples is not a value of the tuple.  

Neither of us is stupid, David.  I've boiled this down to something
very simple.  If you look at it algebraically, I think you'll come to
the same conclusion I have.  

I wouldn't persist except that you're worth convincing.  Andl holds
promise, and seeks higher ground than SQL holds.  Insofar as possible,
if I can I want to help you get it right.  

> For this query: calculate the average of that set of numbers after
> excluding the 2 largest and 2 smallest values. Again, a pure set
> operation.
> 
> A reasonable solution would be to use two subqueries with ORDER BY
> ASC/DESC and LIMIT 2, followed by an aggregation. 

Sadly, no.  If we're talking about a "pure set operation", and the set
is {1, 1, 2}, the "two smallest" is {1, 2} but LIMIT 2 would 
yield {1, 1}.  

Here again, my putative nth() function *does* give the right answer,
simply because it's a function of the values, and not of the number of
values.  

> > There's no debate about the relational operators.

By which I meant: there's no debate about what they do.  

> You might be surprised to learn that there is considerable academic
> uncertainty as to exactly which operators should be included. 

There's no uncertainty.  Some operators are defined in terms of
others.  No suprise: even under De Morgan you don't need OR if you have
NOT and AND.  The redundancy makes the language more expressive.  

LIMIT doesn't belong in this part of the discussion, btw, because it is
not defined relationally.  

> For example, is CTE RECURSIVE relational, or not? 

http://wiki.epfl.ch/provenance2011/documents/foundations%20of%20databases-abiteboul-1995.pdf

Cf. Chapter 14.  Adding recursion changes the language.  It adds power;
if memory serves permits answering second-order queries.  

> What about LEAD and LAG in the windowing functions? What about string
> concatenation as an aggregation operator?

AFAIK there's no debate about those, either.  They can be expressed in
terms of simpler operations, and exist for convenience, such as it is.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread dandl

> Or we'll answer my original question by breaking down one of the above two
> options.  The documentation for the implementation may simply say that the
> order will be consistent in any one database connection, without ever
saying
> what the order will be.

This is perfectly consistent with Sqlite behaviour in other comparable
situations, and consistent with what the standard says about ORDER BY:
"implementation defined".

> Numerous users of SQLite have assumed this over the years, since if you
> cannot make this assumption you cannot implement cursors or scrolling
windows
> the way they want to, by changing an OFFSET (or remembering the key values
> for the first and last lines) as the user presses line-up, line-down,
page-up
> or page-down.  It's a natural use of SQLite inside any device with a small
> display and I'm sure programmers would be very annoyed if it was difficult
to
> program.  SQLite does what they want even though there's no documentation
> that says it'll work.

Absolutely! Application programmers depend heavily on SQL to do the heavy
lifting in paging, and all the databases I use support it. Andl supports it
in a way that is entirely consistent with relational theory, but many other
purist relational projects have decided not to. [The application programmer
is of course free to further sort the data locally if desired.]

My only point for the original post was that when you are dealing with an
SQL construct that is not defined by the standard, it's helpful if product
maintainers make some attempt to informally implement a common subset;. In
this case Sqlite is mostly compatible with Postgres, but not for LIMIT -1 vs
LIMIT ALL.

> SQLite does break the consistency rule under one circumstance, though it's
> very unlikely that a programmer would trigger it by accident.  It happens
> when you change the setting for
> 
> PRAGMA reverse_unordered_selects = boolean

If you do that presumably you know to expect what you get!

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread Stefan Evert

> On 18 May 2016, at 02:41, dandl  wrote:
> 
> Then you are mistaken. 
> 1. Consider the following set S of integers: 1,3,5,7,42,99,83,11.
> 2. Divide it into two subsets such that S1 is of size 3 and all members of
> S1 are larger than those in S2.
> 
> A sort is unnecessary -- there are many algorithms that can do that purely
> based on set logic, the ability to compare members and the ability to
> determine the cardinality of a set.

I think the point is that your query may not have a solution if there are ties, 
i.e. the ordering is not complete:

Consider this set of integers: 1,3,5,7,42,99,83,11,83,83

In this case, there is no subset S1 of size 3 that satisfies your criterion.  
In an SQL query, the set returned by LIMIT 3 would not be defined uniquely.

Best,
Stefan Evert

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread dandl

> The "problem" is to produce 3 rows where, relationally, the only answers
have
> 2 or 4 rows.  There is no right answer to the problem because there is no
> answer to the problem.

Which is what I said. The solution with 3 rows is unambiguous. You either
resolve this the way the standard does by making it "implementation defined"
or by forcing the query to be unambiguous by adding all the ORDER BY columns
to the SELECT list.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread dandl

> I am quite certain nevertheless that LIMIT has no relational basis.
> Nothing based on Order By could.  

Then you are mistaken. 
1. Consider the following set S of integers: 1,3,5,7,42,99,83,11.
2. Divide it into two subsets such that S1 is of size 3 and all members of
S1 are larger than those in S2.

A sort is unnecessary -- there are many algorithms that can do that purely
based on set logic, the ability to compare members and the ability to
determine the cardinality of a set.

Another query: calculate the average of that set of numbers after excluding
the 2 largest and 2 smallest values. Again, a pure set operation.

> You lost me at "subset S of N tuples".  Which relational operator takes N
as
> an argument?

Restriction determines whether a tuple should be included or not; you also
need cardinality and less than (for comparing members).

> > Not so. In standard SQL ORDER BY establishes a comparison function
> > between tuples and is part of the DECLARE CURSOR syntax, but the
> > cursor exists regardless.
> 
> Regarding Order By and cursors, I'm referencing CJ Date.  The reason Order
By
> cannot appear in a subquery is that its output is not a table, but a
cursor.
> Whether Order By "establishes a comparison function between tuples" is
> irrelevant; so too does Where.  It's the product that's different.

He's correct as far as that goes. But curiously, ORDER BY LIMIT N could
appear in a subquery because it merely selects a subset -- the actual
ordering is irrelevant.

For this query: calculate the average of that set of numbers after excluding
the 2 largest and 2 smallest values. Again, a pure set operation.

A reasonable solution would be to use two subqueries with ORDER BY ASC/DESC
and LIMIT 2, followed by an aggregation. I don't know if any dialect of SQL
would allow that, but it's relationally valid. BTW Andl does allow it.

> If you accept that Order By can appear only in the outermost query, it's
> clear that it has no relational role at all.  It affects only the order in
> which the rows are returned to the caller.  Other than syntax, how is that
> different from a cursor?

As defined in the SQL standard ORDER BY can only appear that's true, but the
standard does not include LIMIT.

> So it's just a tiny communication optimization?  After all, compared to a
> network round trip, sorting the result (in order to apply LIMIT
> rationally) is usually far more expensive.  I bet no study has ever shown
> LIMIT to improve performance measurably, not that that would justify its
> existence.

[This is a sidetrack but no, in most cases network round-trip is important
enough to warrant a solution, although LIMIT is not the only solution.]

> Then I think you mean you agree!  Because LIMIT is nonrelational, it's
> *undefined*.  We have a long thread here that might be titled "what should
> LIMIT do?"  There's no debate about the relational operators.

You might be surprised to learn that there is considerable academic
uncertainty as to exactly which operators should be included. For example,
is CTE RECURSIVE relational, or not? What about LEAD and LAG in the
windowing functions? What about string concatenation as an aggregation
operator?

There's lots more down this particular wormhole, if you want to pursue it.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread Darren Duncan

On 2016-05-18 2:19 AM, Stefan Evert wrote:
>> On 18 May 2016, at 02:41, dandl  wrote:
>>
>> Then you are mistaken.
>> 1. Consider the following set S of integers: 1,3,5,7,42,99,83,11.
>> 2. Divide it into two subsets such that S1 is of size 3 and all members of
>> S1 are larger than those in S2.
>>
>> A sort is unnecessary -- there are many algorithms that can do that purely
>> based on set logic, the ability to compare members and the ability to
>> determine the cardinality of a set.
>
> I think the point is that your query may not have a solution if there are 
> ties, i.e. the ordering is not complete:
>
> Consider this set of integers: 1,3,5,7,42,99,83,11,83,83
>
> In this case, there is no subset S1 of size 3 that satisfies your criterion.  
> In an SQL query, the set returned by LIMIT 3 would not be defined uniquely.

What you've both said is essentially the point I was trying to make.

1.  If you want a deterministic portable result for all valid invocations of 
LIMIT, you need to either constrain it to use with a totally ordered set (it 
would be an error to use it on something with duplicates) in order to guarantee 
the number of rows specified in the LIMIT argument, or you need to possibly 
return a different number of rows than the LIMIT argument.

2.  Otherwise, if exactly the number of specified rows must be returned without 
other restrictions, then the result is possibly indeterminate.

The options with point 1 are not only deterministic but fully relational.

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-18 Thread Simon Slavin

On 18 May 2016, at 2:05am, dandl  wrote:

>> The "problem" is to produce 3 rows where, relationally, the only answers
> have
>> 2 or 4 rows.  There is no right answer to the problem because there is no
>> answer to the problem.
> 
> Which is what I said. The solution with 3 rows is unambiguous. You either
> resolve this the way the standard does by making it "implementation defined"
> or by forcing the query to be unambiguous by adding all the ORDER BY columns
> to the SELECT list.

Or we'll answer my original question by breaking down one of the above two 
options.  The documentation for the implementation may simply say that the 
order will be consistent in any one database connection, without ever saying 
what the order will be.

Numerous users of SQLite have assumed this over the years, since if you cannot 
make this assumption you cannot implement cursors or scrolling windows the way 
they want to, by changing an OFFSET (or remembering the key values for the 
first and last lines) as the user presses line-up, line-down, page-up or 
page-down.  It's a natural use of SQLite inside any device with a small display 
and I'm sure programmers would be very annoyed if it was difficult to program.  
SQLite does what they want even though there's no documentation that says it'll 
work.

SQLite does break the consistency rule under one circumstance, though it's very 
unlikely that a programmer would trigger it by accident.  It happens when you 
change the setting for

PRAGMA reverse_unordered_selects = boolean

Simon.

[sqlite] Sqlite incompatibility with Postgres

2016-05-17 Thread James K. Lowden

On Tue, 17 May 2016 11:09:53 +1000
"dandl"  wrote:

> Any disagreement so far?

Full agreement; your description is perfectly sound.  

I am quite certain nevertheless that LIMIT has no relational basis.
Nothing based on Order By could.  And I'll try to clear up what I meant
by a cursor.  

> So the "3" is a perfectly valid argument for a set-oriented theory:
> find a subset S of N tuples with the following test for set
> membership: that each member of S is greater than each member not in
> S when compared by certain attributes, for N = 3. Pure set logic with
> a membership function.

You lost me at "subset S of N tuples".  Which relational operator takes
N as an argument?  

You could be right vis a vis set theory.  But strictly within
relational theory, I'll cede your point when you demonstrate it
relationally.  The N in "N tuples" is not to be found in the relation's
extension. Even if we include aggregation, all relational functions
operate on the *values* of the set, or functions of the values of the
set (e.g. min() or avg()).  N is not among them.  

> > "Order by 1" is always valid.
> 
> By analogy, not because they're the same. In order to apply LIMIT 3
> the query parser should require a test of set membership that is fully
> determined for every member. It can do that by either requiring all
> select list columns to appear in the ORDER BY, or by applying other
> constraints such as a unique key. 

Unless your point is constrained to the LIMIT operator, you're making a
mistake here.  Normally the Select list is a superset of the Order By
list.  If they need to be equal for LIMIT to work, that's just one more
nail in LIMIT's coffin, a byproduct of ts nonrelationality.  

Given a table T {A,B}, what's wrong with "select A from T
order by B" (even though that's not valid SQL)? The system has access
to the full table, can sort by B and project A. Nothing to do with
indexes or constraints.  The query is invalid not because it couldn't
be executed, but because it's undefined: there's no "sort" operator to
apply before "project".  Order By, not being a relational function, is
a filter applied to the relational product.  

( Yes, "filter"; filters don't alway remove: 
$ echo hello | rev
olleh
)

All members of the Order By list must appear in the Select list because
the Select list is input to Order By, see next.  

> > > Order By just produces a cursor for convenient traversal of the
> > > results.
> 
> Not so. In standard SQL ORDER BY establishes a comparison function
> between tuples and is part of the DECLARE CURSOR syntax, but the
> cursor exists regardless.

Regarding Order By and cursors, I'm referencing CJ Date.  The reason
Order By cannot appear in a subquery is that its output is not a table,
but a cursor.  Whether Order By "establishes a comparison function
between tuples" is irrelevant; so too does Where.  It's the product
that's different.  

If you accept that Order By can appear only in the outermost query,
it's clear that it has no relational role at all.  It affects only the
order in which the rows are returned to the caller.  Other than syntax,
how is that different from a cursor?  

> The point of LIMIT is that it is a complete query; the rows can
> be returned in a single network round trip; the result set can be
> discarded.

So it's just a tiny communication optimization?  After all, compared to
a network round trip, sorting the result (in order to apply LIMIT
rationally) is usually far more expensive.  I bet no study has ever
shown LIMIT to improve performance measurably, not that that would
justify its existence.  

> > LIMIT causes harm.  Bad results come of bad math.
> 
> Disagree. The problem (if there is one) is that it is not
> well-defined.

Then I think you mean you agree!  Because LIMIT is nonrelational, it's
*undefined*.  We have a long thread here that might be titled "what
should LIMIT do?"  There's no debate about the relational operators.
It's only the ad hoc add-ons that present problems, precisely because
they lie outside the theory and provide (unwittingly) idiosyncratic
behavior.  Like NULL, LIMIT is unspecified by the theory.  Like NULL,
LIMIT is hard to get "right" because each implementation has to make
its own decision about what it means, instead of relying on the math.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-17 Thread James K. Lowden

On Tue, 17 May 2016 11:09:53 +1000
"dandl"  wrote:

> > I'll invent here and now to replace LIMIT:  nth().  
> 
> The issue is find the "top N". This does not solve the problem.

nth() does find "top N".  For any query, nth(c, N) returns N rows.  It
also exposes the arbitrariness of LIMIT.  To use nth() correctly for
Simon's data

select * from T as out where exists ( 
select 1 from T
where out.first = first and out.second = second
group by second
having nth(second, 2) <= out.second
);

produces 4 rows.  Deterministically.  Unambiguously.  

The "problem" is to produce 3 rows where, relationally, the only
answers have 2 or 4 rows.  There is no right answer to the problem
because there is no answer to the problem.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-17 Thread dandl

> > > > first   second
> > > > -   --
> > > > MarkSpark
> > > > Emily   Spark
> > > > MarySoper
> > > > Brian   Soper
> > > >
> > > > SELECT first,second FROM members ORDER BY second LIMIT 3
> 
> First, hat tip to Simon for providing a motivating example.  :-)
> 
> The question illustrates what I mean when I say Limit is not "rooted in
the
> data": in this case, "3" is not in the data, and is not a function of the
> data.  Having introduced an extraneous arbitrary element, ambituity and
> contradiction are inevitable.  It's practically the definition of a hack,
> right?  Does the job, albeit incorrectly.

Not so. First: a couple of facts to avoid misunderstanding.

1. Relational theory is a theory of set operations on tuples. Any query that
can be expressed as a set operation is valid.
2. In order to perform the familiar operations of restriction (WHERE) and
join, scalar operations are allowed on values of attributes (columns). Those
operations include:
a) compare equal (all types)
b) compare greater/less than, if the value is of any ordered type
c) expression evaluation, to construct new values of any type.

Any attribute that can be compared greater/less for the purpose of
restriction can also be used in a query that finds the largest (1 or N) or
smallest (1 or N) of that attribute. This is pure relational theory, most of
it already known to Codd back in 1972. Any disagreement so far?

So the "3" is a perfectly valid argument for a set-oriented theory: find a
subset S of N tuples with the following test for set membership: that each
member of S is greater than each member not in S when compared by certain
attributes, for N = 3. Pure set logic with a membership function.

> > I would say that this is an invalid query. As already applies for
> > DISTINCT and GROUP BY, the query parser should require that every
> > column in the column list should appear in the ORDER BY list. If it
> > does not, then the result is indeterminate.
> 
> Order By does not requre Group By, and the Select list is a *superset* of
the
> Order By list.  I'm not sure where you got the notion that the the Select
and
> Order By sets are equal.  "Order by 1" is always valid.

By analogy, not because they're the same. In order to apply LIMIT 3 the
query parser should require a test of set membership that is fully
determined for every member. It can do that by either requiring all select
list columns to appear in the ORDER BY, or by applying other constraints
such as a unique key. If it does not, then the results of the query depend
on information that is not part of the query (ie not deterministic).

> David, permit me to elaborate on my indictment of LIMIT.  You said
> earlier:
> 
> > You can't sort the relation, but you can certainly apply an order when
> > performing a query. How else would MIN() work?
> 
> I'm not disputing that.  Window functions even require multiple sorts in
the
> same query.
> 
> Whether or not "LIMIT is perfectly relational", we do know relational
algebra
> has no Sort operator, and that Order By is never part of an input to a
> relational operation (because of course relatational operands have no
order).
> Order By just produces a cursor for convenient traversal of the results.

Not so. In standard SQL ORDER BY establishes a comparison function between
tuples and is part of the DECLARE CURSOR syntax, but the cursor exists
regardless.

In a query retrieved by an external API there is no requirement for a cursor
to ever exist (it's undefined, and not required by relational theory).

> I'd be perfectly fine with a function I'll invent here and now to replace
> LIMIT:  nth().  It's a generalization of min(); the
> construction nth(C, 1) is equivalent to min(C).   You use it this way:
> 
>   SELECT first,second
>   FROM members
>   where second < nth(second, 2)
> 
> That query is based in the data.  It's unambiguous.  Given Simon's input,
it
> produces 2 rows; with "< 3" it produces 4 rows.  It can be used without
Order
> By (for the same reason min() can).  While it
> *implies* a sort, it doesn't require one (because indexes), as LIMIT does.
> And, like min() and unlike Order By, it can be used in a subquery.

The issue is find the "top N". This does not solve the problem.

> LIMIT is a hack.  It's an "obvious" extension to SQL, so simple it needn't
> even be part of it, because the program reading the rows from the DBMS can
> always stop wherever it wants.  Simple things are always implemented
freely -
> - even if unnecessary or misbegotten, simply because they're easy to do
and
> understand -- and LIMIT was
> no exception.   

I disagree. The point of LIMIT is that it is a complete query; the rows can
be returned in a single network round trip; the result set can be discarded.

Ironically, though, seemingly simple things are very
> hard, sometimes impossible, to explain mathematically.  In that way, LIMIT
> shelters under the same roof as NULL and SQL's use of bags instead of
sets.

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread dandl

> > All true.  But it brings up a question.  Suppose the following:
> >
> > first   second
> > -   --
> > MarkSpark
> > Emily   Spark
> > MarySoper
> > Brian   Soper
> >
> > SELECT first,second FROM members ORDER BY second LIMIT 3
> >
> I think a proper solution for this then is to treat the LIMIT as
approximate
> rather than exact; it indicates a desire rather than a promise.
> 
> In the scenario you describe, the query should return either 2 rows or 4
> rows, so that ALL of the rows whose second field value of "Spark" are, or
are
> not, returned.  Projecting this to there not being an ORDER BY clause,
either
> all rows are returned or zero rows are returned.  Thus the result is
> deterministic.
> 
> Whether returning above or below the limit is done, is a separate thing to
> decide, though I suggest returning above is better.

I would say that this is an invalid query. As already applies for DISTINCT
and GROUP BY, the query parser should require that every column in the
column list should appear in the ORDER BY list. If it does not, then the
result is indeterminate.

Sqlite already permits indeterminate queries, but other SQL engines do not.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread Simon Slavin

On 16 May 2016, at 5:29am, Darren Duncan  wrote:

> On 2016-05-15 12:35 AM, Simon Slavin wrote:
> 
>> firstsecond
>> ---
>> Mark Spark
>> EmilySpark
>> Mary Soper
>> BrianSoper
>> 
>> SELECT first,second FROM members ORDER BY second LIMIT 3
> 
> I think a proper solution for this then is to treat the LIMIT as approximate 
> rather than exact; it indicates a desire rather than a promise.
> 
> In the scenario you describe, the query should return either 2 rows or 4 rows

What ?  For a query on a 4 row table, where I have clearly and explicitly 
requested a limit of 3 rows, it can be correct to return 2 or 4 rows ?  I would 
imagine that many programmers would find that annoying and counter-intuitive.

To the other people who answered using the word 'deterministic', I would like 
them to answer my question: do you /know/ that your SQL engine is going to 
return the same three rows each time ?  Or if no changes are made to the data 
while you execute the following two commands

SELECT first,second FROM members ORDER BY second LIMIT 3;
SELECT first,second FROM members ORDER BY second LIMIT 3 OFFSET 3

(or whatever the syntax is for your favourite implementation) are you sure 
you're going to get no rows twice ?  Is it documented ?

Simon.

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread James K. Lowden

On Mon, 16 May 2016 16:17:35 +1000
"dandl"  wrote:

> > > All true.  But it brings up a question.  Suppose the following:
> > >
> > > first second
> > > - --
> > > Mark  Spark
> > > Emily Spark
> > > Mary  Soper
> > > Brian Soper
> > >
> > > SELECT first,second FROM members ORDER BY second LIMIT 3

First, hat tip to Simon for providing a motivating example.  :-)  

The question illustrates what I mean when I say Limit is not "rooted in
the data": in this case, "3" is not in the data, and is not a function
of the data.  Having introduced an extraneous arbitrary element,
ambituity and contradiction are inevitable.  It's practically the
definition of a hack, right?  Does the job, albeit incorrectly.  

> I would say that this is an invalid query. As already applies for
> DISTINCT and GROUP BY, the query parser should require that every
> column in the column list should appear in the ORDER BY list. If it
> does not, then the result is indeterminate.

Order By does not requre Group By, and the Select list is a *superset*
of the Order By list.  I'm not sure where you got the notion that the
the Select and Order By sets are equal.  "Order by 1" is always valid.  

David, permit me to elaborate on my indictment of LIMIT.  You said
earlier: 

> You can't sort the relation, but you can certainly apply an order when
> performing a query. How else would MIN() work?

I'm not disputing that.  Window functions even require multiple sorts
in the same query.  

Whether or not "LIMIT is perfectly relational", we do know relational
algebra has no Sort operator, and that Order By is never part of an
input to a relational operation (because of course relatational operands
have no order). Order By just produces a cursor for convenient traversal
of the results.  

I'd be perfectly fine with a function I'll invent here and now to
replace LIMIT:  nth().  It's a generalization of min(); the
construction nth(C, 1) is equivalent to min(C).   You use it this way: 

SELECT first,second 
FROM members
where second < nth(second, 2)

That query is based in the data.  It's unambiguous.  Given Simon's
input, it produces 2 rows; with "< 3" it produces 4 rows.  It can be
used without Order By (for the same reason min() can).  While it
*implies* a sort, it doesn't require one (because indexes), as LIMIT
does. And, like min() and unlike Order By, it can be used in a
subquery.  

LIMIT is a hack.  It's an "obvious" extension to SQL, so simple it
needn't even be part of it, because the program reading the rows from
the DBMS can always stop wherever it wants.  Simple things are always
implemented freely -- even if unnecessary or misbegotten, simply
because they're easy to do and understand -- and LIMIT was
no exception.   Ironically, though, seemingly simple things are very
hard, sometimes impossible, to explain mathematically.  In that way,
LIMIT shelters under the same roof as NULL and SQL's use of bags
instead of sets.  

While that's an abstract argument, it's at the root of very practical
problems.  LIMIT is a FAQ on this mailing list.  Given the number of
SQLite programmers, we can bet every day someone uses limit in a
subquery, getting or concealing a nondeterministic result.  Which is to
say: LIMIT causes harm.  Bad results come of bad math.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread R Smith

On 2016/05/16 8:17 AM, dandl wrote:
>> all rows are returned or zero rows are returned.  Thus the result is
>> deterministic.
>>
>> Whether returning above or below the limit is done, is a separate thing to
>> decide, though I suggest returning above is better.
> I would say that this is an invalid query. As already applies for DISTINCT
> and GROUP BY, the query parser should require that every column in the
> column list should appear in the ORDER BY list. If it does not, then the
> result is indeterminate.
>
> Sqlite already permits indeterminate queries, but other SQL engines do not.

I don't think any DB in existence forces you to specify every column (in 
the column list) also in the ORDER BY list? If you call this 
indeterminate, then you cannot say that "other SQL engines" do not allow 
indeterminate queries. They all do.

What they don't allow is GROUP BY queries without aggregate specifiers 
on columns not included in the groupings. I also think that while SQLite 
allows this, it's not an oversight, there is an implicit method to it 
which may not be deterministic per se, but it is the Query maker's 
choice. Sometimes you don't care about the determinism - much like 
sometimes you don't care about the rest of the results when you say 
LIMIT 5. As Simon pointed out - ORDER BY and LIMIT etc already allows us 
to ask things that are not deterministic (in any DB) but not every 
result needs to be deterministic.

If the question is: "Name 5 marine animals", nobody cares which 5 you 
name, or in which order you name them.

It always seemed strange to me that in other DB engines when I KNOW the 
values in column 3 of the query is always going to be "20" in the 
grouping that I group by (or even when it doesn't matter what that value 
is), I can't even get my query to run in MSSQL if I don't go put that 
stupid column also in a MAX() function or remove it - so it forces me to 
do a most useless bit of extra coding so it itself can sleep well at 
night reveling in its own determinism.

It's like a car preventing you to steer out of your lane when you 
haven't switched on your indicator... An accident waiting to happen.

Note: I understand not all drivers are equal, and determinism in DBs is 
the safer choice, and I agree with it (And yes, I'm still one of the 
proponents of adding the "strict" mode in SQLite - please devs) - I just 
find it refreshing that some allow me to express what I really want in 
stead of making me type functions that has no value, and I can't agree 
calling such a feature "wrongful". Especially if the non-determinism of 
it is well-documented.

Most of the outcries I've seen on this forum to do with the above (or 
the ducky typing etc.) comes from people who jumps from using some other 
DB server to SQLite and didn't read the documentation, and then be 
bitten by it not doing things as they expect. They then proceed to offer 
many reasons why they should be justified in "expecting" the DB to do 
whatever they expected (with some merit, I should add), mostly citing 
another big DB engine's manners. All this in lieu of just reading the 
SQLite manual.  If they not going to read the docs, then it doesn't 
really matter which behaviour is chosen, it will surprise someone.

Cheers,
Ryan

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread Keith Medcalf


That is irrelevant.  You said three rows, so three rows should be returned.  
The language should not attempt to implement wishy washy conditions guessing 
what you might have meant.  If you wanted a maximum of three rows containing 
the SAME firstname, then that is what you would have asked for.  SInce that is 
not what you asked for, returning eanything else is an error.

There is no guarantee that they will be the same three rows or that they will 
appear in the same order, if the same query is re-executed.  There never is, 
unless both queries are performed in the same transaction.

> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Darren Duncan
> Sent: Sunday, 15 May, 2016 22:30
> To: SQLite mailing list
> Subject: Re: [sqlite] Sqlite incompatibility with Postgres
> 
> On 2016-05-15 12:35 AM, Simon Slavin wrote:
> > On 15 May 2016, at 6:04am, Darren Duncan 
> wrote:
> >
> >> You seem to be forgetting the fact that LIMIT/OFFSET is not its own
> clause, rather it is an extension to the ORDER BY clause and only has
> meaning within the context of the ORDER BY it is part of.
> >
> > All true.  But it brings up a question.  Suppose the following:
> >
> > first   second
> > -   --
> > MarkSpark
> > Emily   Spark
> > MarySoper
> > Brian   Soper
> >
> > SELECT first,second FROM members ORDER BY second LIMIT 3
> >
> > Without looking up either a standards document for SQL or the
> documentation for your favourite implementations of SQL, answer this
> question:
> >
> > Does the documentation for your favourite implementation of SQL state
> that you'll get the same rows every time you execute the above "SELECT" ?
> 
> I think a proper solution for this then is to treat the LIMIT as
> approximate
> rather than exact; it indicates a desire rather than a promise.
> 
> In the scenario you describe, the query should return either 2 rows or 4
> rows,
> so that ALL of the rows whose second field value of "Spark" are, or are
> not,
> returned.  Projecting this to there not being an ORDER BY clause, either
> all
> rows are returned or zero rows are returned.  Thus the result is
> deterministic.
> 
> Whether returning above or below the limit is done, is a separate thing to
> decide, though I suggest returning above is better.
> 
> -- Darren Duncan
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread Darren Duncan

On 2016-05-15 11:17 PM, dandl wrote:
>>> All true.  But it brings up a question.  Suppose the following:
>>>
>>> first   second
>>> -   --
>>> MarkSpark
>>> Emily   Spark
>>> MarySoper
>>> Brian   Soper
>>>
>>> SELECT first,second FROM members ORDER BY second LIMIT 3
>>>
>> I think a proper solution for this then is to treat the LIMIT as
> approximate
>> rather than exact; it indicates a desire rather than a promise.
>>
>> In the scenario you describe, the query should return either 2 rows or 4
>> rows, so that ALL of the rows whose second field value of "Spark" are, or
> are
>> not, returned.  Projecting this to there not being an ORDER BY clause,
> either
>> all rows are returned or zero rows are returned.  Thus the result is
>> deterministic.
>>
>> Whether returning above or below the limit is done, is a separate thing to
>> decide, though I suggest returning above is better.
>
> I would say that this is an invalid query. As already applies for DISTINCT
> and GROUP BY, the query parser should require that every column in the
> column list should appear in the ORDER BY list. If it does not, then the
> result is indeterminate.
>
> Sqlite already permits indeterminate queries, but other SQL engines do not.

Not EVERY column, just a superkey of the result columns should suffice.
-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-16 Thread Darren Duncan

On 2016-05-15 10:50 PM, J Decker wrote:
> On Sun, May 15, 2016 at 10:02 PM, Darren Duncan  
> wrote:
>> On 2016-05-15 9:56 PM, J Decker wrote:
>>>
>>> On Sun, May 15, 2016 at 9:29 PM, Darren Duncan 
>>> wrote:

 On 2016-05-15 12:35 AM, Simon Slavin wrote:
>
> All true.  But it brings up a question.  Suppose the following:
>
> first   second
> -   --
> MarkSpark
> Emily   Spark
> MarySoper
> Brian   Soper
>
> SELECT first,second FROM members ORDER BY second LIMIT 3
>
> Without looking up either a standards document for SQL or the
> documentation for your favourite implementations of SQL, answer this
> question:
>
> Does the documentation for your favourite implementation of SQL state
> that
> you'll get the same rows every time you execute the above "SELECT" ?

 I think a proper solution for this then is to treat the LIMIT as
 approximate
 rather than exact; it indicates a desire rather than a promise.

 In the scenario you describe, the query should return either 2 rows or 4
 rows, so that ALL of the rows whose second field value of "Spark" are, or
 are not, returned.  Projecting this to there not being an ORDER BY
 clause,
 either all rows are returned or zero rows are returned.  Thus the result
 is
 deterministic.
>>>
>>> even if it did for 'spark' rows (which either case you suggest would
>>> be horrible) 'soper' would still be non-deterministic, and rebuilding
>>> indexes could reorder the results.
>>
>> No, it is still deterministic.
>>
>> The ORDER BY clause specified a partial order of the results, not a total
>> order.
>>
>> What I specified returns only complete groups of rows where within each
>> group the rows are unordered but the groups as a whole are ordered relative
>> to each other.
>>
>> The fact this is deterministic would probably be more clear if the result
>> rows were nested, one outer row per "group" that I mentioned.  But even if
>> not, the caller knew that they were only ordering by second but selecting
>> first, so if they see multiple rows with the same second value, they know
>> that those rows are not sorted between themselves, only that rows with
>> different second values are sorted relative to each other.
>>
>> So fully deterministic.
>
> 'SELECT first,second FROM members ORDER BY second LIMIT 3' (that's
> mysql format right?)
>
> I don't see a full set as a requirement (such that the output would be
> 2 or 4 records and not the 3 I asked for...) .  the query implies 3
> rows, not 3 sets.

I never said interpret it as 3 sets, I said round the number of rows to the 
nearest whole set (in either the up or down direction).

> SELECT first,second FROM members ORDER BY second LIMIT 3,3 (for the
> next 3 lines I'm displaying on a form for instance)
>
> and specifying that the result set includes a first name, the result
> sets taken as a hole are not guaranteed equal (procedurally and in
> practice they may be, but pessimistically...).

If someone is doing pagination, then at least one of these scenarios should be 
true:

1.  The ORDER BY is being done on a superkey, which guarantees a total order.

2.  Row groups of > 1 row that cross a boundary are returned by BOTH queries, 
eg 
"ORDER BY second LIMIT 3 OFFSET 0" and "ORDER BY second LIMIT 3 OFFSET 3" would 
both return both "Spark" rows, thus ensuring there are no holes.

Clearly, option 1 is better, but failing that, having pages that overlap is NOT 
a bad thing and even has precedent (assuming users are made aware pages can 
overlap), such as often happens in say a GUI scrolling list.

On a further note...

While this behavior isn't something easily portable across DBMSs, SQLite can 
give deterministic results for itself, at least for "with rowid" tables, if it 
implicitly sorts by the rowid after the explicit terms, as if the user had 
written "rowid" at the end of the ORDER BY list.  Since the rowid is an actual 
table column exposed to the user and not just a hidden implementation detail, 
this is consistent with the relational model.

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread J Decker

On Sun, May 15, 2016 at 10:02 PM, Darren Duncan  
wrote:
> On 2016-05-15 9:56 PM, J Decker wrote:
>>
>> On Sun, May 15, 2016 at 9:29 PM, Darren Duncan 
>> wrote:
>>>
>>> On 2016-05-15 12:35 AM, Simon Slavin wrote:


 All true.  But it brings up a question.  Suppose the following:

 first   second
 -   --
 MarkSpark
 Emily   Spark
 MarySoper
 Brian   Soper

 SELECT first,second FROM members ORDER BY second LIMIT 3

 Without looking up either a standards document for SQL or the
 documentation for your favourite implementations of SQL, answer this
 question:

 Does the documentation for your favourite implementation of SQL state
 that
 you'll get the same rows every time you execute the above "SELECT" ?
>>>
>>>
>>> I think a proper solution for this then is to treat the LIMIT as
>>> approximate
>>> rather than exact; it indicates a desire rather than a promise.
>>>
>>> In the scenario you describe, the query should return either 2 rows or 4
>>> rows, so that ALL of the rows whose second field value of "Spark" are, or
>>> are not, returned.  Projecting this to there not being an ORDER BY
>>> clause,
>>> either all rows are returned or zero rows are returned.  Thus the result
>>> is
>>> deterministic.
>>
>>
>> even if it did for 'spark' rows (which either case you suggest would
>> be horrible) 'soper' would still be non-deterministic, and rebuilding
>> indexes could reorder the results.
>
>
> No, it is still deterministic.
>
> The ORDER BY clause specified a partial order of the results, not a total
> order.
>
> What I specified returns only complete groups of rows where within each
> group the rows are unordered but the groups as a whole are ordered relative
> to each other.
>
> The fact this is deterministic would probably be more clear if the result
> rows were nested, one outer row per "group" that I mentioned.  But even if
> not, the caller knew that they were only ordering by second but selecting
> first, so if they see multiple rows with the same second value, they know
> that those rows are not sorted between themselves, only that rows with
> different second values are sorted relative to each other.
>
> So fully deterministic.
>

'SELECT first,second FROM members ORDER BY second LIMIT 3' (that's
mysql format right?)

I don't see a full set as a requirement (such that the output would be
2 or 4 records and not the 3 I asked for...) .  the query implies 3
rows, not 3 sets.

SELECT first,second FROM members ORDER BY second LIMIT 3,3 (for the
next 3 lines I'm displaying on a form for instance)


and specifying that the result set includes a first name, the result
sets taken as a hole are not guaranteed equal (procedurally and in
practice they may be, but pessimistically...).



>
> -- Darren Duncan
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread Darren Duncan

On 2016-05-15 9:56 PM, J Decker wrote:
> On Sun, May 15, 2016 at 9:29 PM, Darren Duncan  
> wrote:
>> On 2016-05-15 12:35 AM, Simon Slavin wrote:
>>>
>>> All true.  But it brings up a question.  Suppose the following:
>>>
>>> first   second
>>> -   --
>>> MarkSpark
>>> Emily   Spark
>>> MarySoper
>>> Brian   Soper
>>>
>>> SELECT first,second FROM members ORDER BY second LIMIT 3
>>>
>>> Without looking up either a standards document for SQL or the
>>> documentation for your favourite implementations of SQL, answer this
>>> question:
>>>
>>> Does the documentation for your favourite implementation of SQL state that
>>> you'll get the same rows every time you execute the above "SELECT" ?
>>
>> I think a proper solution for this then is to treat the LIMIT as approximate
>> rather than exact; it indicates a desire rather than a promise.
>>
>> In the scenario you describe, the query should return either 2 rows or 4
>> rows, so that ALL of the rows whose second field value of "Spark" are, or
>> are not, returned.  Projecting this to there not being an ORDER BY clause,
>> either all rows are returned or zero rows are returned.  Thus the result is
>> deterministic.
>
> even if it did for 'spark' rows (which either case you suggest would
> be horrible) 'soper' would still be non-deterministic, and rebuilding
> indexes could reorder the results.

No, it is still deterministic.

The ORDER BY clause specified a partial order of the results, not a total order.

What I specified returns only complete groups of rows where within each group 
the rows are unordered but the groups as a whole are ordered relative to each 
other.

The fact this is deterministic would probably be more clear if the result rows 
were nested, one outer row per "group" that I mentioned.  But even if not, the 
caller knew that they were only ordering by second but selecting first, so if 
they see multiple rows with the same second value, they know that those rows 
are 
not sorted between themselves, only that rows with different second values are 
sorted relative to each other.

So fully deterministic.

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread J Decker

On Sun, May 15, 2016 at 9:29 PM, Darren Duncan  
wrote:
> On 2016-05-15 12:35 AM, Simon Slavin wrote:
>>
>> On 15 May 2016, at 6:04am, Darren Duncan  wrote:
>>
>>> You seem to be forgetting the fact that LIMIT/OFFSET is not its own
>>> clause, rather it is an extension to the ORDER BY clause and only has
>>> meaning within the context of the ORDER BY it is part of.
>>
>>
>> All true.  But it brings up a question.  Suppose the following:
>>
>> first   second
>> -   --
>> MarkSpark
>> Emily   Spark
>> MarySoper
>> Brian   Soper
>>
>> SELECT first,second FROM members ORDER BY second LIMIT 3
>>
>> Without looking up either a standards document for SQL or the
>> documentation for your favourite implementations of SQL, answer this
>> question:
>>
>> Does the documentation for your favourite implementation of SQL state that
>> you'll get the same rows every time you execute the above "SELECT" ?
>
>
> I think a proper solution for this then is to treat the LIMIT as approximate
> rather than exact; it indicates a desire rather than a promise.
>
> In the scenario you describe, the query should return either 2 rows or 4
> rows, so that ALL of the rows whose second field value of "Spark" are, or
> are not, returned.  Projecting this to there not being an ORDER BY clause,
> either all rows are returned or zero rows are returned.  Thus the result is
> deterministic.

even if it did for 'spark' rows (which either case you suggest would
be horrible) 'soper' would still be non-deterministic, and rebuilding
indexes could reorder the results.

>
> Whether returning above or below the limit is done, is a separate thing to
> decide, though I suggest returning above is better.
>
> -- Darren Duncan
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread Darren Duncan

On 2016-05-15 12:35 AM, Simon Slavin wrote:
> On 15 May 2016, at 6:04am, Darren Duncan  wrote:
>
>> You seem to be forgetting the fact that LIMIT/OFFSET is not its own clause, 
>> rather it is an extension to the ORDER BY clause and only has meaning within 
>> the context of the ORDER BY it is part of.
>
> All true.  But it brings up a question.  Suppose the following:
>
> first second
> - --
> Mark  Spark
> Emily Spark
> Mary  Soper
> Brian Soper
>
> SELECT first,second FROM members ORDER BY second LIMIT 3
>
> Without looking up either a standards document for SQL or the documentation 
> for your favourite implementations of SQL, answer this question:
>
> Does the documentation for your favourite implementation of SQL state that 
> you'll get the same rows every time you execute the above "SELECT" ?

I think a proper solution for this then is to treat the LIMIT as approximate 
rather than exact; it indicates a desire rather than a promise.

In the scenario you describe, the query should return either 2 rows or 4 rows, 
so that ALL of the rows whose second field value of "Spark" are, or are not, 
returned.  Projecting this to there not being an ORDER BY clause, either all 
rows are returned or zero rows are returned.  Thus the result is deterministic.

Whether returning above or below the limit is done, is a separate thing to 
decide, though I suggest returning above is better.

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread dandl

> I suggest the reason LIMIT hasn't been standardized is that it's contrary
to
> the fundamental idea that rows in a table have no meaningful order.  SQL
> doesn't honor relational theory with complete
> fidelity, but at least that horse is still in the barn.

Point 1: I think you'll find plenty of bits of SQL that are not relational
-- that's not the reason.
Point 2: LIMIT is perfectly relational, as long as it is applied with ORDER
BY. While the data set has no order, that in no way prevents performing a
query that does. Such as:

"Show me the customers that are within the first 10 when ordered by name."

You can't sort the relation, but you can certainly apply an order when
performing a query. How else would MIN() work?

> The problem with LIMIT is it's not based in the data.  Cutting off results
at
> some arbitrary N tells you *nothing* about the data other than that N or
more
> rows met the criteria.  Note that predicate logic has constructs for "for
> all" and "there exists" , but not "are some"!
> 
> I have yet to see a query using LIMIT 1 posted on this list that cannot be
> expressed -- better, IMO -- with min().  Queries that limit the results to
> "top N" to support things like pagination inevitably include assumptions
> about transactionality (or lack thereof) that are either invalid or ill-
> considered.  Every one would be better served either by just fetching the
> needed rows as required (and letting pending rows pend), or by supplying
the
> last "high" value as a minimum for the WHERE clause instead of an OFFSET.
> Were I a fan of conspiracies, I'd suspect the LIMIT-OFFSET constructs were
> invented by antilogicians to prevent learning and hobble performance.

Your criticism re LIMIT is mistaken. It is a perfectly reasonably way to
support pagination, and can actually be performed (rather laboriously) using
SQL Window functions (which Sqlite does not have). It can also be useful in
some algorithms where you need the top 2 or 3 or whatever.

Your criticism re OFFSET has some basis. It is usually better (as you
suggest) to provide a previous row value for pagination. It's just that once
you've done LIMIT, OFFSET is easy to implement and sometimes useful.

> By the way, i'm also a LIMIT club member, with limits.  I use it for
> convenience on the command line while exploring data.  It's great for
that,
> in the absence of pager support.  Maybe keeping it a little "weird" will
help
> remind new developers to use it as a convenience instead of a crutch.

I think most developers are just astonished at how much it differs from one
dialect to another. Astonished, but not impressed.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-15 Thread Simon Slavin

On 15 May 2016, at 6:04am, Darren Duncan  wrote:

> You seem to be forgetting the fact that LIMIT/OFFSET is not its own clause, 
> rather it is an extension to the ORDER BY clause and only has meaning within 
> the context of the ORDER BY it is part of.

All true.  But it brings up a question.  Suppose the following:

first   second
-   --
MarkSpark
Emily   Spark
MarySoper
Brian   Soper

SELECT first,second FROM members ORDER BY second LIMIT 3

Without looking up either a standards document for SQL or the documentation for 
your favourite implementations of SQL, answer this question:

Does the documentation for your favourite implementation of SQL state that 
you'll get the same rows every time you execute the above "SELECT" ?

Simon.

[sqlite] Sqlite incompatibility with Postgres

2016-05-14 Thread J Decker

Someone should wake the SQL standard committee and nominate sqlite
being the broadest used so should fill some gaps in specification with
it. :)  and conversely blame the others for their incompatibility :)

[sqlite] Sqlite incompatibility with Postgres

2016-05-14 Thread Darren Duncan

On 2016-05-14 11:30 AM, James K. Lowden wrote:
> I suggest the reason LIMIT hasn't been standardized is that it's
> contrary to the fundamental idea that rows in a table have no
> meaningful order.  SQL doesn't honor relational theory with complete
> fidelity, but at least that horse is still in the barn.
>
> The problem with LIMIT is it's not based in the data.  Cutting off
> results at some arbitrary N tells you *nothing* about the data other
> than that N or more rows met the criteria.  Note that predicate logic
> has constructs for "for all" and "there exists" , but not "are some"!

You seem to be forgetting the fact that LIMIT/OFFSET is not its own clause, 
rather it is an extension to the ORDER BY clause and only has meaning within 
the 
context of the ORDER BY it is part of.

That being said, one could argue that LIMIT by itself (no offset) could be 
standalone, but then without an ORDER BY all say "LIMIT N" means is "give me a 
random subset of size N of the rows", but then there probably is alternate 
syntax that may say this more explicitly, eg "PICK N".

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-14 Thread James K. Lowden

On Fri, 13 May 2016 15:13:01 +0100
Simon Slavin  wrote:

> On 13 May 2016, at 3:07pm, dandl  wrote:
> 
> > I have no deep knowledge of standard SQL.
> 
> I used to know SQL92 very well.  There's no facility for doing
> anything like LIMIT or OFFSET in it.  You had to use your programming
> language to work your way through all the results and skip the ones
> you didn't want.

I suggest the reason LIMIT hasn't been standardized is that it's
contrary to the fundamental idea that rows in a table have no
meaningful order.  SQL doesn't honor relational theory with complete
fidelity, but at least that horse is still in the barn.   

The problem with LIMIT is it's not based in the data.  Cutting off
results at some arbitrary N tells you *nothing* about the data other
than that N or more rows met the criteria.  Note that predicate logic
has constructs for "for all" and "there exists" , but not "are some"!  

I have yet to see a query using LIMIT 1 posted on this list that cannot
be expressed -- better, IMO -- with min().  Queries that limit the
results to "top N" to support things like pagination inevitably include
assumptions about transactionality (or lack thereof) that are either
invalid or ill-considered.  Every one would be better served either by
just fetching the needed rows as required (and letting pending rows
pend), or by supplying the last "high" value as a minimum for the
WHERE clause instead of an OFFSET.  Were I a fan of conspiracies, I'd
suspect the LIMIT-OFFSET constructs were invented by antilogicians to
prevent learning and hobble performance.  

By the way, i'm also a LIMIT club member, with limits.  I use it for
convenience on the command line while exploring data.  It's great for
that, in the absence of pager support.  Maybe keeping it a little
"weird" will help remind new developers to use it as a convenience
instead of a crutch.  

--jkl

[sqlite] Sqlite incompatibility with Postgres

2016-05-14 Thread dandl

Just what I needed. Ta muchly!

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org


> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Darren Duncan
> Sent: Saturday, 14 May 2016 6:28 AM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Sqlite incompatibility with Postgres
> 
> On 2016-05-13 7:07 AM, dandl wrote:
> > I checked a copy of the
> > 2003 standard and there doesn't seem to be anything similar. I don't
> > have anything later.
> 
> Whitemarsh is your friend.
> 
> http://www.wiscorp.com/SQLStandards.html
> 
> They have a copy of the SQL 2011/2 draft there, under the erroneous title
> "SQL:20nn Working Draft Documents".
> 
> The actual PDF files are datestamped 2011 Dec 22.
> 
> Unless you need something 100% perfect, those are for all intents and
> purposes the same as the official standard.
> 
> I've relied on the up to date texts of that website for the last 15 years
or
> so.
> 
> -- Darren Duncan
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] Sqlite incompatibility with Postgres

2016-05-14 Thread dandl

> It would also be very minor to add "UNION -1" as a synonym for "UNION
ALL",
> but "being minor" is not an argument for doing so.
> While the mentorship of Postgres is undoubted, there is/was never a drive,
> nor a need for full (or even partial) compatibility with "Postgres" per
se,
> mostly care is applied to conform or be compatible with the SQL standard
as
> much as possible (much like PostGres'
> philosophy) - At least this is how I read the Dev's statements thus far.

As I said earlier, I just noticed that, although Richard Hipp had publicly
talked about Sqlite having Postgres compatibility, there is fact no common
subset of SQL dialect for this feature. The Andl Sql generator can handle
it, but I thought it warranted a question for confirmation.

> If you can show that the SQL standard likes the "LIMIT ALL" phrasing, or
> argue that it has in it's own right an advantage over "LIMIT -1", then you
> would have a much better case than just saying "But Postgres does it", and
> then it would make sense even if it isn't very minor.

I have no deep knowledge of standard SQL. This article
https://en.wikipedia.org/wiki/Select_%28SQL%29#FETCH_FIRST_clause is not
particularly helpful but does not list this syntax. I checked a copy of the
2003 standard and there doesn't seem to be anything similar. I don't have
anything later.
> 
> All that said, personally I do like the "LIMIT ALL" for clarity and
wouldn't
> mind seeing it implemented.

I agree, but it would be for Postgres compatibility rather than the
standard, I think.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread Dominique Devienne

On Fri, May 13, 2016 at 7:33 PM,   wrote:
> Derby:
> SELECT * FROM schemaTableName FETCH FIRST ROW ONLY
> [...]
> Oracle:
> SELECT * FROM schemaTableName WHERE ROWNUM=1

Oracle 12c added FETCH FIRST ROW ONLY too. --DD

https://oracle-base.com/articles/12c/row-limiting-clause-for-top-n-queries-12cr1#top-n

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread Simon Slavin


On 13 May 2016, at 3:07pm, dandl  wrote:

> I have no deep knowledge of standard SQL.

I used to know SQL92 very well.  There's no facility for doing anything like 
LIMIT or OFFSET in it.  You had to use your programming language to work your 
way through all the results and skip the ones you didn't want.

It is because there was no standard for this that each of the big SQL 
implementations came up with their own syntax, and even now they still haven't 
standardised on one.  I think the syntax SQLite implements is by far the 
simplest to understand, and I'm surprised that the other SQL engines haven't 
all implemented it as well as maintaining their own.

Simon.

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread da...@dandymadeproductions.com

> I used to know SQL92 very well.  There's no facility for doing anything
> like LIMIT or OFFSET in it.  You had to use your programming language to
> work your way through all the results and skip the ones you didn't want.
>
> It is because there was no standard for this that each of the big SQL
> implementations came up with their own syntax, and even now they still
> haven't standardised on one.  I think the syntax SQLite implements is by
> far the simplest to understand, and I'm surprised that the other SQL
> engines haven't all implemented it as well as maintaining their own.
>
> Simon.

SQLite, MySQL, & PostgreSQL do share commonality in this area and
indeed there appears to descrepanices with LIMIT. Try building a
generic SQL Database access tool, one of the main areas of frustration,
Limits.

Select one row from database.

Derby:
SELECT * FROM schemaTableName FETCH FIRST ROW ONLY

HSQL: (Believe it does support post LIMIT now.)
SELECT LIMIT 0 1 * FROM schemaTableName

MSAccess: (Looks like no limit support.)
SELECT * FROM schemaTableName

MSSQL:
SELECT TOP 1 * FROM schemaTableName

MySQL, PostgreSQL, SQLite:
SELECT * FROM schemaTableName LIMIT 1

Oracle:
SELECT * FROM schemaTableName WHERE ROWNUM=1

danap.

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread Darren Duncan

On 2016-05-13 7:07 AM, dandl wrote:
> I checked a copy of the
> 2003 standard and there doesn't seem to be anything similar. I don't have
> anything later.

Whitemarsh is your friend.

http://www.wiscorp.com/SQLStandards.html

They have a copy of the SQL 2011/2 draft there, under the erroneous title 
"SQL:20nn Working Draft Documents".

The actual PDF files are datestamped 2011 Dec 22.

Unless you need something 100% perfect, those are for all intents and purposes 
the same as the official standard.

I've relied on the up to date texts of that website for the last 15 years or so.

-- Darren Duncan

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread dandl

> > Richard Hipp seems to be on record as saying:
> > "SQLite can be thought of as a derivative of PostgreSQL. SQLite was
> > originally written from PostgreSQL 6.5 documentation, and the SQLite
> > developers still use PostgreSQL as a reference platform to verify that
> > SQLite is working correctly."
> >
> > Not a major problem for me, just an interesting footnote.
> >
> 
> The way I recall hearing it expressed in talks I've watched (recorded
after
> the fact, never live) is that when situations arise that need resolution,
the
> question is often asked "what does PostgreSQL do?" Sadly, the backward
> compatibility requirements prohibit SQLite from being a 100% feature /
> implementation match.

Absolutely so. Nevertheless, it would be pretty minor to add LIMIT ALL as a
synonym for LIMIT -1. If you wanted to enhance compatibility.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread R Smith

On 2016/05/13 5:24 AM, dandl wrote:
>>> Richard Hipp seems to be on record as saying:
>>> "SQLite can be thought of as a derivative of PostgreSQL. SQLite was
>>> originally written from PostgreSQL 6.5 documentation, and the SQLite
>>> developers still use PostgreSQL as a reference platform to verify that
>>> SQLite is working correctly."
>>>
>>> Not a major problem for me, just an interesting footnote.
>>>
>> The way I recall hearing it expressed in talks I've watched (recorded
> after
>> the fact, never live) is that when situations arise that need resolution,
> the
>> question is often asked "what does PostgreSQL do?" Sadly, the backward
>> compatibility requirements prohibit SQLite from being a 100% feature /
>> implementation match.
> Absolutely so. Nevertheless, it would be pretty minor to add LIMIT ALL as a
> synonym for LIMIT -1. If you wanted to enhance compatibility.

It would also be very minor to add "UNION -1" as a synonym for "UNION 
ALL", but "being minor" is not an argument for doing so.
While the mentorship of Postgres is undoubted, there is/was never a 
drive, nor a need for full (or even partial) compatibility with 
"Postgres" per se, mostly care is applied to conform or be compatible 
with the SQL standard as much as possible (much like PostGres' 
philosophy) - At least this is how I read the Dev's statements thus far.

If you can show that the SQL standard likes the "LIMIT ALL" phrasing, or 
argue that it has in it's own right an advantage over "LIMIT -1", then 
you would have a much better case than just saying "But Postgres does 
it", and then it would make sense even if it isn't very minor.

All that said, personally I do like the "LIMIT ALL" for clarity and 
wouldn't mind seeing it implemented.

Cheers,
Ryan

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread dandl

> > Sqlite accepts (but Postgres does not):
> > LIMIT -1 OFFSET nnn
> > LIMIT -1
> >
> > These all have the same meaning of no limit, but there is no common
> > ground in the syntax.
> 
> Yes and Yes.  Documented behaviour.  There are some strange situations if
the
> two clauses can't both be satisfied, but if you're not playing silly
tricks
> you can depend on negative limits.

Thanks. I ran across this:
http://www.pgcon.org/2014/schedule/events/736.en.html

Richard Hipp seems to be on record as saying:
"SQLite can be thought of as a derivative of PostgreSQL. SQLite was
originally written from PostgreSQL 6.5 documentation, and the SQLite
developers still use PostgreSQL as a reference platform to verify that
SQLite is working correctly."

Not a major problem for me, just an interesting footnote.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-13 Thread Scott Robison

Fair enough.

On Thu, May 12, 2016 at 9:24 PM, dandl  wrote:

> > > Richard Hipp seems to be on record as saying:
> > > "SQLite can be thought of as a derivative of PostgreSQL. SQLite was
> > > originally written from PostgreSQL 6.5 documentation, and the SQLite
> > > developers still use PostgreSQL as a reference platform to verify that
> > > SQLite is working correctly."
> > >
> > > Not a major problem for me, just an interesting footnote.
> > >
> >
> > The way I recall hearing it expressed in talks I've watched (recorded
> after
> > the fact, never live) is that when situations arise that need resolution,
> the
> > question is often asked "what does PostgreSQL do?" Sadly, the backward
> > compatibility requirements prohibit SQLite from being a 100% feature /
> > implementation match.
>
> Absolutely so. Nevertheless, it would be pretty minor to add LIMIT ALL as a
> synonym for LIMIT -1. If you wanted to enhance compatibility.
>
> Regards
> David M Bennett FACS
>
> Andl - A New Database Language - andl.org
>
>
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Scott Robison

[sqlite] Sqlite incompatibility with Postgres

2016-05-12 Thread dandl

It's a minor point, but can someone confirm that:

Postgres accepts (but Sqlite does not)
LIMIT ALL
LIMIT ALL OFFSET nnn
OFFSET nnn

Sqlite accepts (but Postgres does not):
LIMIT -1 OFFSET nnn
LIMIT -1

These all have the same meaning of no limit, but there is no common ground
in the syntax.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org

[sqlite] Sqlite incompatibility with Postgres

2016-05-12 Thread Scott Robison

On Thu, May 12, 2016 at 6:07 PM, dandl  wrote:

> > > Sqlite accepts (but Postgres does not):
> > > LIMIT -1 OFFSET nnn
> > > LIMIT -1
> > >
> > > These all have the same meaning of no limit, but there is no common
> > > ground in the syntax.
> >
> > Yes and Yes.  Documented behaviour.  There are some strange situations if
> the
> > two clauses can't both be satisfied, but if you're not playing silly
> tricks
> > you can depend on negative limits.
>
> Thanks. I ran across this:
> http://www.pgcon.org/2014/schedule/events/736.en.html
>
> Richard Hipp seems to be on record as saying:
> "SQLite can be thought of as a derivative of PostgreSQL. SQLite was
> originally written from PostgreSQL 6.5 documentation, and the SQLite
> developers still use PostgreSQL as a reference platform to verify that
> SQLite is working correctly."
>
> Not a major problem for me, just an interesting footnote.
>

The way I recall hearing it expressed in talks I've watched (recorded after
the fact, never live) is that when situations arise that need resolution,
the question is often asked "what does PostgreSQL do?" Sadly, the backward
compatibility requirements prohibit SQLite from being a 100% feature /
implementation match.

-- 
Scott Robison

[sqlite] Sqlite incompatibility with Postgres

2016-05-12 Thread Simon Slavin

On 12 May 2016, at 11:24am, dandl  wrote:

> Sqlite accepts (but Postgres does not):
> LIMIT -1 OFFSET nnn
> LIMIT -1
> 
> These all have the same meaning of no limit, but there is no common ground
> in the syntax.

Yes and Yes.  Documented behaviour.  There are some strange situations if the 
two clauses can't both be satisfied, but if you're not playing silly tricks you 
can depend on negative limits.

Simon.

[sqlite] Sqlite incompatibility with Postgres

2016-05-12 Thread da...@dandymadeproductions.com

> It's a minor point, but can someone confirm that:
>
> Postgres accepts (but Sqlite does not)
> LIMIT ALL
> LIMIT ALL OFFSET nnn
> OFFSET nnn
>
> Sqlite accepts (but Postgres does not):
> LIMIT -1 OFFSET nnn
> LIMIT -1
>
> These all have the same meaning of no limit, but there is no common
> ground in the syntax.
>
> Regards
> David M Bennett FACS
>
> Andl - A New Database Language - andl.org

According to my testing, correct.

PostgreSQL
SQLException: ERROR: LIMIT must not be negative SQLState: 2201W
VendorError: 0

SQLite
SQLException: [SQLITE_ERROR] SQL error or missing database (near
"ALL":syntax error) SQLState: null VendorError: 1

SELECT * FROM key_table4 OFFSET 2
SQLException: [SQLITE_ERROR] SQL error or missing database (near "2":
syntax error) SQLState: null VendorError: 1

49 matches

Mail list logo