Re: [sqlite] Once again about random values appearance

2014-02-17 Thread Max Vlasov
On Mon, Feb 17, 2014 at 7:00 PM, RSmith  wrote:
>
>
> On 2014/02/17 09:59, Max Vlasov wrote:
>>
>> .
>> So
>>
>>Select nanosec() - nanosec() from ...
>>
>> returns non-zero values for most of the times, so there's no guarantee the
>> user functions or any other functions will be called once for the step.//... 
>> etc.
>
>
> Did you mark your nanosec function as SQLITE_DETERMINISTIC 
> ?
> http://www.sqlite.org/c3ref/create_function.html
>
> Which, if not, it can and will very much return non-zero values.
>

Very interesting option, it did affected the results, they're now zero
for both cases. Also I see this quite a young option listed as a new
feature of 3.8.3.

"Added SQLITE_DETERMINISTIC as an optional bit in the 4th argument to
the sqlite3_create_function() and related interfaces, providing
applications with the ability to create new functions that can be
factored out of inner loops when they have constant arguments"

So the query with this option

  Select nanosec() as v from TestTable where v<>v

always returns empty dataset.

But it seems this option still has something to explain since

  Select nanosec() - nanosec() from TestTable

returns always zeros while

  Select nanosec(), nanosec() from TestTable

returns different values for fields

Either nanosec() - nanosec() is much faster than the granularity of
performance counter on average windows hardware or they are different
cases for some reasons.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-17 Thread RSmith


On 2014/02/17 09:59, Max Vlasov wrote:


Ok, I hope I found the topic, the title was
   "racing with date('now') (was: Select with dates):
one of the links to the archive
   https://www.mail-archive.com/sqlite-users@sqlite.org/msg79456.html

CMIIW, but as I see it, the final modification was commented by Richard

  > As a compromise, the current SQLite trunk causes 'now' to be
exactly  the
  > same for all date and time functions within a single sqlite3_step()
call.

But this is just for now and date-related functions. I wanted to be sure so
created a user function NanoSec() that returns  nanoseconds as it is
calculated with QueryPerformanceCounter and QueryPerformanceFrequency on
Windows and clock_gettime(CLOCK_REALTIME... on Linux. Seems like it's not
always real nanoseconds but value that is changed very frequently to be
different for close VDBE instructions of sqlite engine.

So

   Select nanosec() - nanosec() from ...

returns non-zero values for most of the times, so there's no guarantee the
user functions or any other functions will be called once for the step.//... 
etc.


Did you mark your nanosec function as SQLITE_DETERMINISTIC 
?
http://www.sqlite.org/c3ref/create_function.html

Which, if not, it can and will very much return non-zero values.

And if you did, either your function or your version of SQLite is broken.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-17 Thread Max Vlasov
On Mon, Feb 17, 2014 at 2:27 PM, Simon Slavin  wrote:

>
> On 17 Feb 2014, at 7:59am, Max Vlasov  wrote:
>
> > So the nanosec example modified
> >
> > Select v-v from
> > (
> >   Select nanosec() as v from TestTable
> > )
> >
> > ...shows non-zero values for the current (3.8.3) and for older (3.6.10)
> > version.
>
> Erm ... that worries me a little.
>
> I don't know how SQLite handles sub-selects internally.  But the
> conventional way of doing them is a JOIN to a VIEW.  Which means that v-v
> should always equal zero.
>


Explain lit a litle light
...
"4""Function""0""NanoSec(0)"
"5""Function""0""NanoSec(0)"
"6""Subtract""2"
"7""ResultRow""1"
"8""Next""4"
.
So no intermediate storage probably for performance reasons. Also the
listing looks very self-explainable while possible stateful one will add
more complexity to the VDBE code.


>
> By the way, when figuring out how to optimize this still it's worth noting
> that a parameterless function is rare in SQL.  It's rarely worth optimizing
> unnamed expressions because you rarely get the same unnamed expression
> (including parameters) used twice.
>
>
Simon, I see your point, next time my test function will depend on at least
a dozen of very important parameters and will have a very, very, very long
name :)

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-17 Thread Simon Slavin

On 17 Feb 2014, at 7:59am, Max Vlasov  wrote:

> So the nanosec example modified
> 
> Select v-v from
> (
>   Select nanosec() as v from TestTable
> )
> 
> ...shows non-zero values for the current (3.8.3) and for older (3.6.10)
> version.

Erm ... that worries me a little.

I don't know how SQLite handles sub-selects internally.  But the conventional 
way of doing them is a JOIN to a VIEW.  Which means that v-v should always 
equal zero.

By the way, when figuring out how to optimize this still it's worth noting that 
a parameterless function is rare in SQL.  It's rarely worth optimizing unnamed 
expressions because you rarely get the same unnamed expression (including 
parameters) used twice.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-17 Thread Yuriy Kaminskiy
James K. Lowden wrote:
> On Fri, 14 Feb 2014 08:32:02 +0400
> Max Vlasov <max.vla...@gmail.com> wrote:
> 
>> From: Max Vlasov <max.vla...@gmail.com>
>> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
>> Reply-To: General Discussion of SQLite Database
>> <sqlite-users@sqlite.org> Date: Fri, 14 Feb 2014 08:32:02 +0400
>> Subject: Re: [sqlite] Once again about random values appearance
>>
>> On Fri, Feb 14, 2014 at 6:35 AM, James K. Lowden
>> <jklow...@schemamania.org>wrote:
>>
>>>>> select id, (select id from TestTable where id = abs(random() %
>>>>> 100)) as rndid from TestTable where id=rndid
>>> On Thu, 13 Feb 2014 07:26:55 -0500
>>> Richard Hipp <d...@sqlite.org> wrote:
>>>
>>>> It is undefined behavior, subject to change depending the specific
>>>> version of SQLite, compile-time options, optimization settings,
>>>> and the whim of the query planner.
>>> It should be defined.
>>>
>>> In the above query, random() should be evaluated once.  In the SQL
>>> model the user provides inputs once, and the system evaluates them
>>> once.
>>>
>>>
>> Once for the query or once for the row?
> 
> Once for the query.  
> 
> As a user you have no control how the system evaluates your query.
> The evaluation may change over time with different implementations, but
> the semantics of the query do not.  
> 
> Not long ago on this list we discussed 
> 
>   SELECT *, datetime('now') from T;
> 
> and the behavior was that the datetime function was called per-row,
> resulting in different times on different rows.  It was changed, the
> rationale IIRC to be compatible with the SQL standard and other DBMSs.  
> 
> Like datetime, random() is a nondeterministic function taking constant
> (but void) input.  It should be evaluated once, as provided. Where it
> appears in your query should not matter.  That would make it consistent
> with how other functions work, and with the SQL standard.  

Good in theory, bad in practice.
CURRENT_TIME = CURRENT_TIME was /almost always/ true before change (when it
happened to be false, it was /surprising/ and /counter-intuitive/ exception),
and just become /always/ true after change (no surprising change in behavior!).

RANDOM() = RANDOM() is /always false/ now, it will become /always true/ after
your suggested change.
It have much higher potential to break existing code, e.g.
   SELECT * FROM t ORDER BY RANDOM() LIMIT 1
will be completely broken (well, sure, it is not best way to select random
row(s) from table - it is overly expensive and unportable - but still much more
likely to be used - and relied on - in existing code).

Besides, datetime function behavior is specified by SQL standard - and RANDOM()
is not.

Then again, I, too, don't quite like how OP's queries - such as
   SELECT r <> r FROM (SELECT RANDOM() AS r FROM t)
behavior depends on whether query flattening was used or not (even though one
can write it off as "documented undefined behavior").
Or, that `SELECT RANDOM() AS r FROM t ORDER BY r` gives quite unexpected result.
I just don't think that same approach as with datetime function would be good in
this case.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-16 Thread Max Vlasov
On Mon, Feb 17, 2014 at 1:22 AM, James K. Lowden
<jklow...@schemamania.org>wrote:

> On Fri, 14 Feb 2014 08:32:02 +0400
> Max Vlasov <max.vla...@gmail.com> wrote:
>
> > From: Max Vlasov <max.vla...@gmail.com>
> > To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> > Reply-To: General Discussion of SQLite Database
> > <sqlite-users@sqlite.org> Date: Fri, 14 Feb 2014 08:32:02 +0400
> > Subject: Re: [sqlite] Once again about random values appearance
> >
> > On Fri, Feb 14, 2014 at 6:35 AM, James K. Lowden
> > <jklow...@schemamania.org>wrote:
> >
> > > > > select id, (select id from TestTable where id = abs(random() %
> > > > > 100)) as rndid from TestTable where id=rndid
> > >
> > > On Thu, 13 Feb 2014 07:26:55 -0500
> > > Richard Hipp <d...@sqlite.org> wrote:
> > >
> > > > It is undefined behavior, subject to change depending the specific
> > > > version of SQLite, compile-time options, optimization settings,
> > > > and the whim of the query planner.
> > >
> > > It should be defined.
> > >
> > > In the above query, random() should be evaluated once.  In the SQL
> > > model the user provides inputs once, and the system evaluates them
> > > once.
> > >
> > >
> > Once for the query or once for the row?
>
> Once for the query.
>
> As a user you have no control how the system evaluates your query.
> The evaluation may change over time with different implementations, but
> the semantics of the query do not.
>
> Not long ago on this list we discussed
>
> SELECT *, datetime('now') from T;
>
> and the behavior was that the datetime function was called per-row,
> resulting in different times on different rows.  It was changed, the
> rationale IIRC to be compatible with the SQL standard and other DBMSs.
>


Ok, I hope I found the topic, the title was
  "racing with date('now') (was: Select with dates):
one of the links to the archive
  https://www.mail-archive.com/sqlite-users@sqlite.org/msg79456.html

CMIIW, but as I see it, the final modification was commented by Richard

 > As a compromise, the current SQLite trunk causes 'now' to be
exactly  the
 > same for all date and time functions within a single sqlite3_step()
call.

But this is just for now and date-related functions. I wanted to be sure so
created a user function NanoSec() that returns  nanoseconds as it is
calculated with QueryPerformanceCounter and QueryPerformanceFrequency on
Windows and clock_gettime(CLOCK_REALTIME... on Linux. Seems like it's not
always real nanoseconds but value that is changed very frequently to be
different for close VDBE instructions of sqlite engine.

So

  Select nanosec() - nanosec() from ...

returns non-zero values for most of the times, so there's no guarantee the
user functions or any other functions will be called once for the step.

My original issue was commented by Richard and there's no other argument I
can post because ability ot use alias in the WHERE clause is probably the
sqlite-only feature so you just can't reference any standard about this.

But.. your first reply triggered another problem, it's where outer query
uses alias from the inner query and here sqlite can be compared with other
engines. So the nanosec example modified

 Select v-v from
 (
   Select nanosec() as v from TestTable
 )

...shows non-zero values for the current (3.8.3) and for older (3.6.10)
version. And here it would be interesting to know whether any standard has
something to tell about the value of v in the outer query.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-16 Thread James K. Lowden
On Fri, 14 Feb 2014 08:32:02 +0400
Max Vlasov <max.vla...@gmail.com> wrote:

> From: Max Vlasov <max.vla...@gmail.com>
> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> Reply-To: General Discussion of SQLite Database
> <sqlite-users@sqlite.org> Date: Fri, 14 Feb 2014 08:32:02 +0400
> Subject: Re: [sqlite] Once again about random values appearance
> 
> On Fri, Feb 14, 2014 at 6:35 AM, James K. Lowden
> <jklow...@schemamania.org>wrote:
> 
> > > > select id, (select id from TestTable where id = abs(random() %
> > > > 100)) as rndid from TestTable where id=rndid
> >
> > On Thu, 13 Feb 2014 07:26:55 -0500
> > Richard Hipp <d...@sqlite.org> wrote:
> >
> > > It is undefined behavior, subject to change depending the specific
> > > version of SQLite, compile-time options, optimization settings,
> > > and the whim of the query planner.
> >
> > It should be defined.
> >
> > In the above query, random() should be evaluated once.  In the SQL
> > model the user provides inputs once, and the system evaluates them
> > once.
> >
> >
> Once for the query or once for the row?

Once for the query.  

As a user you have no control how the system evaluates your query.
The evaluation may change over time with different implementations, but
the semantics of the query do not.  

Not long ago on this list we discussed 

SELECT *, datetime('now') from T;

and the behavior was that the datetime function was called per-row,
resulting in different times on different rows.  It was changed, the
rationale IIRC to be compatible with the SQL standard and other DBMSs.  

Like datetime, random() is a nondeterministic function taking constant
(but void) input.  It should be evaluated once, as provided. Where it
appears in your query should not matter.  That would make it consistent
with how other functions work, and with the SQL standard.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-13 Thread Max Vlasov
On Fri, Feb 14, 2014 at 6:35 AM, James K. Lowden
wrote:

> > > select id, (select id from TestTable where id = abs(random() % 100))
> > > as rndid from TestTable where id=rndid
>
> On Thu, 13 Feb 2014 07:26:55 -0500
> Richard Hipp  wrote:
>
> > It is undefined behavior, subject to change depending the specific
> > version of SQLite, compile-time options, optimization settings, and
> > the whim of the query planner.
>
> It should be defined.
>
> In the above query, random() should be evaluated once.  In the SQL
> model the user provides inputs once, and the system evaluates them
> once.
>
>
Once for the query or once for the row?
It does evaluated once for the query. To check, just remove where

  select id, (select id from TestTable where id = abs(random() % 100)) as
rndid from TestTable

and we have plenty of the same rndid as the right column

I almost understood the logic behind this, but just found the query that
surprised me once again. When I wrote it, I wanted to show that to compare
engines we have to remove alias usage from WHERE because most of engines
forbid using it in WHERE. But the following one that probably should be
compatible with other RDBMSes still shows different column values

select id, rndid from
(
select id, (select id from TestTable where id = abs(random() % 100)) as
rndid from TestTable
) where id=rndid

But I supposed the rndid in outer query should have known (fixed) value and
the values from where clause should be shown identical.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-13 Thread James K. Lowden
> > select id, (select id from TestTable where id = abs(random() % 100))
> > as rndid from TestTable where id=rndid

On Thu, 13 Feb 2014 07:26:55 -0500
Richard Hipp  wrote:

> It is undefined behavior, subject to change depending the specific
> version of SQLite, compile-time options, optimization settings, and
> the whim of the query planner.

It should be defined.  

In the above query, random() should be evaluated once.  In the SQL
model the user provides inputs once, and the system evaluates them
once.  

This issue seems to have a way of returning.  Didn't we recently
discuss it with regard to time?  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-13 Thread Max Vlasov
On Thu, Feb 13, 2014 at 4:26 PM, Richard Hipp  wrote:

> On Thu, Feb 13, 2014 at 4:45 AM, Max Vlasov  wrote:
>
> > Hi,
> >
> > probably was discussed and modified before, but I still can not
> understand
> > some peculiarities with random column values.
> >
>
> It is undefined behavior, subject to change depending the specific version
> of SQLite, compile-time options, optimization settings, and the whim of the
> query planner.
>
>

Thanks, this makes sense.

Assuming there are time related functions, user function with undefined
values etc, may I generalize this into "an expression is guaranteed to have
the same value in any context only if it is always return the same output
for the same input"?

Interesting, the problem probably arises since sometimes the symbols in the
result columns are easy to interpret mentally like "states" to be used
further, but they look more like macros to be substituted.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Once again about random values appearance

2014-02-13 Thread Richard Hipp
On Thu, Feb 13, 2014 at 4:45 AM, Max Vlasov  wrote:

> Hi,
>
> probably was discussed and modified before, but I still can not understand
> some peculiarities with random column values.
>
> The table
>   Create table [TestTable] ([id] integer primary key)
>
> populated with 100 default values (thanks to CTE now made with a single
> query):
>
> with recursive
>   autoinc(id) as
>  (values(1) UNION ALL Select id+1 from autoinc where id < 100)
> insert into TestTable select id from autoinc
>
>
> So the following query
>
> select id, (select id from TestTable where id = abs(random() % 100)) as
> rndid from TestTable where id=rndid
>
> returns a single row (as expected), but this row contains different values
> for id and rndid. I suppose this is because rndid to be shown is calculated
> once, but the one used for comparison is calculated again and again. Is
> this correct behavior?
>

It is undefined behavior, subject to change depending the specific version
of SQLite, compile-time options, optimization settings, and the whim of the
query planner.

By analogy, in C/C++ if you write:

  int x = 1;
  int y = x++/x++;

What is the value for y?  0, 1, or 2?


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Once again about random values appearance

2014-02-13 Thread Max Vlasov
Hi,

probably was discussed and modified before, but I still can not understand
some peculiarities with random column values.

The table
  Create table [TestTable] ([id] integer primary key)

populated with 100 default values (thanks to CTE now made with a single
query):

with recursive
  autoinc(id) as
 (values(1) UNION ALL Select id+1 from autoinc where id < 100)
insert into TestTable select id from autoinc


So the following query

select id, (select id from TestTable where id = abs(random() % 100)) as
rndid from TestTable where id=rndid

returns a single row (as expected), but this row contains different values
for id and rndid. I suppose this is because rndid to be shown is calculated
once, but the one used for comparison is calculated again and again. Is
this correct behavior?

Also checked several dll versions and saw that this change appeared
somewhere between 3.6.13 and 3.6.16, so .13 showed identical values, while
.16 different.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users