Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread pilot pirx
Thank you and AndrewP for the pointers - very interesting read. 

Some clarifications. Coming from my perspective - that of R user, 
not an API user - it is natural to do more 
complicated operations in R than to try to write additional
functions in C/tcl etc. I do not know how common will such
attitude be among SQLite users not using R. (In general R
users tend to be statisticians with little knowledge of SQL.)
This is why my mail was not an attempt to propose some
extensions to core SQLite [it is so good for the intended
purpose that adding more stuff can spoil it :-)]

Were any additions for computing considered for SQLITE 
it seems plausible to argue that the should be easy to
implemenent and do not change the spirit of the project.
So more like 'adding a log function' rather than
something (probably) much bigger like Oracle olap.

It could be further argued that, for all such ideas for
SQLite extentions, the strategic approach of having 
'the core sqlite' and 'the extended sqlite' is the
best way to proceed. Any non-conventional extensions
(like log or OLAP) could be implemented in the 'extended
sqlite' version, used from there - and eventually migrated to 
the core if there is a sufficient interest.










- Original Message -
From: [EMAIL PROTECTED]
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] windowing functions != recursive functions
Date: Wed, 12 Oct 2005 22:48:35 -0400

> 
> "pilot pirx" <[EMAIL PROTECTED]> wrote:
> 
> > Now, for the recursive function like exponential moving average 
> > the defintion is that
> >
> > ema(i+1) =  val(i) * coef  + ema(i) * (1-coef).
> >
> > That is I have to know the previous value of both EMA _and_  
> > VALUE (while for moving avearage I need to know _only_ the 
> > previous value(s)
> > of VALUE.
> 
> You could write an "ema()" function for SQLite using the
> scarcely documented API functions sqlite3_get_auxdata() and
> sqlite3_set_auxdata().  (Those routines were intended to allow
> functions like "regexp" to compile a constant regular expression
> once and then reused the compiled regular expression on
> subsequent calls.  But they have never been used for anything,
> as far as I am aware.)
> 
> The ema() function would work like this:
> 
> SELECT ema(x, 0.45) FROM table1;
> 
> Where 0.45 is the "coef".
> 
> I was wondering if it would be possible to write a "prev()"
> function that returned the value of a column in the result
> set from the previous row.  prev() could be used to implement
> ema() in pure SQL.  But, alas, I do not see how you could
> write a general-purpose prev() function using the current
> API.  Some kind of extension would be required, I think.
> --
> D. Richard Hipp <[EMAIL PROTECTED]>


-- 
___
Play 100s of games for FREE! http://games.mail.com/



Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread Andrew Piskorski
On Wed, Oct 12, 2005 at 09:05:26PM -0500, pilot pirx wrote:

> There is no problem in computing moving average
> or cumulative sum etc 
> with the existing SQL (or dealing with time windows
> in general)  - it just the SQL gets nasty - examples

Except that the SQL doesn't "just" get nasty, it can quickly get
EXTREMELY nasty, and often has really lousy performance as well.

I regularly write (fairly) readable, fast queries using (Oracle's)
windowing/OLAP functions which AFAICT, would be either extremely
difficult or essentially impossible for me to write in SQL-92, other
than, perhaps, by writing some special purpose SQL compiler/translator
to do the job.

E.g., see this lengthy thread from 2001:

  http://openacs.org/forums/message-view?message_id=25521
  http://openacs.org/forums/message-view?message_id=25559
  http://openacs.org/forums/message-view?message_id=25607

That thread dates from before I'd ever used Oracle's windowing
functions.  Note that first non-OLAP SQL query for doing "time-series"
like stuff.  It's only moderately complicated, but it is also trying
to accomplish something conceptually VERY simple, yet its SQL is NOT
simple.  Writing more complicated queries in that fashion quickly
becomes infeasible.

Furthermore, that initial query also took about 3 seconds to run.  My
second non-OLAP way of doing the same query, arguably somewhat
clearer, took 75 seconds.

At the end of that same thread, after intervening confusion, Daryl
Biberdorf pointed out the OLAP lead() and lag() functions to me.  By
using lead(), the SQL becomes much easier to write and understand, AND
its execution become enormously faster - down to 0.1 seconds or so.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/


Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread Andrew Piskorski
On Wed, Oct 12, 2005 at 09:05:26PM -0500, pilot pirx wrote:

> The windowing functions described in the link
> are different from recursive functions. 

Yes, I think you're right.  Your EMA example bugged me, so I fooled
with it, but I couldn't come up with any way to implement EMA using
plain SQL, even with the windowing/OLAP functions.

Some explanations of the EMA algorithm are here:

  
http://www.investopedia.com/university/tm/TradingIndicators/TheInsAndOutsOfMovingAverages.asp
  http://www.ivorix.com/en/products/tech/smooth/ema.html

Looks like you don't even need real recursion for EMA (AKA, it is
tail-recursive), plain old iteration will do fine, as you can see from
the simple C++ code in the 2nd link above.  (Yet it can't be done in
SQL.  Well we all knew that SQL isn't Turing complete, here's a
practical consequence of that, I guess.)

> Now, for the recursive function like exponential moving average the
> defintion is that ema(i+1) = val(i) * coef + ema(i) * (1-coef). That
> is I have to know the previous value of both EMA _and_ VALUE (while
> for moving avearage I need to know _only_ the previous value(s) of
> VALUE.

It is quite possible to return previous (lagged) values of multiple
columns, NOT just one.  You do that with dense_rank, which I talked
about a back on May 23 on this email list:

  I always thought of that feature as "look-aside", as in, "Find the max
  of foo, but don't give me that, instead look aside and return me the
  corresponding value of bar next to it instead."

  select
max(playerid) keep (dense_rank last order by sb) as top_stealer
  from batting

The problem is that EMA needs to use the lag of the actual column it's
calculating, and there's just no damn way to do that in SQL.

Well, I didn't try it, but maybe it would be possible to kludge EMA
somehow using either Oracle's "connect by", or DB2's recursive SQL,
which sounds possibly and less hideous than connect by:

  
http://www-128.ibm.com/developerworks/db2/library/techarticle/0307steinbach/0307steinbach.html

Here's the example I tried for EMA (on Oracle 8.1.7.4):

create table atp_ema (symbol varchar2(8)  ,day date  ,val number); 
insert into atp_ema values ('a' ,'2005-01-03' ,67); 
insert into atp_ema values ('a' ,'2005-01-04' ,77); 
insert into atp_ema values ('a' ,'2005-01-05' ,80); 
insert into atp_ema values ('a' ,'2005-01-06' ,82); 
insert into atp_ema values ('a' ,'2005-01-07' ,94); 
insert into atp_ema values ('a' ,'2005-01-10' ,81); 
insert into atp_ema values ('a' ,'2005-01-11' ,83); 
insert into atp_ema values ('b' ,'2005-01-03' ,27); 
insert into atp_ema values ('b' ,'2005-01-04' ,27); 
insert into atp_ema values ('b' ,'2005-01-05' ,20); 
insert into atp_ema values ('b' ,'2005-01-06' ,22); 
insert into atp_ema values ('b' ,'2005-01-07' ,24); 
insert into atp_ema values ('b' ,'2005-01-10' ,21); 
insert into atp_ema values ('b' ,'2005-01-11' ,23); 
 
variable coeff number; 
begin  :coeff := 0.66;  end; 
/ 

-- Correct algorithm, but can't run: 
select  v.* 
  ,(ema_term_1 + ema_term_2)  as ema 
  ,(avg(val) over (partition by symbol  order by day 
  rows between 3 preceding and current row))  as mean_4_day 
from ( 
  select  symbol ,day ,val 
,(:coeff * val) as ema_term_1 
,( (1 - :coeff) * 
   -- Of course ema is not defined here, so this fails with 
   -- 'ORA-00904: invalid column name': 
   nvl((lag(ema ,1) over (partition by symbol  order by day))
   ,val) 
 ) as ema_term_2 
  from atp_ema 
) v 
order by symbol ,day ; 

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com/



Re: [sqlite] windowing functions != recursive functions

2005-10-12 Thread drh
"pilot pirx" <[EMAIL PROTECTED]> wrote:
 
> Now, for the recursive function like exponential moving average 
> the defintion is that
>
> ema(i+1) =  val(i) * coef  + ema(i) * (1-coef).
>
> That is I have to know the previous value of both EMA _and_  VALUE 
> (while for moving avearage I need to know _only_ the previous value(s)
> of VALUE. 
> 

You could write an "ema()" function for SQLite using the
scarcely documented API functions sqlite3_get_auxdata() and
sqlite3_set_auxdata().  (Those routines were intended to allow
functions like "regexp" to compile a constant regular expression
once and then reused the compiled regular expression on
subsequent calls.  But they have never been used for anything,
as far as I am aware.)

The ema() function would work like this:

   SELECT ema(x, 0.45) FROM table1;

Where 0.45 is the "coef".

I was wondering if it would be possible to write a "prev()"
function that returned the value of a column in the result
set from the previous row.  prev() could be used to implement
ema() in pure SQL.  But, alas, I do not see how you could
write a general-purpose prev() function using the current
API.  Some kind of extension would be required, I think.
--
D. Richard Hipp <[EMAIL PROTECTED]>