I did use R extensively for years and with SQLite for the last year. 
My observations, for what they are worth, about statistics and databases.

A stats package, especially as powerful as R, makes any database functions
less relevant. After trying various approaches I use now the database mostly 
 - to reduce data somewhat before getting them to R. 
   For example, with 2 mln records in 500 groups I can compute group 
   averages in the database and read into R only 500 records. 
- to reduce overall memory requirements - I can process the whole 
  data set on group-by-group basis - reading one group at a time.
  thus requiring less memory internally

Any stats package will have many functions, most of them impossible to
implement in standard SQL and very difficult to implement in general.
(like clustering etc).

So why do I write filters in the database, instead of using vastly superior 
R capabilities? Firstly, for fun - just liked to push SQL as far as possible.
Secondly - it is sometimes useful to compute some basic things during
or immediately after data acquisition. If we can compute some 
_simple_ metrics at that stage _quickly_ then R does not have to
read and process that much, as only data withing some range of metric
may be of interest. Also, we can index on a metric and provide very
fast extraction of data subsets.

In summary it seems to me that adding too heavy functions to any database
may be difficult and, for SQLite, going against the basic idea of 
having a 'lite' db. 






----- Original Message -----
From: Laurent <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 15:36:22 +0200

> 
> Hello,
> 
> I was just looking for a statiscal package linked with SQLITE.
> >
> > Using SQLite in conjunction with a powerful statistical data analysis
> > programming language like R is an excellent example of a use where
> > windowing functions can be hugely helpful.  Unfortunately, I've never
> > had a compelling need to use SQLite for that, otherwise I'd probably
> > take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> >
> I can confirm that there would be some interest in having such a library.
> 
> Best regards,
> 
> Laurent.
> 
> ==
> 
> ----- Original Message -----
> From: "Andrew Piskorski" <[EMAIL PROTECTED]>
> To: <sqlite-users@sqlite.org>
> Sent: Wednesday, October 12, 2005 2:34 PM
> Subject: [sqlite] SQL Window/OLAP functions
> 
> 
> > On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > > Subject: [sqlite] Please, please do _not_ remove this feature from
> SQLite...
> >
> > > While using SQLite for some time (with R package, www.r-project.org)
> > > I did admire its functionality and speed. Then I did discover a
> > > hidden SQLite feature of immense usefulness - not available in other
> > > databases. SQLite can compute Fibonacci numbers! (I will explain why
> >
> > Transaction visibility features do vary, although often it doesn't
> > matter anyway.  E.g., here's a dicussion of how (at least as of early
> > 2004), PostgreSQL's docs were quite confused about certain subtleties,
> > but what I find interesting, is this was still something that in
> > practice had never really mattered to the mostly hard-core RDBMS
> > programmers talking about it in that thread:
> >
> >   http://openacs.org/forums/message-view?message_id=176198
> >
> > > UPDATE fib SET
> > >     val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> > >            (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > > WHERE pos > 2;
> >
> > I don't see why this is such a great feature.  Without it, worst case,
> > you could still write a simple little loop which would issue one
> > update statement for each row, all within a single transaction.  No?
> >
> > > This is an _immensely_ useful functionality when one needs to
> > > compute various recursive functions. For example exponential moving
> > > average, used frequently in financials. Or Kalman filter (and many
> >
> > Vastly more useful for moving average and the like would be real
> > windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> > not thrilled by their particular syntax, but the functionality is
> > INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> > obviously better syntax, either.)
> >
> > Hm, an amendement to the SQL:1999 spec added windowing support, and
> > SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> > functions" and T612, "Advanced OLAP functions".  Apparently Fred Zemke
> > of Oracle was the author of that SQL spec, and IBM also supported it,
> > so the SQL:2003 syntax and behavior is probably very similar (maybe
> > identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> > PostgreSQL, as of 8.0, doesn't support it yet.
> >
> >   http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
> >   http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
> >   http://www.wiscorp.com/sql/SQL2003Features.pdf
> >   http://troels.arvin.dk/db/rdbms/#select-limit-offset
> >   http://www.postgresql.org/docs/8.0/interactive/features.html
> >   http://en.wikipedia.org/wiki/SQL
> >   http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
> >   http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
> >
> > SQLite basically supports just SQL-92, it doesn't have any of these
> > newer SQL:1999 or SQL:2003 features, right?
> >
> > Using SQLite in conjunction with a powerful statistical data analysis
> > programming language like R is an excellent example of a use where
> > windowing functions can be hugely helpful.  Unfortunately, I've never
> > had a compelling need to use SQLite for that, otherwise I'd probably
> > take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> >
> > -- Andrew Piskorski <[EMAIL PROTECTED]>
> > http://www.piskorski.com/
> >


-- 
___________________________________________________________
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm

Reply via email to