Re: [sqlite] SQLite bug report - large databases only - 'database or disk is full'

2009-12-12 Thread pirx

Apologies for not answering earlier, I* was away.
I've never used Process Monitor - the whole setup you describing is new to me - 
but I will try to do something about it in the coming week. It has to be a 
background project for me - I do not use SQLite in production. 




-Original Message-
From: Filip Navara 
To: General Discussion of SQLite Database 
Sent: Thu, Dec 10, 2009 1:35 am
Subject: Re: [sqlite] SQLite bug report - large databases only - 'database or 
disk is full'


On Mon, Dec 7, 2009 at 10:21 PM,   wrote:

 SQLite bug report

 Summary:
 --

 error message:

   Error: near line 2: database or disk is full

 It happens with plenty of disk space available and with 'unlimited' database 
ize.
 It does not happen on all systems. It does not happen on small databases.

I run your scripts with SQLite 3.6.21 on Windows 7 64-bit on NTFS
rive and unfortunately I run out of disk space before the problem
anifested. Would it be possible for you to setup Process Monitor with
ilter on the database path (just the path, so both journal and the
ain database file are in the log) and history depth set to 1 million
the lowest value) and then capture the file accesses during the run
f the reproduction scripts? It is possible to save and export the
ata then for further analysis and hopefully it will give a clue on
hy it happens.
Best regards,
ilip Navara
__
qlite-users mailing list
qlite-us...@sqlite.org
ttp://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite bug report - large databases only - 'database or disk is full'

2009-12-07 Thread pirx

SQLite bug report
 
Summary:
--

error message:

   Error: near line 2: database or disk is full

It happens with plenty of disk space available and with 'unlimited' database 
size.
It does not happen on all systems. It does not happen on small databases.

 
Details:
--


The error occurs on 
- windows server 2003 SP2 64bit, windows server 2009 64bit
- windows vista, windows 7 rc1, windows 7 - all 64-bit

The error does _not_ occur on 
- windows server 2003 SP2 32bit, windows xp sp3 32 bit

SQLite3 versions affected: all the ones released in 2009, possibly all the 
earlier ones, too.


The bug was 'distilled' with the reproduction described below. 



Reproduction
--

We (1) create a table in the database, (2) create an input file; (3) import the 
input file into the table mulltiple times.

Using the files attached below the error did occur during the third/fourth 
import, at database size around 22-28gb.



Additional information and observations


1. This is not a new bug - it's been around for the last year, probably two - 
but it was difficult to spot and isolate.

2. The problem is 'touchy' - with different random seed in the generated table 
the problem may occur at 45gb db size instead of 20-30gb. When I tried to use, 
as input data, the same line repeated millions of times - the problem did not 
occur at all - tests were aborted by me at 160-260 gb database size.

3.The problem does _not_ depend _purely_ on the data being imported. The 
reproduction script loads the same data set multiple times and fails on the 
third/fourth time - so it is the database size which triggers it.

4. When the input file is smaller (1gb instead of 8gb), the problem still 
occurs in the 20-30gb database size range - so there is nothing magical about 
the number of imports.

5. The test database here is created with pragma page_size = 32768. The same 
error message occurs for other page sizes and at various cache sizes. 

6. In some other tests (not using to the scripts here, with different data) the 
import was good (about 20gb size) but an attempt to create an index on the 
imported data resulted in the same error diagnostic - after about 30mins of 
running.

7. This is not an SQLITE3 problem - when using .NET wrapper for SQLite or using 
any of the admin tools - the problems occurs in the same area, even though 
messages are not reported the same way by these tools [and I am getting 
sometimes 'database image in malformed' after such tools].

 
 


 
Scripts to reproduce the bug
--

 
++
== awk program to produce the table - save as: t1b.w
BEGIN {
srand(13) # assure all tests have the same data
for (i=0;i inpb.txt
 
echo - load table with data sets the first 10 times
time /T
for %%i in (1,2,3,4,5,6,7,8,9,10) do c:\apps\sqlite3 tstb.db http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread pilot pirx
Thank you and AndrewP for the pointers - very interesting read. 

Some clarifications. Coming from my perspective - that of R user, 
not an API user - it is natural to do more 
complicated operations in R than to try to write additional
functions in C/tcl etc. I do not know how common will such
attitude be among SQLite users not using R. (In general R
users tend to be statisticians with little knowledge of SQL.)
This is why my mail was not an attempt to propose some
extensions to core SQLite [it is so good for the intended
purpose that adding more stuff can spoil it :-)]

Were any additions for computing considered for SQLITE 
it seems plausible to argue that the should be easy to
implemenent and do not change the spirit of the project.
So more like 'adding a log function' rather than
something (probably) much bigger like Oracle olap.

It could be further argued that, for all such ideas for
SQLite extentions, the strategic approach of having 
'the core sqlite' and 'the extended sqlite' is the
best way to proceed. Any non-conventional extensions
(like log or OLAP) could be implemented in the 'extended
sqlite' version, used from there - and eventually migrated to 
the core if there is a sufficient interest.










- Original Message -
From: [EMAIL PROTECTED]
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] windowing functions != recursive functions
Date: Wed, 12 Oct 2005 22:48:35 -0400

> 
> "pilot pirx" <[EMAIL PROTECTED]> wrote:
> 
> > Now, for the recursive function like exponential moving average 
> > the defintion is that
> >
> > ema(i+1) =  val(i) * coef  + ema(i) * (1-coef).
> >
> > That is I have to know the previous value of both EMA _and_  
> > VALUE (while for moving avearage I need to know _only_ the 
> > previous value(s)
> > of VALUE.
> 
> You could write an "ema()" function for SQLite using the
> scarcely documented API functions sqlite3_get_auxdata() and
> sqlite3_set_auxdata().  (Those routines were intended to allow
> functions like "regexp" to compile a constant regular expression
> once and then reused the compiled regular expression on
> subsequent calls.  But they have never been used for anything,
> as far as I am aware.)
> 
> The ema() function would work like this:
> 
> SELECT ema(x, 0.45) FROM table1;
> 
> Where 0.45 is the "coef".
> 
> I was wondering if it would be possible to write a "prev()"
> function that returned the value of a column in the result
> set from the previous row.  prev() could be used to implement
> ema() in pure SQL.  But, alas, I do not see how you could
> write a general-purpose prev() function using the current
> API.  Some kind of extension would be required, I think.
> --
> D. Richard Hipp <[EMAIL PROTECTED]>


-- 
___
Play 100s of games for FREE! http://games.mail.com/



[sqlite] wiindow functions and recursive functions

2005-10-12 Thread pilot pirx
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?

that would require writing in C, bindind etc. Or, for some other databases,
writing in some stored procedure language. Here I have that with
stadard-looking sql in a small database. One 'quirk' of the implementation
(i am not sure it was intended) gives enormous additional
programming facility without any additional work :-)

> > Vastly more useful for moving average and the like would be real
> > windowing/grouping functions, like Oracle's "analytic" functions.  I'm

Example of computing moving average with standard sql is in one of my earlier 
mails.
I think that adding OLAP functions to such a small engine would be an overkill,
especially as most such functionality can be expressed with standard SQL
(admittedly, convoluted a bit).
The owner of the project will ultimately decide. 
It may not fit into the 'lite' 
image this project is after.


P.S. I remember seeing a public domain code adding 
correlation and regression to SQLite, but it was
about 3 years old - it was rather short. Yet, even that
did not make it into the mainstream project. 

- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> 
> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
> 
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed. Then I did discover a
> > hidden SQLite feature of immense usefulness - not available in other
> > databases. SQLite can compute Fibonacci numbers! (I will explain why
> 
> Transaction visibility features do vary, although often it doesn't
> matter anyway.  E.g., here's a dicussion of how (at least as of early
> 2004), PostgreSQL's docs were quite confused about certain subtleties,
> but what I find interesting, is this was still something that in
> practice had never really mattered to the mostly hard-core RDBMS
> programmers talking about it in that thread:
> 
>http://openacs.org/forums/message-view?message_id=176198
> 
> > UPDATE fib SET
> > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > WHERE pos > 2;
> 
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?
> 
> > This is an _immensely_ useful functionality when one needs to
> > compute various recursive functions. For example exponential moving
> > average, used frequently in financials. Or Kalman filter (and many
> 
> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> not thrilled by their particular syntax, but the functionality is
> INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> obviously better syntax, either.)
> 
> Hm, an amendement to the SQL:1999 spec added windowing support, and
> SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> functions" and T612, "Advanced OLAP functions".  Apparently Fred Zemke
> of Oracle was the author of that SQL spec, and IBM also supported it,
> so the SQL:2003 syntax and behavior is probably very similar (maybe
> identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> PostgreSQL, as of 8.0, doesn't support it yet.
> 
>http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
>http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
>http://www.wiscorp.com/sql/SQL2003Features.pdf
>http://troels.arvin.dk/db/rdbms/#select-limit-offset
>http://www.postgresql.org/docs/8.0/interactive/features.html
>http://en.wikipedia.org/wiki/SQL
>http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
>http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
> 
> SQLite basically supports just SQL-92, it doesn't have any of these
> newer SQL:1999 or SQL:2003 features, right?
> 
> Using SQLite in conjunction with a powerful statistical data analysis
> programming language like R is an excellent example of a use where
> windowing functions can be hugely helpful.  Unfortunately, I've never
> had a compelling need to use SQLite for that, otherwise I'd probably
> take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> 
> --
> Andrew Piskorski <[EMAIL PROTECTED]>
> http://www.piskorski.com/


-- 
___
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm



[sqlite] windowing functions != recursive functions

2005-10-12 Thread pilot pirx
I was unaware of the windowing functions discusssion.
Having a look at the first link, it looks like we may be
talking about two subtly different issues.
The windowing functions described in the link
are different from recursive functions. 
There is no problem in computing moving average
or cumulative sum etc 
with the existing SQL (or dealing with time windows
in general)  - it just the SQL gets nasty - examples
are in 'SQL for smarties' by Joe Celko. Therefore
adding such functions to SQLite is 'nice' but does
not really increase the functionality.

In contrast, computation of the recursion function
adds it. 

example with moving averages, since they
are mentioned in the windowing functions

assuming original series is
VALUE = 1 2 3 4 5 6 7

following statement will compute 5-day moving average for the whole column
UPDATE data SET 
MAVE5 = (SELECT AVG(val) FROM data AS h1 WHERE h1.dayno <=  data.dayno 
AND h1.dayno > data.dayno-05);
But this statement operates always on existing data in the existing column - to 
compute new value of MAVE5 it only needs to know values of VALUE. (When, for 
the element one, there is not enough data (because there is no dayno < 1)  SQL 
simply averages over existing data.)

Now, for the recursive function like exponential moving average the defintion 
is that
ema(i+1) =  val(i) * coef  + ema(i) * (1-coef). That is I have to know the 
previous value of
both EMA _and_  VALUE (while for moving avearage I need to know _only_
the previous value(s) of VALUE. 


- Original Message -
From: "Andrew Piskorski" <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 08:34:02 -0400

> 
> On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...
> 
> > While using SQLite for some time (with R package, www.r-project.org)
> > I did admire its functionality and speed. Then I did discover a
> > hidden SQLite feature of immense usefulness - not available in other
> > databases. SQLite can compute Fibonacci numbers! (I will explain why
> 
> Transaction visibility features do vary, although often it doesn't
> matter anyway.  E.g., here's a dicussion of how (at least as of early
> 2004), PostgreSQL's docs were quite confused about certain subtleties,
> but what I find interesting, is this was still something that in
> practice had never really mattered to the mostly hard-core RDBMS
> programmers talking about it in that thread:
> 
>http://openacs.org/forums/message-view?message_id=176198
> 
> > UPDATE fib SET
> > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > WHERE pos > 2;
> 
> I don't see why this is such a great feature.  Without it, worst case,
> you could still write a simple little loop which would issue one
> update statement for each row, all within a single transaction.  No?
> 
> > This is an _immensely_ useful functionality when one needs to
> > compute various recursive functions. For example exponential moving
> > average, used frequently in financials. Or Kalman filter (and many
> 
> Vastly more useful for moving average and the like would be real
> windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> not thrilled by their particular syntax, but the functionality is
> INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> obviously better syntax, either.)
> 
> Hm, an amendement to the SQL:1999 spec added windowing support, and
> SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> functions" and T612, "Advanced OLAP functions".  Apparently Fred Zemke
> of Oracle was the author of that SQL spec, and IBM also supported it,
> so the SQL:2003 syntax and behavior is probably very similar (maybe
> identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
> PostgreSQL, as of 8.0, doesn't support it yet.
> 
>http://www.wintercorp.com/rwintercolumns/SQL_99snewolapfunctions.html
>http://www.ncb.ernet.in/education/modules/dbms/SQL99/OLAP-99-154r2.pdf
>http://www.wiscorp.com/sql/SQL2003Features.pdf
>http://troels.arvin.dk/db/rdbms/#select-limit-offset
>http://www.postgresql.org/docs/8.0/interactive/features.html
>http://en.wikipedia.org/wiki/SQL
>http://www.sigmod.org/sigmod/record/issues/0403/E.JimAndrew-standard.pdf
>http://www.oracle.com/oramag/oracle/01-jul/o41industry.html
> 
> SQLite basically supports just SQL-92, it doesn't have any of these
> newer SQL:1999 or SQL:2003 features, right?
> 
> Using SQLite in conjunction with a powerful statistical data analysis
> progr

Re: [sqlite] SQL Window/OLAP functions

2005-10-12 Thread pilot pirx
I did use R extensively for years and with SQLite for the last year. 
My observations, for what they are worth, about statistics and databases.

A stats package, especially as powerful as R, makes any database functions
less relevant. After trying various approaches I use now the database mostly 
 - to reduce data somewhat before getting them to R. 
   For example, with 2 mln records in 500 groups I can compute group 
   averages in the database and read into R only 500 records. 
- to reduce overall memory requirements - I can process the whole 
  data set on group-by-group basis - reading one group at a time.
  thus requiring less memory internally

Any stats package will have many functions, most of them impossible to
implement in standard SQL and very difficult to implement in general.
(like clustering etc).

So why do I write filters in the database, instead of using vastly superior 
R capabilities? Firstly, for fun - just liked to push SQL as far as possible.
Secondly - it is sometimes useful to compute some basic things during
or immediately after data acquisition. If we can compute some 
_simple_ metrics at that stage _quickly_ then R does not have to
read and process that much, as only data withing some range of metric
may be of interest. Also, we can index on a metric and provide very
fast extraction of data subsets.

In summary it seems to me that adding too heavy functions to any database
may be difficult and, for SQLite, going against the basic idea of 
having a 'lite' db. 






- Original Message -
From: Laurent <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] SQL Window/OLAP functions
Date: Wed, 12 Oct 2005 15:36:22 +0200

> 
> Hello,
> 
> I was just looking for a statiscal package linked with SQLITE.
> >
> > Using SQLite in conjunction with a powerful statistical data analysis
> > programming language like R is an excellent example of a use where
> > windowing functions can be hugely helpful.  Unfortunately, I've never
> > had a compelling need to use SQLite for that, otherwise I'd probably
> > take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)
> >
> I can confirm that there would be some interest in having such a library.
> 
> Best regards,
> 
> Laurent.
> 
> ==
> 
> - Original Message -
> From: "Andrew Piskorski" <[EMAIL PROTECTED]>
> To: <sqlite-users@sqlite.org>
> Sent: Wednesday, October 12, 2005 2:34 PM
> Subject: [sqlite] SQL Window/OLAP functions
> 
> 
> > On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> > > Subject: [sqlite] Please, please do _not_ remove this feature from
> SQLite...
> >
> > > While using SQLite for some time (with R package, www.r-project.org)
> > > I did admire its functionality and speed. Then I did discover a
> > > hidden SQLite feature of immense usefulness - not available in other
> > > databases. SQLite can compute Fibonacci numbers! (I will explain why
> >
> > Transaction visibility features do vary, although often it doesn't
> > matter anyway.  E.g., here's a dicussion of how (at least as of early
> > 2004), PostgreSQL's docs were quite confused about certain subtleties,
> > but what I find interesting, is this was still something that in
> > practice had never really mattered to the mostly hard-core RDBMS
> > programmers talking about it in that thread:
> >
> >   http://openacs.org/forums/message-view?message_id=176198
> >
> > > UPDATE fib SET
> > > val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
> > >(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> > > WHERE pos > 2;
> >
> > I don't see why this is such a great feature.  Without it, worst case,
> > you could still write a simple little loop which would issue one
> > update statement for each row, all within a single transaction.  No?
> >
> > > This is an _immensely_ useful functionality when one needs to
> > > compute various recursive functions. For example exponential moving
> > > average, used frequently in financials. Or Kalman filter (and many
> >
> > Vastly more useful for moving average and the like would be real
> > windowing/grouping functions, like Oracle's "analytic" functions.  I'm
> > not thrilled by their particular syntax, but the functionality is
> > INCREDIBLY useful.  (And on the other hand, I haven't thought of any
> > obviously better syntax, either.)
> >
> > Hm, an amendement to the SQL:1999 spec added windowing support, and
> > SQL:2003 includes that, I think as features T611, "Elementrary OLAP
> > functions" and T612, "Advanced OLAP functions". 

[sqlite] Please, please do _not_ remove this feature from SQLite...

2005-10-12 Thread pilot pirx
While using SQLite for some time (with R package, www.r-project.org) I did 
admire its functionality and speed. Then I did discover a hidden SQLite feature 
of immense usefulness - not available in other databases. SQLite can compute 
Fibonacci numbers! (I will explain why this is important later).

The are defined as follows: F1 = 1, F2 = 1; F3 = F1 + F2; F4 = F2 +F3  etc. So 
they are defined recursively, with the first two values known and each next 
value depending on the previous values. The SQL statement for that is (the full 
script at the end of this mail):

UPDATE fib SET
val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
   (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
WHERE pos > 2;

Now, in the standard SQL that should result in: 1,1,2,null,null,null... - 
because the assumption is that the operations on all rows are done at the same 
time.  So F4 will get null, because F3 was null etc. All other databases seem 
to do it - I tested HSQLDB, Firebird, SQLServer, Derby. 

Fortunately (for me) SQLite apparently stores rows after computation and the 
previous row's value is available when the next row is being computed. Also, 
the operations are executed in row order.  So I do get 1,1, 2, 3, 5, 8 ... 
(though some table/index declarations have to be right to achieve that for 
larger tables).

This is an _immensely_ useful functionality when one needs to compute various 
recursive functions. For example exponential moving average, used frequently in 
financials. Or Kalman filter (and many other filters) used in data smoothing 
and analysis.  Naturally, one could argue that databases are not for numerical 
computations. But it is very useful to be able to do simple computations in 
database,  especially if one does not have to write stored procedures or write 
external procedures for that - in essence getting something for free and 
without added complexity. So I hope that this feature will stay...


P.S. On somewhat related note: since sqlite is written in C - why it does not 
expose some basic functions from the standard C library (log, exp, sqrt, sin), 
at least optionally? Understandably, the idea is to keep it 'lite'. But, may 
be, an approach similar to ant and other packages could be applied to SQLite - 
that is there is a set of standard (but still simple) extensions, including 
things which may add some bulk, but do not require any large implementation 
effort.


= the test

DROP TABLE fib;
CREATE TABLE fib ( pos INTEGER, val INTEGER);
CREATE UNIQUE INDEX fib_ix ON fib(pos);
INSERT INTO fib VALUES (1,1);
INSERT INTO fib VALUES (2,1);
INSERT INTO fib VALUES (3,NULL);
INSERT INTO fib VALUES (4,NULL);
INSERT INTO fib VALUES (5,NULL);
INSERT INTO fib VALUES (6,NULL);
INSERT INTO fib VALUES (7,NULL);
INSERT INTO fib VALUES (8,NULL);
INSERT INTO fib VALUES (9,NULL);
INSERT INTO fib VALUES (10,NULL);
INSERT INTO fib VALUES (11,NULL);
INSERT INTO fib VALUES (12,NULL);
INSERT INTO fib VALUES (13,NULL);
INSERT INTO fib VALUES (14,NULL);
INSERT INTO fib VALUES (15,NULL);
INSERT INTO fib VALUES (16,NULL);
INSERT INTO fib VALUES (17,NULL);
INSERT INTO fib VALUES (18,NULL);
INSERT INTO fib VALUES (19,NULL);
INSERT INTO fib VALUES (20,NULL);
UPDATE fib SET -- compute fibonacci numbers
val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
   (SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
WHERE pos > 2;

select * from fib; -- show the results



-- 
___
Sign-up for Ads Free at Mail.com
http://promo.mail.com/adsfreejump.htm