Re: [sqlite] Embedded Database Functions

2006-12-24 Thread Andrew Piskorski
On Sun, Dec 24, 2006 at 09:35:01AM -0600, John Stanton wrote:

> >There is even a starting grammar for you:
> >

> A compiler for a subset of PL/SQL would not be too arduous a project, 

If what you want is something like PL/SQL, it might be both easier and
better to retarget PostgreSQL's PL/pgSQL for SQLite opcodes, rather
than starting from scratch with your own Oracle PL/SQL style language.
(And PL/pgSQL's BSD license is compatible with SQLite's.)

Andrew Piskorski <[EMAIL PROTECTED]>

To unsubscribe, send email to [EMAIL PROTECTED]

[sqlite] deductive relational databases Re: [sqlite] selecting rows of the view via its position

2006-07-24 Thread Andrew Piskorski
On Mon, Jul 24, 2006 at 09:13:28AM -0700, hanno wrote:
> Subject: Re: [sqlite] selecting rows of the view via its position

> My real problem is, that I do all that stuff for my diploma's thesis in
> computerscience. I'm trying to show the abilities of relational datamodels
> in a deductive context. I'm trying to implement a rule base in sql that
> shall provide correct and complete analysis of chess situations as an
> example for deductive database driven monitoring.

Well, I'm no expert on this, but keep in mind that "Standard SQL" !=
"relational model".  They are related, but not necessarily the same.
(In many of his writings, C.J. Date complains in detail that SQL is
not properly relational.)

What you're trying to do sounds related to the (older) research work
on "deductive databases", e.g., Aditi.  All the various forward and
(particularly) backward chaining systems are probably at least
peripherally relevant.  E.g., the Prolog, Mercury, and Oz languages;
the Oz/Mozart and CLIPS toolkits; and various others too.

Also, (as far as my limited understanding of the underlying theory
goes) the relational model is fundamentally un-ordered.  I'm told
there is some academic work out there on other, more powerful models
which natively understand order.  I think it would make more sense to
call that a "vector relation model", but apparently what it's actually
called is a "set model"!  I have not read it, but this book seems to
cover that subject:

  "The Set Model for Database and Information Systems"
  by Mikhail M. Gilula

Some of the gurus of the vector-oriented languages (APL, K, J, A+,
etc.), particularly the K folks, have implemented relational-like
database systems.  I strongly suspect they grok this un-ordered
vs. ordered distinction, and are taking advantage of their vector
languages' native understanding of and optimization for ordered data.
I don't know whether or not that's related to Gilula's "set model"
above, though.

Ordered data is perhaps also related to column-store (rather than the
more usual row-store, like SQLite, Oracle, etc.) databases.  MonetDB
is one such open-source column-store database.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] selecting rows of the view via its position

2006-07-24 Thread Andrew Piskorski
On Mon, Jul 24, 2006 at 09:13:28AM -0700, hanno wrote:
> Hello Jay!
> Assigning the ids outside sql wouldn't be neccessary because I could
> easily see the predecessor and successors. You are completely right,
> when you suggest a sql-external solution.

Do you mean that you would like to do a query with an order by, and
then in each row for that same query, easily refer to columns from
OTHER (either earlier or later) rows in the ordered result set?

If so, what you want are the very useful SQL:2003 "OLAP" functions for
dealing with ordered data.  These include "lead" and "lag", among
others.  But, SQLite does not support that.  Oracle and DB2 do.
FYI, here are some links to further discussion of the SQL:2003 OLAP
functions:

  "SQL:2003 OLAP/windowing functions?", 2006-01-25

  "SQL Window/OLAP functions", Wed, 12 Oct 2005

  "Re: [sqlite] Slow SQL statements", Mon, 23 May 2005

  "Re: [sqlite] windowing functions != recursive functions",  Thu, 13 Oct 2005

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] sqlite3_free()

2006-06-27 Thread Andrew Piskorski
On Tue, Jun 27, 2006 at 04:14:37PM +0100, Christian Smith wrote:

> Anyway, it's not difficult to provide thread local storage. HP-UX's 
> netdb.h functions (gethostbyname etc.) are fully re-entrant despite 
> returning 'static' data, for example. Other UNIXs got hamstrung with 
> various getXbyY_r implementations, with horrible semantics.

Well yes, the *_r functions are often pretty ugly to use.  But they
work great if what you want to do is build your own thread local
storage version on top!

I've always assumed there's some good reason for the existence and use
of *_r functions rather than equivalent thread local storage versions,
although I've never been sure just what it is.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] SQLite Vs VistaDB - Comparison ???

2006-06-16 Thread Andrew Piskorski
On Fri, Jun 16, 2006 at 12:35:33PM -0400, [EMAIL PROTECTED] wrote:

> > Out of curiosity why won't flock() work?
> Process A wants to modify the database, so it flock()s
> the rows it needs to changes and starts changing them.
> But half way in the middle of the change, somebody sends
> process A a SIGKILL and it dies.  The OS automatically
> releases the flocks as process A dies, leaving the
> database half-way updated and in an inconsistent state,
> with no locks.
> Process B comes along and opens the database, see the
> inconsistent state, and reports database corruption.

Would it, at least in principle, be feasible to have Process B then
take a lock (hm, which lock?), notice somehow that A's transaction
failed without either committing or rolling back, read the rollback
journal written earlier by Process A, and rollback A's half-done work?
What in practice makes that not a good idea?

Would using a non-overwriting MVCC storage layer a la PostgreSQL (but
still using client SQLite processes only, no client/server
arrangement) make any of the above easier or better?

Note, I'm not suggesting that you should implement anything like this
in SQLite, I'm just curious in general...

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Memory DB: Load from file

2006-06-05 Thread Andrew Piskorski
On Mon, Jun 05, 2006 at 12:05:52PM -0700, Dave Gierok wrote:

> Is there a way to load a Sqlite file based DB and then specify we want
> that to go into memory?  Or is there a more efficient way to create our
> memory DB?

You could use attach to copy from an on-disk SQLite database:
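
A minimal sketch of the attach-and-copy approach, shown via Python's
sqlite3 module; the file path and table name are hypothetical.  Note
that CREATE TABLE ... AS SELECT copies the data but not indexes or
constraints, which you would have to recreate separately.

```python
import os
import sqlite3
import tempfile

# Build a throwaway on-disk database to copy from.
path = os.path.join(tempfile.mkdtemp(), "source.db")
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
disk.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",)])
disk.commit()
disk.close()

# Open a :memory: database, ATTACH the file, and copy the table over.
mem = sqlite3.connect(":memory:")
mem.execute("ATTACH DATABASE ? AS src", (path,))
mem.execute("CREATE TABLE t AS SELECT * FROM src.t")
mem.execute("DETACH DATABASE src")
```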

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] SQLite as R data store

2006-05-26 Thread Andrew Piskorski
As an R user, I'm happy to see the project below; it could become a
very handy use of SQLite.

I suspect it's well out of scope for that project, but it would be
particularly cool to eventually see some of the integration go the
other way, and allow use of SQLite's SQL engine to manipulate R data
frames (which are tables, basically) from within R, as that would be
much more powerful than R's rather limited set of data frame
manipulation functions.

- Forwarded message from Douglas Bates <[EMAIL PROTECTED]> -

From: "Douglas Bates" <[EMAIL PROTECTED]>
Date: Thu, 25 May 2006 11:16:12 -0500
Subject: [Rd] R Project accepted for the Google Summer of Code

We are very pleased that an R project has been selected as one of the GNU
projects for the Google Summer of Code 2006 (

Miguel Angel R. Manese, an M.S. Statistics student at the University of
the Philippines, will be working with Douglas Bates and Brian Ripley on a
project entitled `SQLite Data Frames for R' to let R store very large
datasets in an SQLite database transparently, implementing primitive
operations for data stored in SQLite so that they behave exactly like
ordinary data frames to the users.  It is likely that the project will
result in the first instance in an R package (which may in due course
become part of the tarball).

Congratulations, Miguel!

- End forwarded message -

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] updating SQLite to implement The Third Manifesto

2006-03-10 Thread Andrew Piskorski
On Fri, Mar 10, 2006 at 04:37:36PM -0800, Darren Duncan wrote:

> 3.  There is no such thing as a NULL.
> 3.1  All logic is 2VL (true, false) not 3VL (true, false, unknown).

There is no such thing as null, really?  So, when you do an outer join
between two tables, which in SQL would produce null columns in the
result set, what do YOU propose producing instead of those nulls?

Perhaps I missed it, but in my brief reading of some of Date's work, I
never saw him answer that question.

> 3.4  Missing data can be either represented with the data type's 
> empty value, or a table column that may possibly be unknown can be 
> split into a separate related table, that only has records when the 
> value is known.
> 3.5  All variables default to a reasonable valid value for their type 
> if not explicitly set, such as the number zero or the empty string.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] performance statistics

2006-03-01 Thread Andrew Piskorski
On Wed, Mar 01, 2006 at 10:53:12AM -0500, [EMAIL PROTECTED] wrote:
> If you use READ COMMITTED isolation (the default in PostgreSQL)

> If it is a problem,
> then you need to select SERIALIZABLE isolation in PostgreSQL
> in which case the MVCC is not going to give you any advantage
> over SQLite.

Is that in fact true?  I am not familiar with how PostgreSQL
implements the SERIALIZABLE isolation level, but I assume that
PostgreSQL's MVCC would still give some advantage even under
SERIALIZABLE: It should allow the readers and (at least one of) the
writers to run concurrently.  Am I mistaken?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Managing trees in the database

2006-02-19 Thread Andrew Piskorski
On Sat, Feb 18, 2006 at 05:34:45PM -0800, Jim Dodgen wrote:

> I really miss the oracle "connect by" operator. I first used it for a 
> postal application back in 1992. I am surprised that this feature has 
> not made it into to the standard or any other RDBMS.  Maybe  we  should 

Although useful, Oracle's connect by feature is widely considered to
be a flawed design, which is part of the reason few other databases
support it.  The SQL standard specifies a different approach, which is
said to be similar to DB2's "recursive SQL".
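
For reference, the standard's approach is the recursive common table
expression.  Here is a sketch via Python's sqlite3, assuming a SQLite
new enough to have WITH RECURSIVE (added in 3.8.3, well after this
post); the employee table is invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, boss INTEGER, name TEXT)")
con.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(1, None, "alice"), (2, 1, "bob"),
                 (3, 1, "carol"), (4, 2, "dave")])

# The recursive CTE seeds with the root rows, then repeatedly joins
# the table against what has been found so far, tracking the depth.
rows = con.execute("""
    WITH RECURSIVE tree(id, name, depth) AS (
        SELECT id, name, 0 FROM emp WHERE boss IS NULL
        UNION ALL
        SELECT e.id, e.name, t.depth + 1
        FROM emp e JOIN tree t ON e.boss = t.id
    )
    SELECT name, depth FROM tree ORDER BY depth, name
""").fetchall()
```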

Hm, here are a bunch of links on "Hierarchical data in RDBMSs", which
I just stumbled across:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Managing trees in the database

2006-02-17 Thread Andrew Piskorski
On Thu, Feb 16, 2006 at 09:22:13PM -0800, SanjayK wrote:
> Fascinating! I got another reply too by email pointing to the book by Joe
> Celko.

Also try searching the SQLite archives, e.g.:

There are probably many other relevant posts as well.  (Unfortunately,
the search functions on the existing mailing list archives seem to be
rather poor; you can search for "Hierarchical", "connect", or "by",
but not for the phrase "connect by".)

I'm not really familiar with that "Tree Sortkey" technique, but here
are a few more links about it, (which in turn include some links to
other techniques as well):

I think another name for the basic "Tree Sortkey" technique is the
"m-vgID method" as described by Miguel Sofer:

Reportedly (according to Dan Wickstrom in 2001, who I'm pretty sure
knew what he was talking about):

  "This method provides the flexiblity of the nested set model, and it
  doesn't suffer from the same performance problems on insert/update
  operations. In addition, the implementation is simpler."
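
As I understand it, the gist of the sortkey family of techniques is
roughly this (a sketch with invented keys and table names, not Sofer's
exact scheme): each node's key is its parent's key plus a fixed-width
suffix, so a plain ORDER BY walks the tree depth-first and a prefix
match selects a subtree.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE node (sortkey TEXT PRIMARY KEY, label TEXT)")
# Each key is the parent's key plus a fixed-width sibling counter.
con.executemany("INSERT INTO node VALUES (?, ?)", [
    ("001",         "root"),
    ("001.001",     "child A"),
    ("001.001.001", "grandchild A1"),
    ("001.002",     "child B"),
])

# Plain string ordering on the key yields a depth-first tree walk.
rows = con.execute("SELECT label FROM node ORDER BY sortkey").fetchall()

# A subtree is just a prefix scan (real implementations take care to
# guard the prefix boundary; this is the simplest form).
sub = con.execute(
    "SELECT label FROM node WHERE sortkey LIKE '001.001%' ORDER BY sortkey"
).fetchall()
```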

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Possible feature request 'Raw Row' representation

2006-02-17 Thread Andrew Piskorski
On Thu, Feb 16, 2006 at 06:09:15PM -0800, w b wrote:

> At the moment when I pull back the row. I package it up in to a
> network byte order representation and send it on its merry way. The
> recv's of course unpack the message and then refresh their local
> information with the information sent, again in a SQLITE DB having
> the same table structure as what it originated from.
> So I was thinking (Dangerous with out too much coffee!)  if the DB
> format is platform neutral is there a way to obtain a raw image of
> that row from SQLITE rather than performing the sqlite3_column_XXX
> for each field returned.

That might be an interesting or useful technique for other uses as
well, say if you were trying to make clusters of SQLite engines act as
one consistent database.  Your "raw row" idea sounds at least
superficially related to what the old Postgres-R project did for
multi-master replication.

In Postgres-R, basically each node would process the SQL for a
transaction, then ship a much lower level representation of that
transaction to the other nodes, using one of the academic group
communication toolkits, either Spread or (I think) Ensemble.  Since
for many transactions applying this lower level representation was
much more efficient than processing the original SQL, this gave some

Bettina Kemme's paper from 2000, "Don't be lazy, be consistent:
Postgres-R, a new way to implement Database Replication." describes
the approach:

Postgres-R was research-ware based on an ancient 6.x version of
PostgreSQL (from before MVCC!), and AFAIK no one ever drummed up
enough interest and manpower to actually finish forward porting it to
a more recent PostgreSQL version.  But it seemed interesting.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] :memory: and sessions with PHP

2006-02-14 Thread Andrew Piskorski
On Wed, Feb 15, 2006 at 08:31:17AM +1300, CrazyChris wrote:
> We may be at crossed paths...  I'm wanting to save the :memory: database to
> the session, not the other way round, so that when the 2nd page loads, the
> :memory: database can be recreated and available as it was on the last page
> load.

What is a "session"?  I assume that's some special PHP or Apache
feature, but which?  How is a "session" implemented really?
Inter-process shared memory on Unix?  Process-wide memory in a
multi-threaded Apache 2.x build?  What? 

Having an in-memory database (or in-memory anything) persist across
multiple page hits in a web server is certainly feasible, but for
anyone not using the exact same tools as you, you probably want to
give more background - basically, provide a mapping between
tool-specific jargon like "session", and more widely understood
programming terms.

> The advantage is that after some time, the session is deleted
> automatically by the server and the database goes with it, so short term,
> high-intensity data can be stored and queried quickly in :memory: and the
> add/edits remain through the entire user experience.

I don't see why that is any real advantage.  Periodically purging an
in-memory database of old values should be rather trivial.
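
For example, assuming some hypothetical session_data table with a
last-touched timestamp (the schema here is entirely made up), the
periodic purge is a single DELETE:

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE session_data (key TEXT, value TEXT, touched REAL)")
now = time.time()
con.executemany("INSERT INTO session_data VALUES (?, ?, ?)",
                [("old", "x", now - 7200), ("fresh", "y", now)])

# Expire everything not touched within the last hour.
con.execute("DELETE FROM session_data WHERE touched < ?", (now - 3600,))
remaining = [r[0] for r in con.execute("SELECT key FROM session_data")]
```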

> An alternative is to create the :memory: database and populate it from
> session data each time, then save back to session on script close.

That sounds like a bizarre hack.  (But then I don't know what your
sessions really are, nor do I really understand your application
requirements, so perhaps I am missing something.)

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Simple ATTACH/memory database question

2006-02-14 Thread Andrew Piskorski
On Tue, Feb 14, 2006 at 10:16:14AM -0500, Kervin L. Pierre wrote:

> We are getting ready to start using in- memory database to cache
> sqlite reads/writes in effort to improve speed.  For background,

> Since we read much more than we write, reading from memory should
> improve.

Shouldn't SQLite's built-in cache for disk-backed databases already
accomplish that for reads?  Have you tested the actual performance?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] R: [sqlite] Snapshot database creation performance

2006-02-07 Thread Andrew Piskorski
On Tue, Feb 07, 2006 at 08:51:43AM -0500, Teg wrote:

> My application uses compressed data (gzip) but, the tradeoff to small
> data files is exceptionally heavy CPU usage when the data is
> decompressed/compressed.

Incidentally, the MonetDB folks have done research on that sort of
thing.  In their most recent project, "X100", they keep the data
compressed both on disk AND in main memory, and decompress it only in
the CPU cache when actually manipulating values.

They do that not primarily to save disk space, but to reduce the
amount of memory bandwidth needed.  Apparently in some cases it's a big
speed-up, and shifts the query from being memory I/O bound to CPU
bound.  Of course, in order for that to work they have to use very
lightweight compression/decompression algorithms.  Gzip gives much
better compression, but in comparison it's extremely slow.

Probably not immediately useful, but it seems like interesting stuff:

  "MonetDB/X100 - A DBMS In The CPU Cache"
  by Marcin Zukowski, Peter Boncz, Niels Nes, Sándor Héman

Btw, apparently the current stable version of MonetDB is open source
but they haven't decided whether the X100 work will be or not.

Googling just now, there seems to have been a fair amount of research
and commercialization of this sort of stuff lately, e.g.:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Snapshot database creation performance

2006-02-07 Thread Andrew Piskorski
On Tue, Feb 07, 2006 at 07:43:44AM -0500, Andrew Piskorski wrote:
> On Tue, Feb 07, 2006 at 12:52:08PM +0100, Clinco, Michele wrote:

> > My program is written in .Net and the compression routines I'm using are
> > stream based, so I need to create a memory stream from the internal
> > buffers that can be used by the compression routine...
> Do you mean you wish to directly access SQLite's in-memory data
> structures, rather than using a SQL query to get the data?  Why?

Oh, never mind, I wrote too soon.  I realize now that you want an
in-memory representation of a SQLite database file, which you then
gzip or whatever, all still in memory, and then transfer over the
network to whomever wants that file.  And you want to know if there's
some way to do that which offers more programmatic control than the
ramdisk filesystem you're using now.

Essentially, you want a way to dynamically control the size of the RAM
disk.  But it sounds like you don't necessarily need ALL the
facilities of a normal file-system, so you're wondering if perhaps you
could implement something more purpose-specific yourself rather than
using that ramdisk driver.

Interesting, but I don't know the answer.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Snapshot database creation performance

2006-02-07 Thread Andrew Piskorski
On Tue, Feb 07, 2006 at 12:52:08PM +0100, Clinco, Michele wrote:

> The question is:
> Is there a way that allows to create the database in memory (this is
> present: use :memory: as file name) and to use the allocated memory
> before it is deallocated when database is closed?

I'm confused, why do you want to do that?  Use the memory how?

If you need a lot of memory for other purposes after you're done with
the SQLite database, what's wrong with just letting SQLite free the
memory, and then malloc'ing as much memory as you want?

> My program is written in .Net and the compression routines I'm using are
> stream based, so I need to create a memory stream from the internal
> buffers that can be used by the compression routine...

Do you mean you wish to directly access SQLite's in-memory data
structures, rather than using a SQL query to get the data?  Why?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] More benchmarks

2006-02-07 Thread Andrew Piskorski
On Tue, Feb 07, 2006 at 07:06:26AM +0100, Nemanja Corlija wrote:
> On 2/7/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:

> > Did you happen to do an analyze?
> Nope. All databases are run as default as possible. And, they all get
> same scripts to execute.

Then your results for PostgreSQL are utterly meaningless.  (And in
this case, the poor performance reflects poorly on you, the DBA, not
on PostgreSQL.)

> > What changes have you made to the default postgresql.conf?
> None.

Then your test results are bogus.  Last I heard, the default values in
postgresql.conf were intended to simply work AT ALL on the widest
possible range of hardware, operating systems, etc., and are NOT
recommended values for any actual production use.

Yes, that sounds very foolish of the PostgreSQL folks to me too, but
there you have it.  Using PostgreSQL properly REQUIRES that you modify
those settings.

> Sure, I could do that. But then I'd also need to tune all other
> databases to make things fair and that's not really what I intended to
> do here. I want to keep things as "out of the box" as possible.

The above is not exactly "tuning", it is basic "Running PostgreSQL
101" type stuff.

Look at it this way: Different databases have different installation
requirements.  Editing postgresql.conf and collecting statistics with
vacuum analyze are simply part of the required install procedure for
PostgreSQL.  If you don't do the basic stuff like that, your database
is simply misconfigured, and any performance results you generate are
worthless - because in the real world, NO ONE with any clue at all
would ever run their database that way.

Minimally, you need to install and configure each of the databases
you're benchmarking in the manner expected of a competent but
non-expert user of that tool.  Naturally this varies for different
databases.

If you find the process of properly installing and configuring the
database software overly complicated or poorly documented, then that's
a perfectly legitimate complaint, but it has nothing to do with
performance benchmarking.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] proposal for improving concurrency in SQLite

2006-01-07 Thread Andrew Piskorski
On Sat, Jan 07, 2006 at 12:33:08AM -0500, Shailesh N. Humbad wrote:
> I wrote an article suggesting a method for improving concurrency in 
> SQLite.  The main reason I wrote this article is so I can understand 
> SQLite better, not because my proposal is new or usable.  I am not an 
> expert in SQLite or databases, so I would appreciate any comments or 
> criticisms.  Here is the link:

Your "Prior SQLite Concurrency Proposals" section is deficient, I
recommend a lot more searching in the SQLite list archives.  Read at
least the threads which contain these messages:

Particularly interesting were Doug Currie's April 2004 ideas for
achieving MVCC semantics with table level locking for writes, by
integrating shadow paging into the BTree layer:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] uSQLiteServer Source code available

2005-11-12 Thread Andrew Piskorski
On Sat, Nov 12, 2005 at 05:24:47PM -0700, [EMAIL PROTECTED] wrote:


> 4) Am I doing the right thing? Of course I think the uSQLiteServer is
> the best thing since sliced bread, but then it was designed to meet my
> criteria :-) OTOH reception has been mixed. I have had a couple of nice
> replies on this list but overall feedback has been lukewarm to icy. I

Well, it can't possibly be a BAD thing to have your uSQLiteServer code
out there and available, so I fully encourage you to keep at it!  :)
Your focus on simplicity and ease of writing client code could
certainly be valuable to many people.  I suspect the lukewarm reaction
is more that the people who might most benefit from your project just
haven't noticed it yet.

What might help interest and uptake is to give us more detail on why
YOU found it so useful.  For example, in your use of uSQLiteServer on
embedded systems, why didn't you just use an existing client-server
database such as PostgreSQL?
You needed something smaller footprint?  You were using a weird client
platform for which there was no working PostgreSQL client code at all,
and you wanted something simpler to implement?  You had to also run
the database server on a very resource constrained embedded system,
rather than a general purpose server box?  Or?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] CHECK constraints

2005-11-02 Thread Andrew Piskorski
On Wed, Nov 02, 2005 at 06:30:51PM -0500, [EMAIL PROTECTED] wrote:
> In a CHECK constraint, if the expression is NULL (neither true
> nor false) does the constraint fail?

At least in Oracle, no, your example insert works fine.  If you want
the insert to fail, you need to add a "not null" constraint as well.
Nulls are always allowed unless you have a not null constraint.

I'm not certain, but I believe this is standard behavior in all SQL
RDBMSs that support constraints, not just Oracle.
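
A quick sketch of the same behavior in SQLite itself, via Python's
sqlite3 (assuming a SQLite new enough to enforce CHECK constraints,
which arrived in 3.3.0): the NULL insert passes the CHECK, and only an
explicit NOT NULL rejects it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER CHECK (x > 0))")
con.execute("INSERT INTO t VALUES (NULL)")  # NULL check result: row is allowed

con.execute("CREATE TABLE t2 (x INTEGER NOT NULL CHECK (x > 0))")
try:
    con.execute("INSERT INTO t2 VALUES (NULL)")
    failed = False
except sqlite3.IntegrityError:  # the NOT NULL is what actually rejects it
    failed = True
```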

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] type confusion

2005-11-02 Thread Andrew Piskorski
On Wed, Nov 02, 2005 at 10:45:54AM -0600, Jay Sprenkle wrote:
> Subject: Re: [sqlite] Proposed 3.3.0 changes. Was: 5/2==2

> > So don't make the field 10 bytes long, make it only 8.  SQLite won't
> > care a bit, and will give you the value in whatever format you want.
> Then it's not type agnostic any more. You now have an 8 byte numeric
> and a 10 byte numeric. Which is no different than integer and real.

Wrong, and obviously so.  I mean really, how many bytes LONG a value
is must DETERMINE whether it represents an integer or a floating point
number?  Must?  In what bizarre alternate universe is that true?

> > From SQLite's standpoint it is agnostic.  SQLite neither knows nor cares
> > what is actually stored in the column; that's up to your application to

> The only way for this to work will be to remove all mathematic
> operations.  You can't make it agnostic of types if you have more
> than one type and allow operations to be performed on the types.

Again wrong.

(Note that deciding to do math on values, even if you do it via the
"+" operator in a SQL query, *IS* part of the application.  It's
certainly not part of the data storage layer, at any rate.)

Jay, it's painful to see you put your foot in your mouth over and over
again.  Please learn enough so that you stop sticking it in there.

E.g., Tcl can be reasonably described as type agnostic, yet it can do
math.  Since DRH is also a member of the Tcl Core Team, presumably Tcl
was a design influence on SQLite.  It might be useful to look at it
for comparison.

In Tcl, each function decides how to interpret its arguments, whether
as integers, strings, floats, or whatever.  But the Tcl runtime
storing, copying, or doing whatever else with those values basically
does NOT care about their "type".  You can assign any value to any
variable, etc.  Thus it makes sense to describe Tcl as "type
agnostic".  That seems to have certain parallels to SQLite's manifest

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] bignums and the numerical tower

2005-11-02 Thread Andrew Piskorski
On Wed, Nov 02, 2005 at 06:43:31AM -0600, John Stanton wrote:

> Subject: Re: [sqlite] Proposed 3.3.0 changes. Was: 5/2==2

> You misunderstood the comment.  A schoolchild expects to see 5/2 give 
> exactly 2.5, not an approximate representation of 2.5 in a floating 

> I question why we have to use binary integers based on the word size of 
> particular generations of computers when we are storing data.  We have 

Talk about feature creep...  None of that is on the table in
Dr. Hipp's recent proposed changes.  It also sounds likely to conflict
with one of Dr. Hipp's principal goals for SQLite, "simplicity" - and
he means of implementation, not just of use.  But, it might still be
worth discussing.

First of all, what is required to meet your goal?  (I'm not
sufficiently familiar with the issues to know off the top of my head,
but since you're interested in this...)  Is adding bignum support (via
GMP or any number of other such libraries) sufficient, or is more than
that needed to properly represent exact fractions, complex numbers,
and the like?

Tcl just integrated bignum support on the Head, which presumably will
ship with Tcl 8.5.  Does that support cover everything you want, or
not?  What about the "full numerical tower" provided by many Scheme
implementations?

If intersection of this sort of numerics with databases interests you,
you may want to get in touch with Jean-Claude Wippler.  I don't recall
what if any plans he has for bignums and the like, but the vector
oriented nature of his Vlerq and Metakit projects makes it likely that
he does indeed have plans, or at least ideas, along those lines:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Proposed 3.3.0 changes. Was: 5/2==2

2005-11-01 Thread Andrew Piskorski
On Tue, Nov 01, 2005 at 01:45:04PM -0600, Jay Sprenkle wrote:

> > > I think it's a bit misleading to call the library "SQL-Lite: if
> > > you're going to redesign and not be like SQL. What do you think
> > > about creating a separate project for a fast light database
> > > engine that's not "SQL Like"?
> >
> > SQLite is very much SQL.  It only deviates from the standard
> > to fix bugs in the original design of the standard.
> If you're going to become less compliant perhaps it would be less
> misleading to remove the "SQL" from the project name. I'm not saying
> either is a bad idea, just that the name shouldn't be misleading.

Your name "Jay" appears to derive from the Sanskrit "jaya", meaning
"victory".  I find this a bit misleading, as clearly anyone regularly
wasting time with suggestions like yours is unlikely to achieve
victory in anything.  I therefore suggest that you change your name
from "Jay" to something less misleading.

Just a thought.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Convert SQLite to Java

2005-10-24 Thread Andrew Piskorski
On Mon, Oct 24, 2005 at 09:58:53PM +0100, Steve O'Hara wrote:

> I've been worn down by my quest for a comparable product to SQLite
> written in pure Java and have come to the conclusion that it doesn't
> exit.  I've looked at Derby, QED, Axion, blah blah and I'm not too

> My next port of call is to look at the feasibility of converting the
> SQLite source into Java!  

Uh, why isn't your first choice simply, "Use SQLite from Java"?
Surely calling C libraries is something the Java folks have worked out
how to do many years ago?  Or what am I missing here?  What is the
attraction of "pure Java"?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] windowing functions != recursive functions

2005-10-13 Thread Andrew Piskorski
On Wed, Oct 12, 2005 at 09:05:26PM -0500, pilot pirx wrote:

> The windowing functions described in the link
> are different from recursive functions. 

Yes, I think you're right.  Your EMA example bugged me, so I fooled
with it, but I couldn't come up with any way to implement EMA using
plain SQL, even with the windowing/OLAP functions.

Some explanations of the EMA algorithm are here:

Looks like you don't even need real recursion for EMA (it is
tail-recursive); plain old iteration will do fine, as you can see from
the simple C++ code in the 2nd link above.  (Yet it can't be done in
SQL.  Well, we all knew that SQL isn't Turing complete; here's a
practical consequence of that, I guess.)
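
The iterative version is indeed trivial; here is a sketch in Python,
seeding the average with the first value (conventions for the seed
vary):

```python
def ema(values, coef):
    """Exponential moving average: ema[i] = coef*val[i] + (1-coef)*ema[i-1]."""
    out = []
    prev = None
    for v in values:
        # First value seeds the average; after that, plain iteration.
        prev = v if prev is None else coef * v + (1 - coef) * prev
        out.append(prev)
    return out
```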

> Now, for the recursive function like exponential moving average the
> definition is that ema(i+1) = val(i) * coef + ema(i) * (1-coef). That
> is I have to know the previous value of both EMA _and_ VALUE (while
> for moving avearage I need to know _only_ the previous value(s) of

It is quite possible to return previous (lagged) values of multiple
columns, NOT just one.  You do that with dense_rank, which I talked
about a back on May 23 on this email list:

  I always thought of that feature as "look-aside", as in, "Find the max
  of foo, but don't give me that, instead look aside and return me the
  corresponding value of bar next to it instead."

select max(playerid) keep (dense_rank last order by sb) as top_stealer
  from batting

The problem is that EMA needs to use the lag of the actual column it's
calculating, and there's just no damn way to do that in SQL.

Well, I didn't try it, but maybe it would be possible to kludge EMA
somehow using either Oracle's "connect by", or DB2's recursive SQL,
which sounds possible, and less hideous than connect by:

Here's the example I tried for EMA (on Oracle):

create table atp_ema (symbol varchar2(8)  ,day date  ,val number); 
insert into atp_ema values ('a' ,'2005-01-03' ,67); 
insert into atp_ema values ('a' ,'2005-01-04' ,77); 
insert into atp_ema values ('a' ,'2005-01-05' ,80); 
insert into atp_ema values ('a' ,'2005-01-06' ,82); 
insert into atp_ema values ('a' ,'2005-01-07' ,94); 
insert into atp_ema values ('a' ,'2005-01-10' ,81); 
insert into atp_ema values ('a' ,'2005-01-11' ,83); 
insert into atp_ema values ('b' ,'2005-01-03' ,27); 
insert into atp_ema values ('b' ,'2005-01-04' ,27); 
insert into atp_ema values ('b' ,'2005-01-05' ,20); 
insert into atp_ema values ('b' ,'2005-01-06' ,22); 
insert into atp_ema values ('b' ,'2005-01-07' ,24); 
insert into atp_ema values ('b' ,'2005-01-10' ,21); 
insert into atp_ema values ('b' ,'2005-01-11' ,23); 
variable coeff number; 
begin  :coeff := 0.66;  end; 

-- Correct algorithm, but can't run: 
select  v.* 
  ,(ema_term_1 + ema_term_2)  as ema 
  ,(avg(val) over (partition by symbol  order by day 
  rows between 3 preceding and current row))  as mean_4_day 
from ( 
  select  symbol ,day ,val 
,(:coeff * val) as ema_term_1 
,( (1 - :coeff) * 
   -- Of course ema is not defined here, so this fails with 
   -- 'ORA-00904: invalid column name': 
   nvl((lag(ema ,1) over (partition by symbol  order by day))
 ) as ema_term_2 
  from atp_ema 
) v 
order by symbol ,day ; 

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] SQL Window/OLAP functions

2005-10-12 Thread Andrew Piskorski
On Wed, Oct 12, 2005 at 05:12:05AM -0500, pilot pirx wrote:
> Subject: [sqlite] Please, please do _not_ remove this feature from SQLite...

> While using SQLite for some time (with R package,
> I did admire its functionality and speed. Then I did discover a
> hidden SQLite feature of immense usefulness - not available in other
> databases. SQLite can compute Fibonacci numbers! (I will explain why

Transaction visibility features do vary, although often it doesn't
matter anyway.  E.g., here's a discussion of how (at least as of early
2004), PostgreSQL's docs were quite confused about certain subtleties,
but what I find interesting, is this was still something that in
practice had never really mattered to the mostly hard-core RDBMS
programmers talking about it in that thread:

> val =  (SELECT h1.val FROM fib as h1 where pos = fib.pos - 1) +
>(SELECT h2.val FROM fib as h2 where pos = fib.pos - 2)
> WHERE pos > 2;

I don't see why this is such a great feature.  Without it, worst case,
you could still write a simple little loop which would issue one
update statement for each row, all within a single transaction.  No?
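A sketch of that fallback with Python's sqlite3 module, with the table layout assumed from the quoted UPDATE: one UPDATE per row, all wrapped in a single transaction.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE fib(pos INTEGER PRIMARY KEY, val INTEGER)")
# Seed: positions 1 and 2 hold 1, the rest are placeholders.
con.executemany("INSERT INTO fib VALUES (?,?)",
                [(i, 1 if i <= 2 else 0) for i in range(1, 11)])

with con:  # one transaction around the whole loop
    for pos in range(3, 11):
        con.execute(
            "UPDATE fib SET val = "
            " (SELECT val FROM fib WHERE pos = ?) + "
            " (SELECT val FROM fib WHERE pos = ?) "
            "WHERE pos = ?",
            (pos - 1, pos - 2, pos))

vals = [v for (v,) in con.execute("SELECT val FROM fib ORDER BY pos")]
```

Since each UPDATE commits only at the end of the `with` block, readers never see a half-computed table — which is the "single transaction" point made above.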

> This is an _immensely_ useful functionality when one needs to
> compute various recursive functions. For example exponential moving
> average, used frequently in financials. Or Kalman filter (and many

Vastly more useful for moving average and the like would be real
windowing/grouping functions, like Oracle's "analytic" functions.  I'm
not thrilled by their particular syntax, but the functionality is
INCREDIBLY useful.  (And on the other hand, I haven't thought of any
obviously better syntax, either.)

Hm, an amendment to the SQL:1999 spec added windowing support, and
SQL:2003 includes that, I think as features T611, "Elementary OLAP
operations" and T612, "Advanced OLAP operations".  Apparently Fred Zemke
of Oracle was the author of that SQL spec, and IBM also supported it,
so the SQL:2003 syntax and behavior is probably very similar (maybe
identical?) to what Oracle 8i, 9i, and 10g and IBM's DB2 already have.
PostgreSQL, as of 8.0, doesn't support it yet.

SQLite basically supports just SQL-92, it doesn't have any of these
newer SQL:1999 or SQL:2003 features, right?

Using SQLite in conjunction with a powerful statistical data analysis
programming language like R is an excellent example of a use where
windowing functions can be hugely helpful.  Unfortunately, I've never
had a compelling need to use SQLite for that, otherwise I'd probably
take a shot at adding support for the SQL:2003 Window/OLAP stuff.  :)

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-10 Thread Andrew Piskorski
On Sat, Sep 10, 2005 at 07:25:48PM -0400, D. Richard Hipp wrote:

> difficulties, I have now modified the query optimizer so
> that it will no longer reorder tables in a join if the join
> is constructed using the CROSS keyword.  For additional

This is a one-off to control one particular feature (table reordering)
of the optimizer.  Is it likely that in the future, perhaps as the
optimizer grows more complex and featurful, that programmers will want
to be able to give further such directives or hints to the optimizer?

If the answer is yes, then maybe it would make more sense to provide
an actual syntax or language for giving SQLite such hints, probably by
embedding them into specially formatted SQL comments (which is
Oracle's approach).

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] v3 number handling and relational design help...

2005-09-07 Thread Andrew Piskorski
On Wed, Sep 07, 2005 at 07:23:59AM -0700, Jay Siegel wrote:

> I'd like to create another table with
> "AverageSalaries" combining the ID's from (a) and (b)
> into a unique 8 byte ID.  I'm thinking that the
> columns would be "SuperID" (8 bytes integer created by
> combining each ID from the city table with the ID from
> each job table and "AveSalary" which is the average
> salary for that job in the given city (based on the

That seems bizarre.  Typical RDBMS practice would be to just use a
two-column composite primary key.

> ID's).  The reason for combining the ID's is to create
> a unique ID that can be a key for quick access.
> Again, this is a simplistic example but demonstrates
> the issue I'm dealing with among multiple tables.  Is
> this combining of ID's the best way to handle this
> "join" when performance is required?  Can an 8 bytes

Did you measure a serious performance problem when joining on two
separate columns?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Survey: NULLs and GROUP BY

2005-09-01 Thread Andrew Piskorski
On Thu, Sep 01, 2005 at 02:51:21PM -0400, D. Richard Hipp wrote:

Oracle9i Enterprise Edition Release

SQL> SELECT a, b, sum(c) FROM t1 GROUP BY a, b ORDER BY 3; 
         A          B     SUM(C) 
---------- ---------- ---------- 
         1          2          2 
                    2          4 
         1                     8 

> Can I get some volunteers to run the SQL shown below on various
> other SQL database engines and tell me what the output is?
>SELECT a, b, sum(c) FROM t1 GROUP BY a, b ORDER BY 3;
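For SQLite itself, the answer is easy to check from Python. The sample data below is made up, since the survey's INSERT statements are not quoted here; the point it demonstrates is that SQLite treats NULLs as equal for GROUP BY purposes, so all-NULL keys collapse into one group:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1(a, b, c)")
con.executemany("INSERT INTO t1 VALUES (?,?,?)",
                [(1, 2, 3), (1, None, 4), (None, None, 5), (None, None, 6)])

# The two (NULL, NULL) rows land in a single group, so their c values sum.
rows = con.execute(
    "SELECT a, b, sum(c) FROM t1 GROUP BY a, b ORDER BY 3").fetchall()
```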

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Multi-threading.

2005-07-27 Thread Andrew Piskorski
Peter Van Roy, Seif Haridi

A reasonable analogy to "threads vs. processes" (at least for Linux
folk) may be, "Gnome vs. KDE".  Now, "Gnome vs. KDE" is not a useless
question.  Although I personally hardly care, there must be real
differences between Gnome and KDE, and there are probably sound
engineering reasons for debating the relative merits of the two
designs, sometimes.

But "Gnome vs. KDE" really boils down to a bunch of more specific
questions about things each due, AND even the sum total of each of
those things are probably pretty trivial compared to, "Good GUI user
interface design and programs and toolkits to support it" - which is
the REAL problem that both Gnome and KDE are supposed to be solving.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Multi-threading.

2005-07-15 Thread Andrew Piskorski
On Fri, Jul 15, 2005 at 01:04:50PM -0400, Paul G wrote:

> the issue wasn't necessarily the thread implementation per se, but the fact
> that threads were treated as processes for scheduling purposes and hence
> scheduled with the regular process scheduler, which was not efficient for a
> large number of processes. these problems went away when ingo molnar's O(1)
> scheduler was adopted (not sure when it was merged into linus' 2.4, but

Interesting.  Then that may explain why I never heard any first hand
reports of notable Linux threading problems.  The people I tended to
talk to were probably all running with well under 100 threads per
process, and only 2 or 3 such processes at most.  Presumably even the
earlier lousy process scheduler could handle that ok.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] client/server

2005-06-07 Thread Andrew Piskorski
On Tue, Jun 07, 2005 at 11:52:30AM -0300, Martín Schamis wrote:
> Hi, Eugene and everybody.
> The problem I have is this, I'm implementing a web application that will
> require up to 300 users writing at

Then why are you even looking at SQLite?  IMNSHO, unless you have some
compelling reasons otherwise, you should default to using PostgreSQL
for that sort of thing.  You haven't said much about what you're
really doing, but a large busy database-backed website with many
concurrent users is exactly the sort of OLTP niche where capable
client/server RDBMSs like PostgreSQL or Oracle shine.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] philosophy behind public domain?

2005-05-31 Thread Andrew Piskorski
On Tue, May 31, 2005 at 04:21:44PM -0700, Gerry Snyder wrote:

> Does anyone see a problem with releasing code under the user's option of 
> BSD or GPL? I much prefer the latter, so that added work (if 

Then you might as well just release it as BSD only in order to reduce
confusion, and encourage uptake by all the projects which refuse to
use any GPL code.

AFAIK anybody who wants to can take BSD code and fork it into a GPL
project (effectively a one-way process, once they start applying
GPL-only patches), so you're not taking anything away from GPL users
by releasing only under BSD.  AKA, BSD is "GPL compatible", and GPL
users are very much aware of that.

But BSD users are also quite aware that the BSD -> GPL process is one
way, so if you're willing to release your code under BSD at all, why
take the risk of confusing or alienating BSD-only users?  Personally,
I would release as either GPL or BSD, not both.

Dual-licensing can make sense sometimes (aka, give useful options that
otherwise do not exist), but I don't think it does in the case of GPL
+ BSD.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] List of latest changes in CVS

2005-05-06 Thread Andrew Piskorski
On Fri, May 06, 2005 at 08:59:28AM -0400, Thomas Briggs wrote:
>Is there a way to see a list of the latest changes made in CVS?  I

Use this:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] High throughput and durability

2005-04-11 Thread Andrew Piskorski
On Mon, Apr 11, 2005 at 03:59:56PM +0200, Thomas Steffen wrote:
> I have a problem where I need both a high throughput (10%
> write/delete, 90% read) and durability. My transactions are really
> simple, usually just a single write, delete or read, but it is
> essential that I know when a transaction is commited to disk, so that
> it would be durable after a crash.

Why do you want to do this with SQLite, rather than something like
PostgreSQL?  Sounds like you have both concurrent writers AND
concurrent readers, all at the same time, which is going to totally
hose your performance on SQLite.  Do you have some hard constraint
that requires an embedded in-process database library like SQLite,
rather than a client-server RDBMS?

Even if you MUST have an embedded db, I would still test against
PostgreSQL, as that should tell you whether MVCC can solve your
problems.  Embedded databases that support MVCC and/or other
techniques for much better concurrency do exist, you just might have
to pay for them.

You didn't mention your transaction rate, nor what your application
even is, but general purpose RDBMSs are specifically designed to
support transaction processing adequately, so unless your transaction
rates are truly huge, an RDBMS with MVCC (PostgreSQL, Oracle) would
probably work fine for you.

I suspect it's not your total transaction load that's a problem, it's
simply that SQLite doesn't support the concurrency you need.  Of
course, if that's the case, one solution would be to add MVCC support
to SQLite, as has been discussed on the list in the past.  That would
be cool.  :)

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] beat 120,000 inserts/sec

2005-04-09 Thread Andrew Piskorski
On Sat, Apr 09, 2005 at 11:49:17AM -0400, Al Danial wrote:

> Thanks to everyone who posted performance numbers and machine
> setup info.  Some results were counterintuitive (I'd have guessed
> SCSI drives would come out on top) but many variables are at work

It is basically impossible for a 10k or 15k rpm SCSI disk to be
genuinely, substantially slower for writes than a 7200 rpm IDE disk.
Most likely, the SCSI disks had write caching turned off (so every
write really hit the platter), and the IDE disks had it turned on.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] idea: sqlite3_begin() and sqlite3_end() ?

2005-04-01 Thread Andrew Piskorski
On Fri, Apr 01, 2005 at 01:56:16PM -0500, Ned Batchelder wrote:

> I decided for my own work that magic nested transactions are a bad idea.  To
> properly handle errors and rollback the database, you need to know where the
> edges of the transaction really are.  Fully-supported nested transactions

Sounds like excellent advice to me.  AKA, don't lie to the application
about what the transaction is really doing.

Some bigger databases (I forget which) give you real nested
transactions, some (like Oracle) don't, but give you savepoints
instead, which are equivalent but with slightly different syntax.

I think I've only ever used savepoints in one Oracle project, and that
only in 4 or 5 places in one module of a fairly large and complicated
project.  So, although savepoints or nested transactions are nice, it
seems entirely reasonable to me that a small and simple database like
SQLite doesn't have them at all.  Heck, I don't think even PostgreSQL
had them until version 8.0.
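(SQLite did eventually grow savepoints, in release 3.6.8, well after this message.)  The savepoint idea itself — rolling back part of a transaction without abandoning all of it — can be sketched with Python's sqlite3, using isolation_level=None so the module does not interfere with manual transaction control:

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / SAVEPOINT / COMMIT below are exactly what we issue by hand.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE t(x INTEGER)")

con.execute("BEGIN")
con.execute("INSERT INTO t VALUES (1)")
con.execute("SAVEPOINT sp1")          # nested "sub-transaction" starts here
con.execute("INSERT INTO t VALUES (2)")
con.execute("ROLLBACK TO sp1")        # undo only the work since the savepoint
con.execute("COMMIT")                 # the first INSERT survives

rows = con.execute("SELECT x FROM t").fetchall()
```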

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] atomic db replacement

2005-03-18 Thread Andrew Piskorski
On Thu, Mar 17, 2005 at 08:33:03PM -0700, Ara.T.Howard wrote:
> On Sat, 12 Mar 2005, Andrew Piskorski wrote:
> >On Sat, Mar 12, 2005 at 10:03:25AM -0700, Ara.T.Howard wrote:
> >
> >>does anyone have a strategy for doing massive updates to a db and atomicly
> >>replacing it in a multi-process situation?
> >
> >Why would you want to do that?  SQLite properly supports transactions, so
> >aren't you better off just using those?  What scenario are you working with
> >here?  Do you have a large number of readers that MUST not block while a
> >massive/slow update is done to the database, or something like that?
> precisely - active web site doing both reads and writes: long running update
> to perform.

If besides that one slow update you had ONLY READS (and those reads do
not care which version of the data they get, old or new), I'd say
sure, fine, an atomic "swap in a totally new db" operation might be
worth trying.

But for a database with active reads AND sometimes also writes, all
while that big slow update is going on?  No, even if your special
procedural scheme below works, the whole idea seems misguided.  If you
really must have the possibility of other writes while that huge slow
update is happening, use an RDBMS which is actually designed to
support that scenario, like PostgreSQL.  SQLite is the wrong tool.

Your proposed procedure below is an ugly hack.  I do not know the
particulars of your situation, so maybe for you it really is justified
over the correct "use PostgreSQL" alternative.  But I would think long
and hard about that.

(And no, I don't know what the best/simplest way to do an "atomically
replace whole db" step is with SQLite.  That would be useful to know.)
> obviously not too much (but maybe something) will be happening at night.  my
> plan was
>   lock db
>   take copy
>   note mtime
>   release lock
>   perform updates
>   lock db
>   if mtime has been updated
> retry
>   else
> perform atomic replacement
>   end
> perhaps this is simply impossible with sqlite?
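One hedged sketch of the "atomic replacement" step itself in Python: build the new database under a temporary name on the same filesystem, then rename it over the live file (rename is atomic on POSIX). The big caveat the discussion above implies still holds: any process that already has the old file open — including SQLite's own file locks — keeps using the old inode, so this is only safe when no connection holds the database open across the swap.

```python
import os
import sqlite3
import tempfile

db_dir = tempfile.mkdtemp()
live = os.path.join(db_dir, "app.db")

# Existing "live" database (empty table, as a stand-in).
con = sqlite3.connect(live)
con.execute("CREATE TABLE t(x INTEGER)")
con.commit()
con.close()

# Build the replacement under a temp name on the SAME filesystem,
# so the final rename cannot degrade into a copy.
tmp = os.path.join(db_dir, "app.db.new")
con = sqlite3.connect(tmp)
con.execute("CREATE TABLE t(x INTEGER)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()
con.close()

os.replace(tmp, live)  # atomic swap; connections opened AFTER this see new data

rows = sqlite3.connect(live).execute("SELECT x FROM t").fetchall()
```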

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Re: [unclassified] Re: [sqlite] getting rid of dirty SQLITE_BUSY workaround

2005-03-15 Thread Andrew Piskorski
On Tue, Mar 15, 2005 at 01:24:51PM -0500, Ned Batchelder wrote:

> The best solution is to rollback entire transactions when the database is
> busy.  This means structuring your code so that all transactions can
> rollback and retry.  In my experience, this is necessary (though *far* less
> frequently) even with the "big boy" databases.  See, for example,

Yet another reason why MVCC is such a beautiful thing.  I've managed
to accidentally get Oracle to abort a transaction due to detected
deadlock only once or twice over the years, and I had to work pretty
hard to do that - lots of complicated PL/SQL code which was taking row
and/or table locks in differing table orders in different places.

And that, of course, was an application bug.  (You must take all locks
in the same table-by-table order, everywhere.  Too bad that the
RDBMS doesn't give you any real tools to help you verify that.)  I
don't remember ever seeing deadlock for any other reason in Oracle,
and PostgreSQL (which has effectively the same MVCC model) should be
the same.  Databases using lock-based strategies rather than MVCC are,
of course, more susceptible to deadlock problems under high
concurrency.

On the other hand, Oracle has more than once given me the joy of
aborting my big huge special purpose transaction with a "not enough
rollback space" error.  Oops, gotta turn those annoyingly manual dba
knobs some more.  PostgreSQL is probably better in that respect, as
its "rollback" space is effectively in the table itself, which will
just keep getting bigger and bigger as necessary.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Re: [unclassified] Re: [sqlite] getting rid of dirty SQLITE_BUSY workaround

2005-03-14 Thread Andrew Piskorski
On Tue, Mar 15, 2005 at 12:21:15AM +0100, Thomas Lotterer wrote:
> On Sun, Mar 13, 2005, jed wrote:
> > [...] web applications fit well into the model of "many readers, one
> > writer", sqlite does this very well.
> > 
> Well, there might be web applications which are read-only from the web's
> view. But to be honest, most of them also call for occasional writes.
> Think of a simple address book. Also I think of uses like tracking
> session cookies which also use occasional writes. In all those cases

I have not tried this with SQLite, but if I was using SQLite for such
an app, I assume I would need to serialize all such writes through a
single thread or process.  E.g., in my web server, the connection
thread servicing the user request would send a message to my one
single db writer thread, saying, "Heh, please insert this data for
me!".  Then the conn thread would simply block until the db writer
thread replies with, "Yup, your transaction is committed."
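A minimal sketch of that single-writer pattern with Python's sqlite3 and a queue (the table and message names are invented for the demo):

```python
import os
import queue
import sqlite3
import tempfile
import threading

db_path = os.path.join(tempfile.mkdtemp(), "app.db")
requests = queue.Queue()

def writer_loop():
    # The one and only write connection lives in this thread.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS log(msg TEXT)")
    con.commit()
    while True:
        item = requests.get()
        if item is None:               # shutdown sentinel
            break
        sql, params, done = item
        con.execute(sql, params)
        con.commit()                   # durable before we acknowledge
        done.set()                     # "Yup, your transaction is committed."
    con.close()

threading.Thread(target=writer_loop, daemon=True).start()

# A connection thread sends its write and blocks for the acknowledgement.
done = threading.Event()
requests.put(("INSERT INTO log VALUES (?)", ("hello",), done))
done.wait()

# Any thread may read through its own connection.
rows = sqlite3.connect(db_path).execute("SELECT msg FROM log").fetchall()
```

Because all writes funnel through one connection in one thread, no writer ever sees SQLITE_BUSY from another writer; readers only contend briefly during commits.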

That is uglier than what you'd do with a real server-based RDBMS
like Oracle or PostgreSQL, but it should scale fine until you have
either:  One, very large numbers of hits on your simple and efficient
web app.  Or two, a complicated web app with many potentially long
running transactions, etc.  In the real world, the second concern is
much more likely to bite you than the first, and cries out for a more
capable, more general purpose database than SQLite.

It would be nice if SQLite had MVCC, which would let it scale much
further up into the PostgreSQL-like realm normally dominated by
client/server databases, but given the "simple, small, embedded" niche
that Dr. Hipp intended for SQLite, it's easy to see why adding MVCC
isn't any sort of priority, even if it could be done without making
the code much more complicated.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] atomic db replacement

2005-03-12 Thread Andrew Piskorski
On Sat, Mar 12, 2005 at 10:03:25AM -0700, Ara.T.Howard wrote:

> does anyone have a strategy for doing massive updates to a db and atomicly
> replacing it in a multi-process situation?

Why would you want to do that?  SQLite properly supports transactions,
so aren't you better off just using those?  What scenario are you
working with here?  Do you have a large number of readers that MUST
not block while a massive/slow update is done to the database, or
something like that?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] thoughts on a web-based front end to sqlite3 db?

2005-03-07 Thread Andrew Piskorski
On Mon, Mar 07, 2005 at 04:22:50PM -0500, Eli Burke wrote:

> So, I was wondering if any of the more opinionated among you would care
> to suggest an interface language. It'll be on a Linux box, presumably 

I qualify as opinionated, so:  Tcl.  The fact that Dr. Hipp supports
Tcl directly for SQLite is yet another bonus.

> running apache although I'm open to alternatives. The app itself uses

AOLserver.  Among other things, it goes very nicely with Tcl.

> Does anyone have alternative suggestions, or if you agree that Python Is 
> Good, would you suggest using APSW, pysqlite, or something else?

People I trust are of the opinion that Python is a Good Thing, but I
know nothing about one of its db APIs vs. another.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] ticket 1147

2005-02-28 Thread Andrew Piskorski
On Mon, Feb 28, 2005 at 11:16:33AM -0700, Robert Simpson wrote:

> Here are just a few things I can think of off the top of my head that I
> cannot do right now for a resultset, but that I *can* do with additional
> schema information:

Do you mean that you would like additional schema information added to
the system tables, so that you could separately figure this sort of
stuff out by querying them?

Or do you want this metadata returned with each and every resultset?
Meaning, essentially, have the db (optionally) return a metadata
resultset along with each normal data resultset.  Is there some
standard precisely specifying what this metadata resultset should look
like?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] ticket 1147

2005-02-28 Thread Andrew Piskorski
On Mon, Feb 28, 2005 at 09:58:15AM -0800, Tim McDaniel wrote:

> Given a specific SELECT statement, ADO.NET has the capability to
> automatically build the corresponding INSERT, UPDATE, and DELETE
> statements, so the user can insert/update/delete values/rows in the
> resultset and have those modifications sent back to the database.  But

> (I wrote the original ADO.NET SQLite wrapper on sourceforge)

Hm, off topic, but I'm curious:  Presumably ADO.NET does not take a
lock out on all those rows and wait around holding it while the human
user goes to lunch.  So, when the user changes values and then submits
them, does ADO.NET somehow correctly check that another transaction
has not modified those same rows in the meantime?  And what does it do
then, throw a "Someone else has changed your data in the db" error?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] ticket 1147

2005-02-28 Thread Andrew Piskorski
On Mon, Feb 28, 2005 at 05:38:24PM -, Tim Anderson wrote:

> > > SELECT Name, Title, Books.ID, Authors.ID FROM Books inner 
> > join Authors 
> > > on Books.AuthorID = Authors.ID ORDER BY Authors.Name, Books.Title;

> Not quite. You wanted the column called "Books.ID" so that was
> specified. Is it unreasonable to then expect to retrieve it as

No, it was not.  THIS specifies that you want the column name to be
"Books.ID":

  select Books.ID as "Books.ID" from ...

I suspect that is per the SQL standard, although I have not checked.
Note that "." is not normally allowed in column names so you have to
surround it in double quotes.  Btw, I haven't tried this in SQLite but
that's how it works in Oracle, e.g.:

  SQL> select u.username from user_users u; 
  SQL> select u.username as "u.username" from user_users u; 
  SQL> select u.username as u.username from user_users u; 
  select u.username as u.username from user_users u 
  ERROR at line 1: 
  ORA-00923: FROM keyword not found where expected 
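The same quoted-alias trick works in SQLite; here is a quick check via Python's sqlite3 module (table and data invented for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Books(ID INTEGER)")
con.execute("INSERT INTO Books VALUES (42)")

# Quoting the alias lets the "." survive into the reported column name.
cur = con.execute('SELECT Books.ID AS "Books.ID" FROM Books')
name = cur.description[0][0]   # the column name as reported to the caller
row = cur.fetchone()
```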

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] ticket 1147

2005-02-28 Thread Andrew Piskorski
On Mon, Feb 28, 2005 at 05:05:37PM -, Tim Anderson wrote:

> SELECT Name, Title, Books.ID, Authors.ID FROM Books inner join Authors
> on Books.AuthorID = Authors.ID ORDER BY Authors.Name, Books.Title;
> In this case, the query is unambiguous, but by default Sqlite returns
> the column names as:
> Name
> Title
> ID
> ID

Well, that looks like correct behavior to me.  If you wanted the
column to be called something different you should have done
"... Authors.ID as Authors_ID" or something like that in your query.

So if these various db interface layers need special features from
SQLite in order to rewrite those column names, then it must be because
the users of those interface layers are writing bad queries, queries
that fail to specify the unique column names that the users actually
need or want.  Why is that?  Having the user application correctly
tack on an "as my_col_name" to the appropriate columns in the query is
pretty trivial, so why don't these user applications correctly do
that?  Is it genuinely infeasible for some reason?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Solaris 2.5 compilation

2005-02-11 Thread Andrew Piskorski
On Fri, Feb 11, 2005 at 01:04:40PM -0500, Christopher R. Palmer wrote:
> I had problems compiling the 3.0.8 version of sqlite on my Solaris box. 
> The version of "awk" on Solaris does not support the gsub function which 

What about using nawk or gawk?  nawk at least is in the default
Solaris install, and last time I used Solaris, the provided nawk and
awk definitely WERE different.

At least a few years back, I believe Solaris awk was old awk, while
many (most?) other awks are new awk, and thus equivalent (or at least
more similar) to Solaris nawk.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] SQLite Advocacy

2005-02-02 Thread Andrew Piskorski
On Wed, Feb 02, 2005 at 08:18:21PM +0200, Stanislav Malyshev wrote:
> AP>>:)  Yes.  AFAICT, the only excuse for the existence of MySQL and PHP at
> AP>>all is the ignorance of their creators.  They are poor tools for the

> Thanks for extending horizons of my knowledge.

Why, you're quite welcome.  I'm always happy to engage in intelligent
discussion on interesting topics, and while I can occasionally be
obnoxious, or even wrong, I do strive both to convey a true and
accurate picture of the facts, as I best understand them, and to
better understand them myself.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] excessive malloc() calls

2005-01-10 Thread Andrew Piskorski
On Mon, Jan 10, 2005 at 09:51:50AM -0800, [EMAIL PROTECTED] wrote:

> sqlite3_step() is the function that is making all the excessive calls, one
> call per row fetched.

> about 3000 of these tables. One complete scan of all 3000 companies will
> make roughly  5.5 million malloc() calls!

That seems highly non-optimal.  If there is no other way around that
already present in SQLite, and you don't want to hack one in, then the
obvious step might be to link in one of the several high-performance
malloc replacement libraries out there, e.g. Hoard:

Tcl 8.4.x includes a threaded allocator (which is used by AOLserver),
which maintains its own thread-local pools of memory and calls the
system malloc() underneath only infrequently.  It is not intended as a
drop in replacement for malloc(), but I THINK the various malloc
replacement libraries operate in similar (though more complex) ways.

I'd expect (but have not tested) that these memory pooling techniques
would offer you a speed benefit even when you're using only 1 thread,
as malloc'ing millions of times is usually not a good idea.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] New to SQLite, couple of questions

2004-12-27 Thread Andrew Piskorski
On Mon, Dec 27, 2004 at 08:51:11PM -0800, Roger Binns wrote:

> Why do you need SQLite if you want everything in memory?  All you gain is
> an SQL Parser to operate on the data structures instead of just doing it
> directly yourself.

All?  You gain the relational model, which in some cases could be a
huge win.  I have considered using in-memory SQLite (haven't actually
done it yet) rather than lots of ugly hash tables for exactly that
reason.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] sqlite3_open() exclusive?

2004-12-25 Thread Andrew Piskorski
On Fri, Dec 24, 2004 at 07:32:07PM -0500, John Richard Moser wrote:

> I thought sqlite databases weren't supposed to be opened with two sqlite
> processes at once.  There are unimplemented locking commands due to this
> right?
> I'm bouncing back and forth in my head trying to decide if I should use
> mysql or sqlite to write a package manager.  I'd like to use SQLite

This seems like a rather strange design question to be "bouncing back
and forth" on.  Do you want to use a client server or an embedded
database for your application?  If you want client server, then you
get to choose from MySQL, PostgreSQL, Oracle, etc. etc.  If you want
an embedded database, then you'd be looking at things like SQLite,
Metakit, etc.

The particular or peculiar attributes of each such piece of software
might even lead you to re-consider whether you want client server or
embedded, but simply jumping ahead to "MySQL vs. SQLite" seems decidedly
inappropriate.  Perhaps you have other unmentioned constraints, but I
personally can't think of ANY case where I would conclude, "Oh, I
can't use SQLite for that, so I have to use MySQL." - nor vice versa.

> because it's a single library reliant on pthreads and libc, and thus
> lightweight and potentially ok for embedded systems; worst case, I write
> a RDBMS shell around SQLite especially for the package manager.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Any way of using an implicit BTree ordering in SQLite?

2004-12-21 Thread Andrew Piskorski
On Sun, Dec 19, 2004 at 12:09:08PM +0100, Bertrand Mansion wrote:

> For web applications (sqlite being now the default database for PHP5), 
> COUNT(*)
> performance is more important than INSERTs and DELETEs performance. The 
> obvious

That sounds VERY much like a matter of opinion.  I strongly suspect I
could find plenty of high-powered database backed website developers
who would argue the exact opposite.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Any way of using an implicit BTree ordering in SQLite?

2004-12-15 Thread Andrew Piskorski
On Tue, Dec 14, 2004 at 09:16:35PM -0600, Kurt Welgehausen wrote:
> > ...I'd like to use a table as a "pure" BTree ...
> If you mean a general multi-way B-Tree, I don't think
> there's any practical way in SQL.  If you can use a
> binary tree, there are ways.  The most convenient is
> Joe Celko's method, which he calls nested sets.  The

If that indeed is the type of tool the original poster was asking for,
he may want to look at the "Tree Sortkey" implementation in OpenACS,
for PostgreSQL.  OpenACS uses it for hierarchical SQL queries -
completely different than Oracle's "connect by", but used for much the
same reasons.  They considered Celko's nested sets (and in fact
implemented them for testing, I believe), but ultimately chose the
Tree Sortkey implementation instead.  They've been using it since 2000
or so.
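For reference, Celko's nested-set encoding mentioned above can be sketched in a few lines of Python/SQLite (the tiny tree here is made up): each node stores a (lft, rgt) interval, and a node's descendants are exactly the rows whose intervals nest strictly inside it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tree(name TEXT, lft INTEGER, rgt INTEGER)")
#   root(1,8) -> a(2,5) -> a1(3,4)
#             -> b(6,7)
con.executemany("INSERT INTO tree VALUES (?,?,?)",
                [("root", 1, 8), ("a", 2, 5), ("a1", 3, 4), ("b", 6, 7)])

# All descendants of "a": rows whose interval lies strictly inside a's.
desc = con.execute("""
    SELECT c.name FROM tree c, tree p
    WHERE p.name = 'a' AND c.lft > p.lft AND c.rgt < p.rgt
    ORDER BY c.lft""").fetchall()
```

The trade-off both approaches wrestle with: nested sets make subtree reads one range query, but inserts must renumber intervals, which is part of why OpenACS chose the tree-sortkey scheme instead.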

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] SQLite 3.08 Deadlock in a Linux Multithreaded Multi-Process Application

2004-12-02 Thread Andrew Piskorski
On Thu, Dec 02, 2004 at 11:33:33PM +0200, Ron Avriel wrote:
> Hi,
> I'm using SQLite 3.0.8 on Linux in an Apache module, in a
> multithreaded multi-process configuration.  Occasionally, the database

BOTH multi-threaded and multi-process?  Why are you doing that?  Is
this some sort of special Apache wackiness?  (I am not familiar with

> I'm NOT using transactions.

To be a bit pedantic, yes you are...  You may not be explicitly
starting a transaction, but as with any RDBMS worthy of the name, you
are using them.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] SQLite for large bulletin board systems?

2004-08-27 Thread Andrew Piskorski
On Fri, Aug 27, 2004 at 05:30:40PM -0400, D. Richard Hipp wrote:

> The best design would be to make the application generic so
> that it could use either SQLite or a client/server database.

In an ideal world, yes.  In practice...  My guess is it's probably a
LOT more trouble than it's worth.

SQLite and (for example) PostgreSQL are pretty different.  Much more
different than Oracle vs. PostgreSQL, and those are different enough
(even though they share virtually identical MVCC concurrency models)!

Also, a more hand-wavy argument: If your app is going to maybe -
ever - need the scalability of a client server database like
PostgreSQL, your target audience is likely such that you'll want to
use other features of the client server database as well.

E.g., contrast a stand-alone discussion board on the web page of a
local club, vs. an entire company or university intranet with many
different applications, all integrated.  In the latter case, well, if
you really want that software to be used by and scale to a Fortune 500
company, that audience is also going to want a whole lot of features
that your local stamp collecting club would never care about.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] SQLite for large bulletin board systems?

2004-08-27 Thread Andrew Piskorski
On Fri, Aug 27, 2004 at 03:45:30PM -0500, Larry Kubin wrote:
> Hello everyone. I am interested in creating a PHP/SQLite powered
> bulletin board system similar to phpBB. However, I have read that

In that case, please pay careful attention to the features and UI of
the OpenACS Forums package.  It does have its flaws and lacks, but
it's the only web-based bulletin board software I've ever personally
used that doesn't suck.  (There may be other non-sucky bulletin board
apps, but I've never seen or used them.)

E.g., this old-ish running instance:

Or here's a related code-base (OpenACS 3.x actually, very old) with a
different look and feel - but very similar functionality:

One flaw with those particular versions of OpenACS Forums is that they
only allow two types of input, "plain text" and HTML.  Other OpenACS
applications offer a palette of 3 or 4 standard textual input types:
Plain text, HTML, Preformatted Text, and simplified HTML markup.
Those various input options can be VERY useful for certain classes of
users and types of discussions.

If you want to read some anecdotes from someone else who implemented a
web-based bulletin board (long, long ago), check out "Case 4: The
Bulletin Board", here:

> SQLite is best suited for applications that are mainly read-only
> (because it locks the database during writes). Do you think a SQLite
> powered bulletin board is a bad idea? How would I go about handling

Well, yes.  But only because I think Yet Another stand-alone PHP
bulletin board package is probably a bad idea; nothing to do with
SQLite itself.

> the case where two users are trying to write to the database
> simultaneously?

SQLite has limited write concurrency, but my guess is that the average
website running some bulletin board software will never even come
close to getting enough concurrent user submissions going at once to
cause much trouble due to writes.  SQLite will (unnecessarily)
serialize them all (while PostgreSQL or Oracle would not), but that
should be fine, for most sites.

You MIGHT however have a VERY large number of peak concurrent readers
(Slashdot Effect), so you should think about how to best use SQLite to
ensure that a small number of writers can't starve your thousands of
readers.  That should be doable, one way or another.  E.g., one simple
(not necessarily the best) way might be to simply cache the
highest-hit pages in memory, and only update the cache at most once
every 4 seconds or so.
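
[That simple cache idea might look like the following sketch, using
Python's sqlite3 module.  The names and the 4-second window are
illustrative: readers hit an in-process cache that is refreshed from
SQLite at most once per interval, so a burst of readers costs one
query per interval instead of one query per request.]

```python
import sqlite3
import time

REFRESH_SECS = 4.0
_cache = {"ts": 0.0, "rows": None}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO posts (body) VALUES ('hello')")

def hot_page_rows():
    # Re-query SQLite only when the cached copy is older than the window.
    now = time.monotonic()
    if _cache["rows"] is None or now - _cache["ts"] > REFRESH_SECS:
        _cache["rows"] = conn.execute(
            "SELECT id, body FROM posts ORDER BY id").fetchall()
        _cache["ts"] = now
    return _cache["rows"]

first = hot_page_rows()
conn.execute("INSERT INTO posts (body) VALUES ('world')")
second = hot_page_rows()   # within the refresh window: still the cached copy
print(first == second)     # True
```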

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] bind variables

2004-08-24 Thread Andrew Piskorski
On Tue, Aug 24, 2004 at 03:15:30PM -0400, Andrew Piskorski wrote:

> Btw, I've used these database APIs and know that they all use ':' to
> indicate a named bind variable which then maps to a Tcl variable, in
> very much the same scheme you've explained above:
> - AOLserver's Oracle and PostgreSQL drivers.  (Which is then used and
>   further extended by the OpenACS db api.)
> - nstcl (which wraps Oratcl, Pgtcl, etc.)

Oh, and Oracle's sqlplus interface also uses ':' for bind variables, I
forgot to add that one.

I've no idea whether the SQL standard says anything about bind
variables or not.

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] bind variables

2004-08-24 Thread Andrew Piskorski
On Tue, Aug 24, 2004 at 02:55:51PM -0400, D. Richard Hipp wrote:

> In this way, I get to specify TCL variables directly in the
> SQL statement, not as parameters added to the end.  For
> example:
>db eval {UPDATE t1 SET value=$bigblob WHERE rowid=$id}

Dr. Hipp, this is a great little feature, but is there some reason you
chose to use '$' to indicate a bind variable rather than the more
typical ':' symbol?

At least for Tcl, '$' seems like a particularly poor choice, as it
means you'll have to escape it whenever using double quoted strings in
Tcl.  E.g.:

  db eval "UPDATE t1 SET value=\$bigblob WHERE rowid=\$id  $and_clause"

Is there some advantage to using '$' here that I've missed?

Btw, I've used these database APIs and know that they all use ':' to
indicate a named bind variable which then maps to a Tcl variable, in
very much the same scheme you've explained above:

- AOLserver's Oracle and PostgreSQL drivers.  (Which is then used and
  further extended by the OpenACS db api.)
- nstcl (which wraps Oratcl, Pgtcl, etc.)

I don't know what other db APIs do or don't do with bind variables.
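
[For reference, SQLite's own C API accepts ?, ?NNN, :name, @name, and
the Tcl-style $name parameter forms.  A named-parameter version of the
UPDATE above, sketched with Python's sqlite3 module, which exposes the
? positional and :name named styles:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (value TEXT)")
conn.execute("INSERT INTO t1 (value) VALUES ('old')")  # implicit rowid 1

bigblob, rid = "new contents", 1
# Named bind variables, bound from a mapping:
conn.execute("UPDATE t1 SET value = :bigblob WHERE rowid = :id",
             {"bigblob": bigblob, "id": rid})

value = conn.execute("SELECT value FROM t1 WHERE rowid = ?",
                     (rid,)).fetchone()[0]
print(value)  # new contents
```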

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] locking via windows nfs client

2004-08-20 Thread Andrew Piskorski
On Fri, Aug 20, 2004 at 08:45:06AM -0600, Ara.T.Howard wrote:
> On Fri, 20 Aug 2004, Matt Sergeant wrote:
> >As far as linux -> nfs access goes it all depends on the quality of
> >the NFS implementation, and the kernel drivers you're using. It's
> >not easy to create a stable NFS locking system. We've had lots of
> >problems with it. If you can, go with local disk.
> i cannot - the system is a priority queue system where the queue db
> is accessed via around 30 nodes.  i am wrapping any code that will
> grab a write lock with an nfs safe lockfile creation to ensure only

So, why are using embedded SQLite code on each of the 30 nodes to
access the single database file over NFS?  Isn't your situation
EXACTLY what the client-server RDBMS was invented for?

Why don't you simply run PostgreSQL (or your own little custom RDBMS
using SQLite) on one machine, and have all 30 clients talk to it over
the network?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Views and performances...

2004-08-12 Thread Andrew Piskorski
On Thu, Aug 12, 2004 at 03:38:21PM +0100, Christian Smith wrote:

> AFAIK, views are compiled into their query definitions when executing a
> statement, so once the compilation is done, performance should be
> identical to their raw query form.

> In short, create a view if you have a common filter on some data. If
> performance is a factor, precompile the view.

At least in Oracle and PostgreSQL, the problem with views is that if
you then re-use them in a complicated query, you MAY get much worse
performance than if you wrote the query against all the underlying
tables directly.  This is because:

1. Sometimes the query optimizer is just dumb about optimizing queries
containing views.  (E.g., doing joins against views in Oracle,

2. Sometimes it's not the query optimizer's fault at all; by using the
view in your query, you actually end up asking the RDBMS for something
subtly different, which (although the query results may happen to be
the same) makes your query legitimately much slower.  Tuning your
query to fix this may require eliminating use of the view.

SQLite's query optimizer is presumably much simpler than that of
either Oracle or PostgreSQL, so it probably shares these sorts of
problems.  But views are still good.  Generally, I'd only remove use
of a handy view once you see a real performance problem (a slow query
with a bad query plan).
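
[One practical way to spot the bad-plan case is to compare the plans
SQLite produces for the view-based query and the hand-written one; a
sketch using EXPLAIN QUERY PLAN via Python's sqlite3 module (table,
index, and view names are made up):]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, cust INTEGER)")
conn.execute("CREATE INDEX orders_cust ON orders(cust)")
conn.execute("CREATE VIEW v_orders AS SELECT id, cust FROM orders")

# Ask SQLite how it would execute the query through the view; the plan
# rows show whether the underlying index is still being used.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM v_orders WHERE cust = 7").fetchall()
for row in plan:
    print(row)
```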

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] Cloudscape?

2004-08-03 Thread Andrew Piskorski
Anyone know much of anything about IBM's Cloudscape database?
Advantages or disadvantages vs. SQLite?

I hadn't heard of it before, so I'm curious.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] any webhost that supports PHP and SQLite?

2004-07-22 Thread Andrew Piskorski
On Thu, Jul 22, 2004 at 10:16:34AM -0400, gohaku wrote:

> but I am looking for a webhost that supports PHP and SQLite.
> Is there such a webhost or will I have to wait until some webhosts 
> upgrades to PHP5?

I assume you are looking for inexpensive hosting on a shared machine,
but since you are doing a database backed website, want as much power
and flexibility as you can get for a modest amount of money.  In that
case, generally the best choice is one that lets you install and use
any software you want, and ideally gives you root access via something
like FreeBSD's jail facility, or the equivalent Linux kernel patch.

Acorn Hosting offers exactly that (and has gotten good reviews), some
of the others listed here do too:

Those folks are mostly interested in OpenACS, AOLserver, Tcl, and
PostgreSQL, of course, not Apache, PHP, or SQLite.  So all else being
equal, you might be a bit better off with an otherwise similar hosting
provider who is more focussed on the toolset that you want to use.
That really doesn't matter much though, because with a virtual server
arrangement, you should be able to do anything you want.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] unsolved RDBMS problems Re: [sqlite] vers 3.0 concurrency issues

2004-05-06 Thread Andrew Piskorski
On Thu, May 06, 2004 at 03:20:13PM -0400, Andrew Piskorski wrote:

> - User defined types, aka good "object" support (Date's "Third
> Manifesto").
> - Native bi-temporal support, or even just good support for one of
> valid-time or transaction-time (Snodgrass).  This one in particular I
would definitely have used if it were available.  (This is probably
> related to supporting other dimensions well, like time.)  See also:

[Oops, that parenthetical comment about "other dimensions" belonged in
the first bullet point further above about type extensability, not in
the second one on bi-temporal support.]

Andrew Piskorski <[EMAIL PROTECTED]>

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[sqlite] unsolved RDBMS problems Re: [sqlite] vers 3.0 concurrency issues

2004-05-06 Thread Andrew Piskorski
On Thu, May 06, 2004 at 01:21:28PM -0500, Puneet Kishor wrote:

> Frankly, I am not sure if there is anything exciting left in relational 
> databases to discover or create... most has been created and 
> well-tested over the past 3 decades. What is left is making a tool 

No way, that is not true!  Certainly both C.J. Date and Rick Snodgrass
would both vehemently disagree, although probably for different
reasons.  Some concrete examples:

- User defined types, aka good "object" support (Date's "Third
Manifesto").

- Native bi-temporal support, or even just good support for one of
valid-time or transaction-time (Snodgrass).  This one in particular I
would definitely have used if it were available.  (This is probably
related to supporting other dimensions well, like time.)  See also:

- Schema version control!  CVS does an adequate job of automatically
diffing and merging text files, but leaves lots of room for
improvement (witness BitKeeper, Arch, all the commercial systems,
etc.).  AFAIK there is *NO* tool, even at the CVS level, available for
versioning and converting relational schemas ** which contain data **.
If your database is empty, changing your schema is trivial.  If it is
full of data, making non-trivial modifications to your schema can
quickly become very, very obnoxious.

My guess is that 80% of the schema upgrade effort is automatable, but
it has not been automated.  (It can't be fully automated because when
altering tables and re-factoring data, the schema patch tool can't
always know where you want the data to move from/to, and you the
programmer need to explicitly tell it.)  (I think I heard that
Toad may help somewhat with that sort of thing for Oracle, but I never
looked into it.)

- Replication, scalability.  It sure would be nice to just run a big
huge RDBMS on a bunch of cheap 1U Linux boxes rather than a single big
expensive SMP box, and just rack and stack a few more 1U boxes each
time your database or user load gets bigger.  Depending who you ask,
this seems to currently either be not feasible at all, or so
complicated and with such poor scalability that hardly anyone even
tries it.  See also:

There are probably many other examples of significant unsolved RDBMS
problems, too.
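
[On the schema-upgrade point above: the usual shape of a hand-written
upgrade step in SQLite is create-copy-drop-rename, done inside one
transaction so a failure leaves the schema untouched.  A sketch with
illustrative table and column names, using Python's sqlite3 module:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# One transaction for the whole upgrade step: add an email column by
# rebuilding the table, since not every change fits ALTER TABLE.
with conn:
    conn.execute("""CREATE TABLE users_new (
                        id INTEGER PRIMARY KEY,
                        name TEXT NOT NULL,
                        email TEXT DEFAULT '')""")
    conn.execute("INSERT INTO users_new (id, name) SELECT id, name FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")

cols = [r[1] for r in conn.execute("PRAGMA table_info(users)")]
n = conn.execute("SELECT count(*) FROM users").fetchone()[0]
print(cols, n)  # ['id', 'name', 'email'] 2
```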

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] vers 3.0 concurrency issues

2004-05-06 Thread Andrew Piskorski
On Thu, May 06, 2004 at 01:21:28PM -0500, Puneet Kishor wrote:

> they are as real a database as one wants them to be. Sure, they don't 
> support ACID compliance, but I am not sure if they are created by 

Ugh, that particular argument is one I should not have started.  My
apologies to all, and let's just let that one lie.

> I have recently created a pretty useful app for a client using
> Perl/DBI, DBD-ODBC and Access. Given how fluid the design was
> throughout the development, I simply could not have done so with
> SQLite because of the lack of an ALTER command.

What makes you so dependent on "alter table" in your development?

I've done a lot of data modeling work in Oracle, but have only rarely
used "alter table" by choice.  When I did use it, it was generally
carefully tested upgrade scripts which I needed to apply to an
existing instance already in Production.  When loading test data into
a Development instance, I would generally prefer to do that completely
from scratch - run all my create scripts, then separately load data
in.  No alters involved.

Hm, however, a different scenario is if you are copying data from
Production to a Dev instance for development or testing purposes.  In
that case, dumping/exporting Production and loading everything right
back verbatim into Dev is certainly the easiest way to go, and once
you have that in Dev, you ARE going to want to use a lot of "alter
table" statements.
Is that the situation you find yourself in?  I.e., you are not
developing something new from scratch, but are making extensive
changes to a system already in heavy Production use?

The "export out of Prod, import into Dev" method has many advantages,
especially for testing.  Its drawback is that it's easy to quickly
lose all ability to create your data model from scratch...
Unfortunately there seems to be no way to have the best of both
worlds.  Either you put the extra effort into maintaining SEPARATE
create scripts and upgrade scripts, or you lose the ability to do one
or the other at all.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] vers 3.0 concurrency issues

2004-05-06 Thread Andrew Piskorski
On Thu, May 06, 2004 at 06:24:10PM +0100, Steve O'Hara wrote:

> However, I'm wondering why we're comparing SQLite with kernel based RDBMS
> like Oracle etc, and not with it's more closely related cousins such as
> Access ?

In my case, because I am very familiar with Oracle, somewhat less so
with PostgreSQL, and much less familiar with most other databases.
Also, because databases like Oracle and PostgreSQL are the current
gold standard for functionality.  Why compare to anything less?

> In this regard, how does SQLite compare with Access/FoxPro/Paradox/dBase ?

Because none of those are Real Databases?  ;)  [Joking, joking...]

Well, MOSTLY joking.  Actually, I've no idea to what extent those each
qualify as "Real Databases" (aka, full ACIDity, featureful, designed
and implemented by folks who have a clue, etc.).

However, I am reasonably sure that they're neither as powerful as
PostgreSQL nor as small and simple as SQLite, and that none of them
are Open Source.  Therefore, they are not particularly interesting -
not to me anyway.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] vers 3.0 concurrency issues

2004-05-06 Thread Andrew Piskorski
On Thu, May 06, 2004 at 09:54:24AM -0400, D. Richard Hipp wrote:

> Concurrency is not nearly as much an issue in reality
> as it is in many peoples imagination.  Concurrency
> probably is not an issue for a website. If concurrency
> really is an issue, you need a client/server database.

While that might be true [I take no position on whether it is or not],
to some real extent it may also be irrelevant.  Here's my theory:

People often don't know ahead of time, and aren't able to reasonably
predict, just how much concurrency they will need.  Also, their
concurrency needs may change over time, again in ways that are hard
for them to predict.

Therefore, picking a tool which they KNOW has serious concurrency
limitations introduces a significant risk, the risk that they may have
to drastically overhaul their underlying tools and architecture at
some unknown point in the future, due to hitting concurrency problems
and thus suddenly discovering that they're now using the wrong tool
for the job.

Since SQLite's concurrency support is very limited, potential users
are forced to prematurely optimize:  Either believe you will need high
concurrency, and go with PostgreSQL or Oracle, or believe you will
need no or low concurrency, and go with SQLite.  Premature
optimization is a bad thing.

There is a missing middle ground there: A medium-concurrency SQLite.

If a future version of SQLite - let's call it "SQLiteMC" - were able
to significantly increase concurrency, while at the same time
preserving all or most of SQLite's historical advantages (simplicity,
etc.), this would significantly expand the problem domain to which
SQLite is applicable.  That, in turn, would reduce the "wrong tool"
risk of using SQLite, and both of those effects would tend to make
SQLiteMC both more useful and more widely used than SQLite is now.

That's my little mental model anyway.  To me the direction of those
effects seems just about certain, but what I don't know - and can't
know, much less quantify - is the size of those effects - they could
be anywhere from zero to large.  And of course, that's assuming that
SQLiteMC succeeds in its technical goals.

And would SQLiteMC succeed in its technical goals?  How hard would it
be, and are there sufficient programmer resources available to make it
happen?  That I don't really know either, but together, Dr. Hipp plus
a few other experts on this list could probably come up with a
reasonably good idea...

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Updating with a Cursor

2004-04-18 Thread Andrew Piskorski
On Sun, Apr 18, 2004 at 11:04:14AM -0700, Chris Waters wrote:

> The idea is that I can iterate through a set of records with the cursor,
> updating as I go. The challenge is working out how to implement the
> UpdateRow() function. I am happy to accept the constraint that the query must

> Implementing another layer of SQL parsing is less than ideal since I have
> found that the parser already dominates sqlite performance for my tests
> (in-memory DB with many different queries using indexes).

Why do you want to write special C cursor code rather than just using
a single SQL update statement?  Performance?  If so, maybe it would be
better to create a way to just cache and re-use the SQL query plan,
rather than throwing out the use of SQL altogether?
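
[The "cache and re-use the query plan" idea in miniature, sketched
with Python's sqlite3 module: the connection keeps a per-connection
statement cache, and executemany() compiles the UPDATE once and
re-binds parameters for each row, rather than re-parsing the SQL per
row -- which is exactly the parser overhead mentioned above.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany("INSERT INTO t (id, val) VALUES (?, ?)",
                 [(i, 0) for i in range(100)])

# One parse, many executions: the UPDATE is compiled once and
# re-executed with fresh bindings for each parameter tuple.
conn.executemany("UPDATE t SET val = ? WHERE id = ?",
                 [(i * 2, i) for i in range(100)])

total = conn.execute("SELECT sum(val) FROM t").fetchone()[0]
print(total)  # 9900 == sum of 0, 2, 4, ..., 198
```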

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Updating with a Cursor

2004-04-18 Thread Andrew Piskorski
On Sun, Apr 18, 2004 at 11:04:14AM -0700, Chris Waters wrote:

> I am building a C wrapper for sqlite that can abstract away the sqlite API

Why?  Are you trying to make it easier to port your application to
multiple databases?  If so you should look at the ns_db database API
from AOLserver.  It has had a standard API with drivers for many
different databases for a long, long time now (since 1996 or so).

With AOLserver 4.0, it recently became much more feasible to take the
actual ns_db C code and re-use it outside of AOLserver (e.g., by
loading it into a plain tclsh or the like), but AFAIK no one has
actually done so yet.  (It is on my list of things to do eventually,
but I've had no immediate need.)  However, nstcl already re-implements
ns_db without needing any of the AOLserver C code at all, and already
includes SQLite support.

Note however that nstcl implements the generic ns_db API in the Tcl
layer, so if you need a single consistent API in the underlying C code
rather than just in Tcl, then you'll probably need to look into using
the actual AOLserver C code ns_db implementation.  As a quick hack you
could use nstcl and make calls from C to Tcl, but that would be kind
of ugly, and would probably have noticeably poorer performance than
calling the C implementation directly.  (If you're doing it in an
inner loop, the overhead of going from C to Tcl and back to C can be
quite substantial.)

In addition to the nstcl SQLite support, there also seems to be a
native AOLserver SQLite driver (beta, from 2003), but for whatever
reason it's not available in the AOLserver CVS, you have to download
it from its author's own website.

> BTW, performance is my primary goal, ease of use is second.

Isn't performance as your primary goal rather contradictory to
wanting any API abstraction layer at all?  Do you have a real
requirement for a wrapper API (e.g., in order to help support multiple
databases), or is this just an extra Nice to Have feature for you?

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] XML translation inquiry

2004-04-15 Thread Andrew Piskorski
On Thu, Apr 15, 2004 at 02:37:13PM -0400, rich coco wrote:

> >>the goal would be to abstract out of the application the entire
> >>DB schema and allow the application to work exclusively with XML -
> >>and the generated methods - wrt not only data representation and
> >>transport but data storage and retrieval too. (ie, the application need
> >
> >And WHY would you want to do that exactly?  To me it sounds like an
> >excellent strategy to recommend to a competitor.

> please tell me in why it's a bad idea.

It's not that I know it to be a bad idea - I don't.  Perhaps it's
actually a very nice idea.  But without knowing more, it sounds like a
good way to make things a lot more complicated than they need to be
for no gain.  If you're using a RDBMS, why not just use the RDBMS?
Generating some sort of limited XML format for sending over the wire
to some remote application, sure I can see the value of that.  But
converting your entire relational schema to XML for internal use in
your own program?  Why?  What can that possibly buy you over just
using the relational schema directly?

I mean, heck, the relational model was invented for a reason, why
would you want to transform it to a hierarchical XML model and have your
program manipulate that?  And if (for some reason I can't fathom) what
you really want is some sort of tree-structured XML schema, then why
not use a native object or XML database or persistent storage API
intended for that, rather than repeatedly converting between the RDBMS
and XML?

If you tell me you're sure there is some good reason I may be prepared
to believe you, but offhand I can't think of one myself.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] XML translation inquiry

2004-04-15 Thread Andrew Piskorski
On Thu, Apr 15, 2004 at 01:47:01PM -0400, rich coco wrote:

> the goal would be to abstract out of the application the entire
> DB schema and allow the application to work exclusively with XML -
> and the generated methods - wrt not only data representation and
> transport but data storage and retrieval too. (ie, the application need

And WHY would you want to do that exactly?  To me it sounds like an
excellent strategy to recommend to a competitor.

But if you really want to do it, Red Hat CCM/RHEA (open source, in
Java) does include some sort of object-relational mapping layer:

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Effectiveness of PRAGMA integrity_check;

2004-04-15 Thread Andrew Piskorski
On Thu, Apr 15, 2004 at 08:33:14AM -0500, Greg Miller wrote:

> support that. The FreeBSD folks tried to solve this by turning off write 
> caching by default. Unfortunately, this hurt performance so much they 
> had to turn it back on and just recommend SCSI drives for important data.

Why, do SCSI drives all come with battery-backed cache?  (So when you
power them up again they complete the old cached write.)  I didn't
think so, but would be pleased to learn otherwise...

Is there any device available which basically serves as an add-on UPS
for hard drives?  Something you'd just plug all your IDE drives into,
and it would keep the drives going just long enough after a power fail
to finish writing their cached data and spin down cleanly?  AFAIK no
such device is available, but if reasonably priced it sure would be
nice to have.

Of course, with a good UPS *AND* the proper software running to react
to signals from the UPS, you get that sort of protection for free, and
you certainly want the system UPS anyway.  But that's also much more
complicated and vulnerable to failures due to misconfigured software,
so it'd sure be nice to have the hard-drive-UPS as well.

Andrew Piskorski <[EMAIL PROTECTED]>

Re: [sqlite] Concurrency, MVCC

2004-04-15 Thread Andrew Piskorski
On Thu, Apr 15, 2004 at 02:16:01PM +0100, Christian Smith wrote:
> Right tool for the job. Multiple writers has client/server database
> written all over it. KISS.

No, not true, at least not when the multiple writers are all threads
within one single process, which appears to be the common case for
people who'd like greater concurrency in SQLite.

Also, if multiple writers worked well for the one-process many-threads
case, then if you wished you could write a small multi-threaded
client/server database using SQLite as the underlying storage engine.
As things stand now, the concurrency limitations mean there isn't much
point to doing that.

Simplicity however, is of course an important concern.

Andrew Piskorski <[EMAIL PROTECTED]>

[sqlite] Concurrency, MVCC

2004-04-13 Thread Andrew Piskorski
communication in AOLserver.  (Mutex locking is automatic.  Nsvs are
assigned to buckets with one mutex lock per bucket, and the number of
buckets is tunable.)

Using nsvs worked for me, but being limited to only key/value pairs
was UGLY.  Much of the housekeeping data had 1-to-many or even
many-to-many mappings (e.g., each "query" might have many
"request_ids"), and being limited to key/value pairs was both painful
and the source of some bugs.  And I believe that the pain and bugs
could have been a lot worse if my program and its housekeeping data
had been more complicated.

What I REALLY wanted was an in-memory relational database, but I
didn't have one.  When I need on-disk persistence I'd normally use
Oracle or PostgreSQL, but certainly it'd be convenient if I could use
the same lightweight tool both in-memory and on-disk.

So there's a hole in my toolkit, and I'm looking for software to fill
it.  For that, SQLite seems to have the following important

- Relational, transactional, ACID.
- Simple and very high quality code (so I am told; I have not yet read
  any sources).
- Option of running either in-memory or on-disk.
- Pretty good support for SQL features, joins, etc.  (No correlated
  subqueries, but I can live with that.)
- Thread safe. 
- Easily embeddable anywhere C is.

For my needs SQLite seems to have only one problem:  Both readers and
writers must lock the whole database.  Unfortunately, that's a big
limitation.

Maybe, in that one particular application I would have been able to
get away with using a single thread for all SQLite access, and having
all other threads talk to that one db thread.  I have not measured
SQLite performance and concurrency, nor compared it to nsv/tsv, so I
don't really know, and I would certainly want to make those tests
before hacking on any SQLite code.
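
[The "one db thread, all other threads talk to it" arrangement can be
sketched as follows, using Python's sqlite3 with a work queue (names
are illustrative): a single thread owns the connection and drains
(sql, params, reply) jobs, so writers never contend on the database
lock.]

```python
import queue
import sqlite3
import threading

jobs = queue.Queue()

def db_thread():
    # The connection lives entirely inside this one thread.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (msg TEXT)")
    while True:
        item = jobs.get()
        if item is None:               # shutdown sentinel
            break
        sql, params, reply = item
        reply.put(conn.execute(sql, params).fetchall())
    conn.close()

t = threading.Thread(target=db_thread)
t.start()

# Any other thread submits work and waits for the answer:
reply = queue.Queue()
jobs.put(("INSERT INTO log VALUES (?)", ("hello",), reply))
reply.get()                            # wait for the insert to finish
jobs.put(("SELECT count(*) FROM log", (), reply))
count = reply.get()[0][0]
jobs.put(None)
t.join()
print(count)  # 1
```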

But it seems clear that whatever the exact numbers are, SQLite MUST
have much lower concurrency than either Tcl's simple and lightweight
nsv/tsv key value pairs or heavier weight databases like Oracle or
PostgreSQL - and that limits the number of places I'd be able to use

Solutions other than SQLite:

One possible alternative to SQLite is Erlang's Mnesia RDBMS:

It also works both in-memory and on-disk, apparently already has high
concurrency, and can be distributed.  I have not (yet) investigated
it, and I could not tell from its docs what locking model it uses, how
powerful its query language is, etc.  But most importantly, it
definitely requires a running Erlang interpreter to work at all, and
it seems very unclear whether or how well any sort of access from
non-Erlang environments would work at all.

On this list, Mark D. Anderson <[EMAIL PROTECTED]> recently
recommended looking at the "HUT HiBASE/Shades" papers:

The fact that Nokia funded and then canceled the HiBase project is
interesting, as Ericsson developed Erlang, which sounds similar in
some ways.

Unfortunately, it sounds as if the HiBASE project is dead and the code
was never finished.  It also appears to be strictly a main-memory
database, so I'm not sure how applicable any of that work would be so
a database like SQLite (or Mnesia for that matter) which may be used
either in-memory or on-disk.

Anyone here aware of any other good alternatives?

Andrew Piskorski <[EMAIL PROTECTED]>
