Ben,
I think you're asking for is a mechanism to define the nature of the primary
key.
e.g. If you could define the primary key as some kind of function or to use
a specific integer field, then the table will be naturally ordered by that
field.
I don't know about other SQL engines, but our baby (BASIS) allows you to
create your primary key like this - this gives us tremendous performance
improvements for some apps. For instances, we have some clients who store
newspaper articles from various publications in a table. The natural way in
which users want to see these articles is sorted by the published date,
latest first - that's fine, but these tables contain 40 million articles and
their queries can produce pretty large sets. We create a primary key based
on the published date which means we don't need to sort their queries.
Just a thought
Steve
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: 08 April 2004 04:05
To: [EMAIL PROTECTED]
Subject: [sqlite] Clustered indicies Was: [sqlite] A proposal for SQLite
version 3.0
Jeff,
Jeff Pleimling <[EMAIL PROTECTED]>
08/04/2004 12:42 PM
To: [EMAIL PROTECTED]
cc:
Subject:Re: [sqlite] A proposal for SQLite version 3.0
At 12:08 PM 4/8/2004 +1000, [EMAIL PROTECTED] wrote:
> I believe you're thinking of a 'clustered index'. This puts the data
> into the order of the index. There can be, of course, only one clustered
> index per table.
> Since at least some of the data in the table is moved around on every
insert,
> regular indexs need to take this into account (usually by indirection,
rather
> then modifying all of the indexes with each insert).
Ahh, I didn't think of that. I don't have any other indices on my table,
so this wouldn't be a problem for me... but I can see now how it would
harm the general case. I guess the simplest implementation would have a
proabition on having -any- other indicies on the table.
> >If a table could be ordered according to an index, rather than having
an
> >external index, I think it would significantly improve the time and
space
> >performance of my databases. I don't know whether my experience would
be
> >shared by other users. It it were something that could go into 3.0 it
> >would at least do me some good.
> Clustered indexes can really slow the performance for OLTP (On-Line
> Transaction Processing) and other systems where data is added/deleted in
a
> mixed fashion. Every time a record is inserted, data is possibly moved
on
> the disk (with page splits causing even more slowdowns).
Yes, that's what's happening already in the index whenever I do an insert.
My thinking was that maintaining an index-ordered table would be less work
overall than maintaining a table with an ordered index. I could be wrong
on that, but I'm not sure I see the flaw in my logic.
> If your system is entirely historic data, that would be great - but if
your
> system is inserting meter readings in (near) real-time, you'd probably
> get a big performance hit.
It's certainly real-time, with the occasional query.
> There are many pros and cons. A google search turns up articles (usually
for
> MS SQL Server) on both side - some people saying 'always' and some
'never'.
I can certainly see how the usefulness of this feature could be limited.
I guess the problem is primarily related to how the rowid is chosen. If I
could choose a rowid that would put the table in the correct order, and
renumber as required I might make some ground. I could order by ROWID,
then... although there may have to be some kind of code tweakage to allow
the where clauses to operate correctly.
Hmm... maybe something like this:
BEGIN TRANSACTION;
SELECT * FROM mytable WHERE
-- if the insertion point is after current data:
INSERT INTO mytable VALUES (MAX(ROWID) + 10, )
-- else if insertion point is between two values
INSERT INTO mytable VALUES (( + )/2, )
-- else rearrange contiguous values
UPDATE mytable SET ROWID = ROWID+1 WHERE ROWID >= AND
ROWID <
INSERT INTO mytable VALUES (, )
END TRANSACTION;
Perhaps the changes to sqlite could be as minimal as providing a facility
to say:
"I promise to keep these rows in an order consistent with this index"
so sqlite will use the index in queries.
Benjamin.
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]