Thanks a lot Mike for your detailed answer! I've just read your acticle.
Do you know where I can get the files "fs.c" and "example.c" you mention? I've
been on the FTP (available from www.ddj.com/code/) but I was not able to locate
any file related to your article :( Could you please send them to me by email
if they are no longer available online? (genio570 -AT- hotmail DOT fr).
It seems like I will be able to further develop the system I described in my
previous post with the information you mentioned.
Another quick question: I planned to develop my own indexing algorithm on the
"external database" side. But do you think I could also consider (as another
option) using the "CREATE INDEX" sqlite command (involving of course a complete
scan of the external table's column and then storing the index inside the
sqlite file)? What would you recommend me to do?
In a more general way, I wonder which sqlite SQL commands can involve virtual
tables and which cannot or should not (for instance for performance reasons,
etc.).
Thanks again for your help,
aladdin> Date: Tue, 1 Apr 2008 20:17:54 -0500> From: [EMAIL PROTECTED]> To:
sqlite-users@sqlite.org> Subject: Re: [sqlite] Virtual table used to query big
external database> > The main reason why SQLite's practical limit is in the 10s
of GBs as> opposed to TBs (theoretical) is due to how it tracks dirty pages.
This> is described in the "Appropriate Uses" page>
(http://www.sqlite.org/whentouse.html) but I'll rehash it here for>
convenience. SQLite tracks dirty pages with a bitmap which is> allocated before
each transaction. The size of the bitmap is> proportional to the size (not in
rows but in pages) of the database> (256 bytes for every 1Mb of database), so
as the database grows, the> amount of memory allocated before each transaction
grows. When you get> into the GB range, you are starting to allocate in the MB
range of> dirty page map memory per transaction, which starts to take its toll>
on performance.> > I could be wrong, but from what I know about virtual tables,
there is> no such correlation between virtual table size and the dirty page>
bitmap, as SQLite has no idea how big a virtual table is, nor does it> manage
the data within the vtable. Furthermore, all SQLite really does> in a SELECT
statement on a vtable is call your code to iterate over> it. So really the only
performance issue is how long it takes your> code to iterate over your vtable.
Thus, your table could be in the TB> range, and as long as you are fine with
iterating over its contents,> there is no additional performance issues to
speak of. There are ways> to implement virtual tables such that you can limit
how much of the> table is scanned for certain queries, avoiding having to scan
the> whole thing every time. I wrote an article that touches on this using> the
match() function. Its available online:> >
http://www.ddj.com/database/202802959> > IMO, virtual tables are one of the
most powerful and unique features> of SQLite. There is a bit of a learning
curve, but it's amazing what> you can do with them. It sounds like you going to
have to iterate over> your external table one way or the other. I see no reason
why the> vtable approach would be any slower than any other approach that>
iterates over the data.> > Having said that, while iterating over a large
vtable is not a big> deal (as your program will just step through it one row at
a time),> you need to be careful about getting too fancy with your SQL as you>
may end up triggering a lot of background IO. For example, if you tack> on an
ORDER BY which sorts one of the columns of your vtable, SQLite> will end up
essentially copying the vtable contents into a temporary> file and sorting it,
which may or may not be a strain on your system> depending on how big your
table is (e.g. your vtable is 30Gb and your> /tmp folder is on a 10Gb
partition). So think through what you are> doing when going beyond a simple
SELECT * from big_vtable.> > -- Mike> > On Tue, Apr 1, 2008 at 3:12 PM, Aladdin
Lampé <[EMAIL PROTECTED]> wrote:> >> > Hi all!> >> > Very often, when people
ask this list why they have trouble managing in sqlite a "big table" (50
million lines or more than 10 Go), they are told that sqlite is an embedded
database and is not meant to be used for very big databases/tables.> >> > I'm
currently in the process of designing a specific, read-only, sqlite "virtual
table" in order to enable sqlite to access data stored in an external database
which is specially designed to handle very big tables.> >> > My final objective
is to be able to easily query a big external table (stored in another database)
through the - excellent - sqlite interface.> >> > Now I have this terrible
doubt: will the existing sqlite "limitations" for big sqlite tables also apply
to my read-only virtual tables?> >> > Thus... am I currently losing my time
developing such a "virtual table" with this objective in mind? Or is there a
better way to achieve my objective?> >> > Thank you for your help!> >> >> >
_________________________________________________________________> > Votre
contact a choisi Hotmail, l'e-mail ultra sécurisé. Créez un compte gratuitement
!> > http://www.windowslive.fr/hotmail/default.asp> >
_______________________________________________> > sqlite-users mailing list> >
sqlite-users@sqlite.org> >
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users> >>
_______________________________________________> sqlite-users mailing list>
sqlite-users@sqlite.org>
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_________________________________________________________________
Avec Hotmail, vos e-mails vous suivent partout ! Mettez Hotmail sur votre
mobile !
http://www.messengersurvotremobile.com/?d=hotmail
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users