Thanks a lot Mike for your detailed answer! I've just read your acticle.
Do you know where I can get the files "fs.c" and "example.c" you mention? I've 
been on the FTP (available from www.ddj.com/code/) but I was not able to locate 
any file related to your article :( Could you please send them to me by email 
if they are no longer available online? (genio570 -AT- hotmail DOT fr).
 
It seems like I will be able to further develop the system I described in my 
previous post with the information you mentioned.
 
Another quick question: I planned to develop my own indexing algorithm on the 
"external database" side. But do you think I could also consider (as another 
option) using the "CREATE INDEX" sqlite command (involving of course a complete 
scan of the external table's column and then storing the index inside the 
sqlite file)? What would you recommend me to do?
 
In a more general way, I wonder which sqlite SQL commands can involve virtual 
tables and which cannot or should not (for instance for performance reasons, 
etc.).
 
Thanks again for your help,
aladdin> Date: Tue, 1 Apr 2008 20:17:54 -0500> From: [EMAIL PROTECTED]> To: 
sqlite-users@sqlite.org> Subject: Re: [sqlite] Virtual table used to query big 
external database> > The main reason why SQLite's practical limit is in the 10s 
of GBs as> opposed to TBs (theoretical) is due to how it tracks dirty pages. 
This> is described in the "Appropriate Uses" page> 
(http://www.sqlite.org/whentouse.html) but I'll rehash it here for> 
convenience. SQLite tracks dirty pages with a bitmap which is> allocated before 
each transaction. The size of the bitmap is> proportional to the size (not in 
rows but in pages) of the database> (256 bytes for every 1Mb of database), so 
as the database grows, the> amount of memory allocated before each transaction 
grows. When you get> into the GB range, you are starting to allocate in the MB 
range of> dirty page map memory per transaction, which starts to take its toll> 
on performance.> > I could be wrong, but from what I know about virtual tables, 
there is> no such correlation between virtual table size and the dirty page> 
bitmap, as SQLite has no idea how big a virtual table is, nor does it> manage 
the data within the vtable. Furthermore, all SQLite really does> in a SELECT 
statement on a vtable is call your code to iterate over> it. So really the only 
performance issue is how long it takes your> code to iterate over your vtable. 
Thus, your table could be in the TB> range, and as long as you are fine with 
iterating over its contents,> there is no additional performance issues to 
speak of. There are ways> to implement virtual tables such that you can limit 
how much of the> table is scanned for certain queries, avoiding having to scan 
the> whole thing every time. I wrote an article that touches on this using> the 
match() function. Its available online:> > 
http://www.ddj.com/database/202802959> > IMO, virtual tables are one of the 
most powerful and unique features> of SQLite. There is a bit of a learning 
curve, but it's amazing what> you can do with them. It sounds like you going to 
have to iterate over> your external table one way or the other. I see no reason 
why the> vtable approach would be any slower than any other approach that> 
iterates over the data.> > Having said that, while iterating over a large 
vtable is not a big> deal (as your program will just step through it one row at 
a time),> you need to be careful about getting too fancy with your SQL as you> 
may end up triggering a lot of background IO. For example, if you tack> on an 
ORDER BY which sorts one of the columns of your vtable, SQLite> will end up 
essentially copying the vtable contents into a temporary> file and sorting it, 
which may or may not be a strain on your system> depending on how big your 
table is (e.g. your vtable is 30Gb and your> /tmp folder is on a 10Gb 
partition). So think through what you are> doing when going beyond a simple 
SELECT * from big_vtable.> > -- Mike> > On Tue, Apr 1, 2008 at 3:12 PM, Aladdin 
Lampé <[EMAIL PROTECTED]> wrote:> >> > Hi all!> >> > Very often, when people 
ask this list why they have trouble managing in sqlite a "big table" (50 
million lines or more than 10 Go), they are told that sqlite is an embedded 
database and is not meant to be used for very big databases/tables.> >> > I'm 
currently in the process of designing a specific, read-only, sqlite "virtual 
table" in order to enable sqlite to access data stored in an external database 
which is specially designed to handle very big tables.> >> > My final objective 
is to be able to easily query a big external table (stored in another database) 
through the - excellent - sqlite interface.> >> > Now I have this terrible 
doubt: will the existing sqlite "limitations" for big sqlite tables also apply 
to my read-only virtual tables?> >> > Thus... am I currently losing my time 
developing such a "virtual table" with this objective in mind? Or is there a 
better way to achieve my objective?> >> > Thank you for your help!> >> >> > 
_________________________________________________________________> > Votre 
contact a choisi Hotmail, l'e-mail ultra sécurisé. Créez un compte gratuitement 
!> > http://www.windowslive.fr/hotmail/default.asp> > 
_______________________________________________> > sqlite-users mailing list> > 
sqlite-users@sqlite.org> > 
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users> >> 
_______________________________________________> sqlite-users mailing list> 
sqlite-users@sqlite.org> 
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_________________________________________________________________
Avec Hotmail, vos e-mails vous suivent partout ! Mettez Hotmail sur votre 
mobile !
http://www.messengersurvotremobile.com/?d=hotmail
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to