Here's a link to the source:

ftp://66.77.27.238/sourcecode/ddj/2007/0712.zip

BTW, I totally missed the new bitvec recently introduced in the latest
version to track dirty pages. This pretty much removes the previous
constraints I mentioned on large databases, reducing memory
consumption to proportional to transaction size (actual dirty pages).
So this would seem to raise the practical limits significantly,
perhaps into the TBs for many operations (e.g. SELECTs).

-- Mike

On Wed, Apr 2, 2008 at 3:57 AM, Aladdin Lampé <[EMAIL PROTECTED]> wrote:
>
>  Thanks a lot Mike for your detailed answer! I've just read your acticle.
>  Do you know where I can get the files "fs.c" and "example.c" you mention? 
> I've been on the FTP (available from www.ddj.com/code/) but I was not able to 
> locate any file related to your article :( Could you please send them to me 
> by email if they are no longer available online? (genio570 -AT- hotmail DOT 
> fr).
>
>  It seems like I will be able to further develop the system I described in my 
> previous post with the information you mentioned.
>
>  Another quick question: I planned to develop my own indexing algorithm on 
> the "external database" side. But do you think I could also consider (as 
> another option) using the "CREATE INDEX" sqlite command (involving of course 
> a complete scan of the external table's column and then storing the index 
> inside the sqlite file)? What would you recommend me to do?
>
>  In a more general way, I wonder which sqlite SQL commands can involve 
> virtual tables and which cannot or should not (for instance for performance 
> reasons, etc.).
>
>  Thanks again for your help,
>  aladdin> Date: Tue, 1 Apr 2008 20:17:54 -0500> From: [EMAIL PROTECTED]> To: 
> sqlite-users@sqlite.org> Subject: Re: [sqlite] Virtual table used to query 
> big external database> > The main reason why SQLite's practical limit is in 
> the 10s of GBs as> opposed to TBs (theoretical) is due to how it tracks dirty 
> pages. This> is described in the "Appropriate Uses" page> 
> (http://www.sqlite.org/whentouse.html) but I'll rehash it here for> 
> convenience. SQLite tracks dirty pages with a bitmap which is> allocated 
> before each transaction. The size of the bitmap is> proportional to the size 
> (not in rows but in pages) of the database> (256 bytes for every 1Mb of 
> database), so as the database grows, the> amount of memory allocated before 
> each transaction grows. When you get> into the GB range, you are starting to 
> allocate in the MB range of> dirty page map memory per transaction, which 
> starts to take its toll> on performance.> > I could be wrong, but from what I 
> know about virtual tables, there is> no such correlation between virtual 
> table size and the dirty page> bitmap, as SQLite has no idea how big a 
> virtual table is, nor does it> manage the data within the vtable. 
> Furthermore, all SQLite really does> in a SELECT statement on a vtable is 
> call your code to iterate over> it. So really the only performance issue is 
> how long it takes your> code to iterate over your vtable. Thus, your table 
> could be in the TB> range, and as long as you are fine with iterating over 
> its contents,> there is no additional performance issues to speak of. There 
> are ways> to implement virtual tables such that you can limit how much of 
> the> table is scanned for certain queries, avoiding having to scan the> whole 
> thing every time. I wrote an article that touches on this using> the match() 
> function. Its available online:> > http://www.ddj.com/database/202802959> > 
> IMO, virtual tables are one of the most powerful and unique features> of 
> SQLite. There is a bit of a learning curve, but it's amazing what> you can do 
> with them. It sounds like you going to have to iterate over> your external 
> table one way or the other. I see no reason why the> vtable approach would be 
> any slower than any other approach that> iterates over the data.> > Having 
> said that, while iterating over a large vtable is not a big> deal (as your 
> program will just step through it one row at a time),> you need to be careful 
> about getting too fancy with your SQL as you> may end up triggering a lot of 
> background IO. For example, if you tack> on an ORDER BY which sorts one of 
> the columns of your vtable, SQLite> will end up essentially copying the 
> vtable contents into a temporary> file and sorting it, which may or may not 
> be a strain on your system> depending on how big your table is (e.g. your 
> vtable is 30Gb and your> /tmp folder is on a 10Gb partition). So think 
> through what you are> doing when going beyond a simple SELECT * from 
> big_vtable.> > -- Mike> > On Tue, Apr 1, 2008 at 3:12 PM, Aladdin Lampé 
> <[EMAIL PROTECTED]> wrote:> >> > Hi all!> >> > Very often, when people ask 
> this list why they have trouble managing in sqlite a "big table" (50 million 
> lines or more than 10 Go), they are told that sqlite is an embedded database 
> and is not meant to be used for very big databases/tables.> >> > I'm 
> currently in the process of designing a specific, read-only, sqlite "virtual 
> table" in order to enable sqlite to access data stored in an external 
> database which is specially designed to handle very big tables.> >> > My 
> final objective is to be able to easily query a big external table (stored in 
> another database) through the - excellent - sqlite interface.> >> > Now I 
> have this terrible doubt: will the existing sqlite "limitations" for big 
> sqlite tables also apply to my read-only virtual tables?> >> > Thus... am I 
> currently losing my time developing such a "virtual table" with this 
> objective in mind? Or is there a better way to achieve my objective?> >> > 
> Thank you for your help!> >> >> > 
> _________________________________________________________________> > Votre 
> contact a choisi Hotmail, l'e-mail ultra sécurisé. Créez un compte 
> gratuitement !> > http://www.windowslive.fr/hotmail/default.asp> > 
> _______________________________________________> > sqlite-users mailing list> 
> > sqlite-users@sqlite.org> > 
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users> >> 
> _______________________________________________> sqlite-users mailing list> 
> sqlite-users@sqlite.org> 
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
> _________________________________________________________________
>  Avec Hotmail, vos e-mails vous suivent partout ! Mettez Hotmail sur votre 
> mobile !
>  http://www.messengersurvotremobile.com/?d=hotmail
>
>
> _______________________________________________
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to