On Fri, Apr 16, 2010 at 3:24 AM, Alexey Pechnikov <pechni...@mobigroup.ru> wrote:
> And you can use my patches for zlib-compression for FTS3. I'm planning to make
> the "fts3z" extension because I want to use as original FTS3
> as FTS3 with compression together.

Back when I was working up fts1, I experimented with compression and found it useful, but ran up against the problem of SQLite itself not having inbuilt support for compression. Bummer!

Anyhow, having a distinct fts3z for compression would be sub-optimal, I think, because it would fall behind. Maybe you could implement it as a compile-time option to fts3.c which allows it to export both fts3 and fts3z?

Anyhow, you may also wish to experiment with how intrusive it would be to add externally-specified processing functions to the virtual table. I'd imagine something like:

   CREATE VIRTUAL TABLE t USING fts3(STORE FUNCTION compress, RETRIEVE FUNCTION uncompress, title, body);

The table would not be accessible if you tried to load it on a SQLite which didn't have the uncompress function, but that should quickly become obvious when you look at the schema.

Another option would be like how REGEXP works:

   CREATE VIRTUAL TABLE t USING fts3(COMPRESSED, title, body);

When COMPRESSED is specified, the select and update queries would include fts3_compress() and fts3_uncompress() calls. If the SQLite embedder has not defined those functions, then errors will be generated.

I have no veto, here, but my preference would be the first version, where the specific functions are listed. The second version is easier to code, but it means that distinct implementations could find themselves unable to read each other's tables because they define fts3_*compress() differently. The first version _could_ have that problem, but at least allows for the possibility of not having it.

Hmm. You could also define the function to take a flag to control compress/uncompress:

   CREATE VIRTUAL TABLE t USING fts3(STORE WITH storefn, title, body);

where storefn(0, original) and storefn(1, compressed), or something like that.

-----

Of course, here I'm ignoring the entire problem of separate compressors for the document data versus the index data, or separate compressors for different columns.
I could imagine:

   CREATE VIRTUAL TABLE t USING fts3(title, body STORE WITH storefn);

but at some point it just gets too hard to hold everything together. There's no per-column tokenizer, either :-).

That level of configurability would probably be better served by refactoring fts to allow the index and data to be distinct. Then you could perhaps layer an fts index over a table with views and triggers to accomplish compression.

[Note that the "STORE WITH" variant above could also be a route to this:

   storefn(table_name, column_name, in_out, data)

then storefn() could do the conversion from "t" to "t_contents" and build the queries. I think performance might end up contrary to the goals of using compression, though :-).]

Moving on,
scott
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
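
For anyone who wants to poke at the views-and-triggers idea, here is a rough sketch in Python using only the standard sqlite3 and zlib modules. It is not fts3 code and builds no full-text index: a plain table named t_contents stands in for the stored document data, and a view named t plays the role of the user-visible table. The storefn(flag, data) shape follows the two-argument variant from the message (0 to store, 1 to retrieve). The open_db() helper, the trigger names, and the choice of zlib are assumptions made up for illustration.

```python
import sqlite3
import zlib


def storefn(retrieve, data):
    """Hypothetical store function: flag 0 compresses, flag 1 uncompresses."""
    if data is None:
        return None
    if retrieve:
        return zlib.decompress(data).decode("utf-8")
    return zlib.compress(data.encode("utf-8"))


def open_db(path=":memory:"):
    """Open a database whose view "t" hides compression of "t_contents"."""
    db = sqlite3.connect(path)
    # The embedder must register storefn before touching the view or
    # triggers; without it, queries against "t" fail with an error.
    db.create_function("storefn", 2, storefn)
    # Only matters on builds where trusted_schema defaults to off;
    # older SQLite versions silently ignore the unknown pragma.
    db.execute("PRAGMA trusted_schema = ON")
    db.executescript("""
        CREATE TABLE IF NOT EXISTS t_contents(
          docid INTEGER PRIMARY KEY, title BLOB, body BLOB);

        -- Reads go through the view, which uncompresses on the way out.
        CREATE VIEW IF NOT EXISTS t AS
          SELECT docid,
                 storefn(1, title) AS title,
                 storefn(1, body) AS body
            FROM t_contents;

        -- Writes go through INSTEAD OF triggers, which compress on the way in.
        CREATE TRIGGER IF NOT EXISTS t_insert INSTEAD OF INSERT ON t BEGIN
          INSERT INTO t_contents(docid, title, body)
          VALUES (NEW.docid, storefn(0, NEW.title), storefn(0, NEW.body));
        END;

        CREATE TRIGGER IF NOT EXISTS t_update INSTEAD OF UPDATE ON t BEGIN
          UPDATE t_contents
             SET title = storefn(0, NEW.title),
                 body = storefn(0, NEW.body)
           WHERE docid = OLD.docid;
        END;

        CREATE TRIGGER IF NOT EXISTS t_delete INSTEAD OF DELETE ON t BEGIN
          DELETE FROM t_contents WHERE docid = OLD.docid;
        END;
    """)
    return db
```

Calling open_db() and then running ordinary INSERT, SELECT, UPDATE and DELETE statements against t should round-trip the text while t_contents holds only compressed blobs. Note that the update trigger recompresses both columns on every update, which is one place the performance concern raised above would show up.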