Simon, Dan, thanks for your answers. Now it's clear that the plain text can be (nearly) rebuilt, and Dan's answer explains why the offsets() function has to access the original data.
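
For the archive, here is a rough Python sketch of why token positions are enough to get most of the text back. It is only a toy: the dictionary layout, the stopword set and the names build_index and rebuild_text are made up for illustration and are not the FTS4 on-disk format or API. It assumes a tokenizer that drops stopwords and punctuation, which matches the loss Simon describes.

```python
from collections import defaultdict

# Hypothetical stopword list, for illustration only.
STOPWORDS = frozenset({"the", "a", "of"})

def build_index(text, stopwords=STOPWORDS):
    """Toy positional index: token -> list of token positions."""
    index = defaultdict(list)
    for pos, raw in enumerate(text.lower().split()):
        token = raw.strip(".,;:!?")  # punctuation is lost, as in a real tokenizer
        if token and token not in stopwords:
            index[token].append(pos)
    return dict(index)

def rebuild_text(index):
    """Invert the index: place every token back at its position."""
    by_pos = {pos: tok for tok, positions in index.items() for pos in positions}
    if not by_pos:
        return ""
    # Positions with no token (dropped stopwords) are shown as "_".
    return " ".join(by_pos.get(i, "_") for i in range(max(by_pos) + 1))

index = build_index("The secret of the plan is hidden.")
print(rebuild_text(index))  # prints: _ secret _ _ plan is hidden
```

Only the stopwords, the casing and the punctuation are missing from the output, so encrypting the content column alone does not hide much from someone who can read the index.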
I was definitely evaluating the database encryption but it has some significant drawbacks. Encryption by using (un)compress just brought some hope ;)

Many thanks and have a nice day

P

On Tue, Sep 4, 2012 at 4:51 PM, Simon Slavin <[email protected]> wrote:
>
> On 4 Sep 2012, at 3:37pm, Dan Kennedy <[email protected]> wrote:
>
>> On 09/04/2012 07:51 PM, Pavel Hlavnicka wrote:
>>> Dear all,
>>>
>>> we are using sqlite FTS4 to build a fulltext index on data which
>>> should not be available to the user without a decryption inside the
>>> application. FTS4 matches perfectly - we can use either contentless
>>> database or compress/uncompress parameters to encrypt the plain text
>>> data.
>>>
>>> My question is if the advanced user could be able to rebuild the plain
>>> text data just from the full text index. [snip]
>>
>> The offsets stored in the full-text index are measured in tokens,
>> not bytes or characters. [snip]
>
> So it is possible to reconstruct the original fulltext from the token data as
> long as you don't mind missing (a) stopwords and (b) some punctuation and
> spacing.
>
> Pavel, unfortunately I can only recommend you use an encrypted database to
> store your FTS4 information. Take a look at
>
> <http://www.hwaci.com/sw/sqlite/see.html>
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

--
Pavel Hlavnicka

- think when printing
save the trees
be carbon neutral at least

