Re: [sqlite] large db performance considerations
Hi Simon, Thanks for your response. > You need an index, obviously. You can't have an index on a VIRTUAL > table. Yes, I did try on a regular table (not FTS3) and added the indices I described (one each on Author, Content and Category) and still did not notice any performance improvement. And the db size increased a fair amount. > Could you instead create a TEMPORARY table and read the data into > it ? That can have indices on all the fields you care about. You can > even copy all the entries from the VIRTUAL table into it in one > instruction using the form of INSERT that has a SELECT in it. It's > faster to load the data into the table first, then create the indices. That is a good idea. I will experiment with this. The database is basically read only. So I could create some other tables. > The other aspect of this is to wonder why you're using a VIRTUAL table > in the first place. Why aren't you searching the source data instead > of a virtual table ? The SELECT you described doesn't use the fts3 > feature. Yes, When the project began, I was planning on using full text search. As the project moved forward, I ended up not using it at this point, although future enhancements do call for using full text search. For this iteration, if there is a better way, without using FTS, I could certainly remove it and use a regular table. Thanks. Paul On Sun, Jun 7, 2009 at 11:00 PM, Simon Slavin <slav...@hearsay.demon.co.uk>wrote: > > On 8 Jun 2009, at 4:45am, Paul Perry wrote: > > > The latest content I have added, has over 31,000 records. A basic > > select, > > such as: > > > > Select Content from BookContent where DocumentID = 10; > > takes nearly 15 seconds. Are there any ways to optimize this? > > You need an index, obviously. You can't have an index on a VIRTUAL > table. > > Could you instead create a TEMPORARY table and read the data into > it ? That can have indices on all the fields you care about. You can > even copy all the entries from the VIRTUAL table into it in one > instruction using the form of INSERT that has a SELECT in it. It's > faster to load the data into the table first, then create the indices. > > The other aspect of this is to wonder why you're using a VIRTUAL table > in the first place. Why aren't you searching the source data instead > of a virtual table ? The SELECT you described doesn't use the fts3 > feature. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] large db performance considerations
Hello, I was wondering if anybody has a tip for optimizing the query time on a table. I have read several web pages about optimizing SQLite queries, but have not been able to come to any conclusions. I have a table: CREATE VIRTUAL TABLE BookContent using fts3( DocumentID INTEGER PRIMARY KEY, DocumentType INTEGER, ParentDocumentID INTEGER, Category TEXT, Content TEXT, Author TEXT, SrcFileIdent INTEGER, SrcLineNumber INTEGER); The key pieces of information are the Category, Content and Author. Performance is fine with up to 2000 records. I can do basic queries, and all works well. The latest content I have added, has over 31,000 records. A basic select, such as: Select Content from BookContent where DocumentID = 10; takes nearly 15 seconds. Are there any ways to optimize this? The first thing I did was consider made it a regular SQLite table (droped the FTS support), and used the same table schema, and added an index on Author, Content and Category. However, I did not notice any change in time, and the database size grew from 11MB to 15MB. Any suggestions would be greatly qppreciated. Thank you, Paul ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] First Program using sqlite3.dll - Linker Errors
Add the lib file you generated with lib.exe into your project. Just pointing to the directory is not enough. That should fix up your link errors. On Sat, Mar 28, 2009 at 10:53 AM, Neil Jefferywrote: > > Hi All, > > > > I am attempting to write a program which will use the sqlite3.dll file. I > am developing in Visual Studio 2008 Express on Windows Vista. > > > > I downloaded this file: http://www.sqlite.org/sqlitedll-3_6_11.zip which > contains the dll and a definitions file. I have then used lib.exe to > generate a.lib file. The lib file is currently in C:\SQLITE\. > > > > I have done the following to alter the Linker configuration in the project: > > > > 1)Right Click Project -> Properties > > 2)Configuration Properties -> Linker > > 3)I entered "C:\SQLITE\" in the "Additional Library Directories" property > field > > > > I believe that I need a sqlite3 header file in order to use the functions > in my code. This is not included in the file I downloaded above. I have > thus downloaded http://www.sqlite.org/sqlite-amalgamation-3_6_11.zip and > taken sqlite3.h from this and added it to my project. > > > > I have only one source file in the project at present which is as > http://www.sqlite.org/quickstart.html. Trying to build throws up compile > errors about the exit keyword, so I have added #include as well. > > > > Now the project compiles without warnings or errors, but I see the > following Linker Errors: > > > > 1>MAIN.obj : error LNK2019: unresolved external symbol _sqlite3_free > referenced in function _main > > 1>MAIN.obj : error LNK2019: unresolved external symbol _sqlite3_exec > referenced in function _main > > 1>MAIN.obj : error LNK2019: unresolved external symbol _sqlite3_close > referenced in function _main > > 1>MAIN.obj : error LNK2019: unresolved external symbol _sqlite3_errmsg > referenced in function _main > > 1>MAIN.obj : error LNK2019: unresolved external symbol _sqlite3_open > referenced in function _main > > 1>*PATH*\Projects\Testing\SQLITE1\Debug\SQLITE1.exe > : fatal error LNK1120: 5 unresolved externals > > > > I have never worked with DLLs before so I could be something quite > fundamental wrong. > > > > Any help or thoughts would be welcome. > > > > Many thanks, > > > > Neil > > _ > View your Twitter and Flickr updates from one place – Learn more! > http://clk.atdmt.com/UKM/go/137984870/direct/01/ > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Formatted text with fts3
Scott, Thanks so much for your thoughtful response. Things can get complicated (especially with full text searching) when using formatted text. What I really want to do is retain the formatting, to display in a HTML file. The text out of the database would be wrapped by a basic HTML outter document, and the formatted text would mainly reference style sheets. This is for a desktop application, so I am not trying to target a browser. I was thinking it is ok to limit myself to full open and close tags, and also to not have a single word be broken up, as you illustrated nicely. Thereby, I was thinking if I just filtered out (or removed) the basic tags I am using, such as: Some text. this is more text. Followed by more. by simply not putting the tags into the index (it technically wouldn't be fully formed xml with a header and such) , it would still retain the text for searching but ignore the tags. I agree that trying to handle generic HTML would be a bear. The content is not coming from the web, so I have control over that as well. I guess there could be problems with snippets or with the offsets of the text. I'll have to take a closer look into the provided parsers, and maybe there is a way that any tags, could just *not *be put into the index. I read the documents pointed out and the readme. It looks like most of the info on it is the source code, so I will need to take some time to dig deeper and understand it. I am hoping to start with the current source code and add a filter. I read that some folks from Google did some of the work on the fts3 module. They probably drop all tags when they search for text, and that allows for a more reliable search. I'm just guessing. Thanks again, Paul On Tue, Mar 10, 2009 at 11:22 AM, Scott Hess <sh...@google.com> wrote: > The fts module doesn't do anything "interesting" with embedded > meta-data in the interests of simplicity. Stripping the info out > before inserting is probably easiest, but has the downsides of > duplication (assuming you need to keep the raw data elsewhere), and it > means that queries involving snippets and the like may be funky. > Probably the best way to go about this would be to convert all tags to > a single whitespace character. > > Building a custom tokenizer is certainly doable, but could be a > frustrating goal if you intend to process generic HTML you find on the > web, just because of the number of heuristics you'll have to layer in. > As a first pass, you might just treat tags as word breaks as you > iterate over the input. But there definitely are cases where HTML > markup happens within words, so you might need something a bit more > sophisticated. There is some level of support for returning tokens > which are not literally present in the input. For instance, for the > input 'this' you could return 'this' and indicate that it > corresponds to 11 characters in the input, and everything should work. > I'm not sure anyone has ever exercised this aspect of things > strongly, though, so it's possible that things don't work as intended > when you do that. > > Before going either direction, you should probably sit down and figure > out what exactly you're going to do with the results you get from the > table. If you want to, say, present them on a web page, then your > problems are just beginning, because the tag nesting will open up > layout issues and security problems. It may be that thinking through > that part of the system will help you figure out an appropriate > approach for this part of the system (for instance, if you decide to > strip tags for other reasons, then it all becomes easy!). > > -scott > > > On Tue, Mar 10, 2009 at 6:56 AM, Paul Perry <paulmjpe...@gmail.com> wrote: > > Thank you for the pointers Alexandre and Alexey. > > > > I spent about 30 minutes looking into the parser, and it looks like it is > a > > possibility. I'll require a more in-depth understanding in order to do > > this. I would probably start with the simple parser, and go from there. > > > >> I think to prepare html before insert is more simple. You can transform > > > > html into "right" format for fts3 parser. > > I would actually like to retain the tagged (html) formatting in the > > database, thereby, when it is retrieved it can be displayed as rich text. > > > > Thanks, > > Paul > > > > > > > > On Tue, Mar 10, 2009 at 4:32 AM, Alexey Pechnikov < > pechni...@mobigroup.ru>wrote: > > > >> Hello! > >> > >> On Tuesday 10 March 2009 06:16:16 Alexandre Courbot wrote: > >> > Never did this myself, but I think you can do what you need by writing > >> > your own tokeni
Re: [sqlite] Formatted text with fts3
Thank you for the pointers Alexandre and Alexey. I spent about 30 minutes looking into the parser, and it looks like it is a possibility. I'll require a more in-depth understanding in order to do this. I would probably start with the simple parser, and go from there. > I think to prepare html before insert is more simple. You can transform > html into "right" format for fts3 parser. I would actually like to retain the tagged (html) formatting in the database, thereby, when it is retrieved it can be displayed as rich text. Thanks, Paul On Tue, Mar 10, 2009 at 4:32 AM, Alexey Pechnikovwrote: > Hello! > > On Tuesday 10 March 2009 06:16:16 Alexandre Courbot wrote: > > Never did this myself, but I think you can do what you need by writing > > your own tokenizer: > > > > > http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/fts3/README.tokenizers > > It's not good advice for a few documented module. > > I think to prepare html before insert is more simple. You can transform > html > into "right" format for fts3 parser. > > Best regards. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Formatted text with fts3
I've been looking into the full text search capabilities of SQLite and it looks like exactly what I need. Cool stuff, especially in the confines of all the other great features of SQLite. The question is if I could put formatted text in a text field, and not have that show up in a full text search.. I would be fine with basic tags, and would also be fine limiting it to only open and close, as in: and not use the somewhat shorter format (as appropriate) of . That is, if it makes it any easier. I have done some testing, (not surprisingly) and fts will return the text in the tag. Actually, I would probably be surprised if it did not, after-all that is what it is doing. I was wondering if there is anyway to manually remove tags from the virtual table somehow (maybe remove some rows from *table*_content, *table*_segdir, or *table*_segments). Alas, I looked into the content of those tables, and there is nothing too obvious that I could see to be done from just peering into them. If anybody has some ideas on this, it would be great. I could always provide formatting by having another table which would match up that document, along with a formatting code and the starting and ending character it applies to. That does get more complicated, especially when preparing the text. I would really prefer having the tags intermixed with the text, that is if I can prevent them from being in the full text search. Any ideas would be loved. Cheers, Paul ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users