[sqlite] Parallel accessing sqlite files in Win
Hi, I use sqlite3_open_v2 with flag SQLITE_OPEN_READONLY to open an SQLite database. When different processes access the same file (read-only) in Win, can I be sure that there won't be any problems? Furthermore, I open another database with SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE. When different processes access the same database in in Win, are write operations synchronized automatically or do I need to care about synchronization myself? Background: My Win32 app uses SQLite3 to store data/settings. Since the beginning, I just restricted the app to one instance to avoid any problems with parallel access to the files. Now I want to remove this constraint and aks myself if I need to do anything else except removing the single-instance check. Thank you, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS3: Combining match-column, prefix query and phrase query
Hi, Is it somehow possible to combine these three things? (1) match column: ... MATCH 'column:foo' (2) prefix query: ... MATCH 'foo*' (3) phrase query: ... MATCH '"foo bar"' I think (1) and (2) is no problem. For example if I want to search all documents containing words beginning OR ending with "foo" and "bar" [1]: (content:foo* OR reverse:oof*) OR (content:bar* OR reverse:rab*) However, I did not manage to combine these with (3). Especially (1) and (3) would be useful. E.g.: (a) content:"foo bar" (b) "content:foo bar" (c) content:"foo bar*" (d) "content:foo bar*" but neither of these work :( The best would be if (c) would work ... Regards, Luke [1] I included a column "reverse" which contains the whole text in reversed order in order to emulate postfix search *foo via oof* ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS3 finds too much: Slash special meaning? Something else?
Am 03.09.2010 13:27, schrieb Dan Kennedy: > > On Sep 2, 2010, at 6:37 PM, Lukas Haase wrote: > >> Hi, >> >> I use FTS3 (SQLITE_ENABLE_FTS3) with enhanced query syntax >> (SQLITE_ENABLE_FTS3_PARENTHESIS). >> >> Now if I search for a string like '2002/91/AH' there are lots of items >> which do NOT contain this string. This is a query: >> >> SELECT rowid, content FROM fulltext WHERE content MATCH '2002/91/AH'; >> >> In my case, there are only 10 items which actually contain the string >> '2002/91/AH' but the query above gives me 162 (!) matches! >> >> I can not find any reason for this. Some of the topics contain >> "similar" >> strings like 2002/96/AH or even 94/31/EG. But in fact, these strings >> must not be matched :-( >> >> Does the slash have a special meaning in the query syntax? Does a >> query >> like 2002/91/AH have a special meaning? > > The '/' characters are serving as token separators. So > you are searching for (2002 OR 91 OR ah). If you enclose > the date in double quotes: > > ... MATCH '"2002/91/AH"' > > you will be searching for the phrase "2002 91 ah", which > is as close as you can get to what you want without writing > a custom tokenizer: > > http://www.sqlite.org/fts3.html#section_5_1 Oh great! Thank you for your hint! With double quotes it works as I would expect it. In general writing a custom tokenizer would not be a problem BUT in my case it is complicated because the database (including the fulltext table) is created on a different machine with SQLites packages from a distributor (Debian stable). The data is queried using a custom C++ application, so on client side I may alter the code. Until now I use the simple tokenizer. At first I thought I can make it work if I just remove the slash '/' from the separator list. Therefore I added the following line to simpleCreate: t->delim['/'] = 0; Now, nothing was found with content MATCH '2002/91/AH'; and first I did not understand why. After your explanation it is clear why: 2002, 91 and AH are separate tokens in my database and MATCH '"2002/91/AH'" would normally search for the tokens 2002, 91, AH next to each other. But after the patch above, 2002/91/AH is a single token which is not found. So my questions is: Is there any other way to achieve my goal? Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS3 finds too much: Slash special meaning? Something else?
Hi, I use FTS3 (SQLITE_ENABLE_FTS3) with enhanced query syntax (SQLITE_ENABLE_FTS3_PARENTHESIS). Now if I search for a string like '2002/91/AH' there are lots of items which do NOT contain this string. This is a query: SELECT rowid, content FROM fulltext WHERE content MATCH '2002/91/AH'; In my case, there are only 10 items which actually contain the string '2002/91/AH' but the query above gives me 162 (!) matches! I can not find any reason for this. Some of the topics contain "similar" strings like 2002/96/AH or even 94/31/EG. But in fact, these strings must not be matched :-( Does the slash have a special meaning in the query syntax? Does a query like 2002/91/AH have a special meaning? What else could be the reason and is there a way to prevent FTS to find this wrong entries? Thank you very much in advance! Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Sizeof tables
Am 18.08.2010 16:09, schrieb Max Vlasov: > On Tue, Aug 17, 2010 at 4:28 PM, Lukas Haase<lukasha...@gmx.at> wrote: > >> Hi, >> >> My sqlite database is about 65 MB. The data is split into serval tables. >> >> Is there a way to enumerate the space requirements for each table so >> that I can see which tables are the memory consumers? >> >> > Look at this discussion: > http://www.mail-archive.com/sqlite-users@sqlite.org/msg53997.html > at least two variants of the solution there. Hi, Thank you very much! Do you have a different link? This one is "Not Found" :-( Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Sizeof tables
Hi, My sqlite database is about 65 MB. The data is split into serval tables. Is there a way to enumerate the space requirements for each table so that I can see which tables are the memory consumers? Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Understanding ICU collations: Where needed?
Hi, I have a database which is built on a Linux System with PHP and pdo_sqlite and used (read-only) on a Windows platform. When I use the ICU module and I create a collation, where is the collation "stored"? Do I have to call icu_create_collation everytime I start up the database? Or just one time? Or only when inserting data? What happens if I omit the call? This leads to the second question: If I only insert data on the Linux side and I have an index on the column using the ICU collation, do I need to have ICU support on client side? Thank you, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Loading FTS3 as module: SQL error: error during initialization:
Hi, I want to load FTS3 dynamically as a module. I compile it from the source with: gcc -c fts3.c gcc -c fts3_expr.c gcc -c fts3_hash.c gcc -c fts3_icu.c gcc -c fts3_porter.c gcc -c fts3_tokenizer1.c gcc -c fts3_tokenizer.c gcc -shared fts3_expr.o fts3_hash.o fts3_icu.o fts3.o fts3_porter.o \ fts3_tokenizer1.o fts3_tokenizer.o -o libSqliteFts3.so But then loading fails: sqlite> SELECT load_extension('./libSqliteFts3.so'); SQL error: error during initialization: sqlite> Does anybody have a hint? Background: I use Debian with PHP and PDO (php5-sqlite, libsqlite3-0) where FTS is built into the core. But I need a modified FTS: The module should ONLY save the index and not the data. So I modified the source so that the data is not inserted. I can't compile the whole system myself because of the many dependencies. So I want just to compile my own modified FTS3 (see above) and load it dynamically. Thanks in advance, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Compile sqlite with only ANSI support (no Unicode)
Singaravelu, Rajaram schrieb: > Hi, > > [...] > > In short, can anyone tell me if I can compile sqlite3 with only ANSI > support so that it works like the ANSI version of fopen(). If I understand you correctly you just care about the filename when you talk about UTF8?! In this case everything should be very easy. You can find the whole win-specific stuff in os_win.c. Concerning fopen: This call is not used, instead the Windows API function CreateFile is used (see function winOpen!). As I can see in the source, the W-prefixed call is used which is the Unicode-aware function. You could either try replacing CreateFileW with CreateFileA or convert the filename with something like MultiByteToWideChar. > Thanks for your help. > > -Rajaram Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Extending FTS
Hi, Thank you anybody for your replies and ideas to "FTS and postfix search". I thought a lot about it and came to the conclusion: In general it is not necessary for a fulltext system not find subwords. If it would be, then I either need no index (search through whole data) or put subwords into the index too. So if my documents would be English I would be perfectly finished, even better with the PORTER-Tokenizer. But unfortunately the language is German and there are words consisting of of other words (e.g. Telefonkabel = telephone cable). It is still a requirement finding the "Telefonkabel" as well when searching for "Kabel". Does anybody have an idea what would be the best approach? In my opinion, I have no chance except to split these words with a predefined dictionary (e.g. {"Telefonkabel"} will become {"telefon", "kabel", "telefonkabel"}. Even this is a challenge (the index-generation should not take too long). My idea now would be to extend the FTS in some way to a) Support splitting words with predefined dictonary b) maybe support for non-english (german) versions of the Porter Stemming algorithm. I have programming experience with C and C++ but no idea of SQLite. Where to begin? How easy would it be to implement this and how much time would it take? I also found [1]. This indexer seems to be more powerful than the builtin FTS. However, I can't find support for word-splitting too. Does anybody have experience with that indexer? Would it be simpler to extent this indexer? Maybe someone have already tested both...on which should I concentrate, which one is faster? Thank you again all, Luke [1] http://ft3.sourceforge.net/ ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS and postfix search
Hi Jim, and thank you for the great idea. But I thought it would be possible to search '*word*' - but this is not possible with this method either. Is there any chance for searching '*word*' quickly? Regards, Luke Jim Showalter schrieb: > You could store the words reversed (in addition to storing them in > forward order). Then like 'xxx%' would be fast. > > This would double your disk footprint, but could give you the search > performance you're looking for. > > If that's too goofy, you could create a table of all one, two, and > three-character word endings, and join to it from all of your words > (stored in forward order). Then search first for the primary key of > the word ending you want to search for, then search your words for > that key. > > Index the join. > > - Original Message - > From: "Lukas Haase" <lukasha...@gmx.at> > To: <sqlite-users@sqlite.org> > Sent: Wednesday, August 05, 2009 6:16 PM > Subject: Re: [sqlite] FTS and postfix search > > >> Wes Freeman schrieb: >>> I clearly am not in the right mindset to be answering list emails. >>> Please ignore my response (it's too late now)--back to my stressful >>> deadline. >> :-) >> >>> Strange that it's implemented for prefix and not postfix? >> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE >> 'xxx%' can be performed easy because only the beginning of words >> need to >> be compared. >> >> However, there /is/ a way to also do postfix searches. I have the >> *same* >> database in *.hlp format and with WinHelp it's possible to search >> '*otor' (and others) with almost zero CPU and time consumption. I'd >> be >> curious how they did this. >> >> For a solution for SQLite I would accept a small performance penalty >> in >> that case (but very few secs max); additionally I would also accept >> the >> index being bigger. >> >> Regards, >> Luke >> >>> Wes >>> >>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> >>> wrote: >>>> Wes Freeman schrieb: >>>>> Why not LIKE '%otor'? >>>> SELECT topic_title FROM topics >>>> WHERE topic LIKE '%otor%' >>>> ORDER BY topic_title ASC; >>>> >>>> This is very, very slow, especially on my > 100 MB database. >>>> "Realtime" >>>> search in the GUI is a requirement. This is exactly the reason why >>>> I >>>> want to use FTS instead of LIKE... >>>> >>>> Regards, >>>> Luke >>>> >>>>> Wes >>>>> >>>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> It's me again, sorry. The next big problem concerning FTS. I >>>>>> have the >>>>>> requirement to do postfix searches, like: >>>>>> >>>>>> SELECT topic_title FROM topics >>>>>> WHERE topic MATCH '*otor' >>>>>> ORDER BY topic_title ASC; >>>>>> >>>>>> should find Motor, motor, Monotor etc. But this does not seem to >>>>>> work. >>>>>> Is there any chance to get this working? >>>>>> >>>>>> Best regards, >>>>>> Luke >>>>>> >>>>>> ___ >>>>>> sqlite-users mailing list >>>>>> sqlite-users@sqlite.org >>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>>>>> >>>>> ___ >>>>> sqlite-users mailing list >>>>> sqlite-users@sqlite.org >>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>>>> >>>> ___ >>>> sqlite-users mailing list >>>> sqlite-users@sqlite.org >>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>>> >>> ___ >>> sqlite-users mailing list >>> sqlite-users@sqlite.org >>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS and postfix search
Roger Binns schrieb: > [...] > > Lukas Haase wrote: >> additionally I would also accept the index being bigger. > > You could have a second FTS table where you store the keywords in reverse > order :-) Hi, DAMN!! Damn, damn. Thank you for the great idea. But unfortunately I thought I could combine postfix and prefix. But with this method this is not possible either. For example, consider the german word "Telefonanschlusskabel" (consisting of "Telefon", "Anschluss", "Kabel"). In this case '*anschluss*' should also find "Telefonanschlusskabel". Or, to be more specific, any '*partial*' query should work. Actually I need to solve this problem somehow. One possibility would be to write a complete FTS system on my own but this is really not what I want... Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS and postfix search
Wes Freeman schrieb: > I clearly am not in the right mindset to be answering list emails. > Please ignore my response (it's too late now)--back to my stressful > deadline. :-) > Strange that it's implemented for prefix and not postfix? Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE 'xxx%' can be performed easy because only the beginning of words need to be compared. However, there /is/ a way to also do postfix searches. I have the *same* database in *.hlp format and with WinHelp it's possible to search '*otor' (and others) with almost zero CPU and time consumption. I'd be curious how they did this. For a solution for SQLite I would accept a small performance penalty in that case (but very few secs max); additionally I would also accept the index being bigger. Regards, Luke > Wes > > On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> wrote: >> Wes Freeman schrieb: >>> Why not LIKE '%otor'? >> SELECT topic_title FROM topics >> WHERE topic LIKE '%otor%' >> ORDER BY topic_title ASC; >> >> This is very, very slow, especially on my > 100 MB database. "Realtime" >> search in the GUI is a requirement. This is exactly the reason why I >> want to use FTS instead of LIKE... >> >> Regards, >> Luke >> >>> Wes >>> >>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> wrote: >>>> Hi, >>>> >>>> It's me again, sorry. The next big problem concerning FTS. I have the >>>> requirement to do postfix searches, like: >>>> >>>> SELECT topic_title FROM topics >>>> WHERE topic MATCH '*otor' >>>> ORDER BY topic_title ASC; >>>> >>>> should find Motor, motor, Monotor etc. But this does not seem to work. >>>> Is there any chance to get this working? >>>> >>>> Best regards, >>>> Luke >>>> >>>> ___ >>>> sqlite-users mailing list >>>> sqlite-users@sqlite.org >>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>>> >>> ___ >>> sqlite-users mailing list >>> sqlite-users@sqlite.org >>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >>> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS and postfix search
Wes Freeman schrieb: > Why not LIKE '%otor'? SELECT topic_title FROM topics WHERE topic LIKE '%otor%' ORDER BY topic_title ASC; This is very, very slow, especially on my > 100 MB database. "Realtime" search in the GUI is a requirement. This is exactly the reason why I want to use FTS instead of LIKE... Regards, Luke > > Wes > > On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> wrote: >> Hi, >> >> It's me again, sorry. The next big problem concerning FTS. I have the >> requirement to do postfix searches, like: >> >> SELECT topic_title FROM topics >> WHERE topic MATCH '*otor' >> ORDER BY topic_title ASC; >> >> should find Motor, motor, Monotor etc. But this does not seem to work. >> Is there any chance to get this working? >> >> Best regards, >> Luke >> >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS and postfix search
Hi, It's me again, sorry. The next big problem concerning FTS. I have the requirement to do postfix searches, like: SELECT topic_title FROM topics WHERE topic MATCH '*otor' ORDER BY topic_title ASC; should find Motor, motor, Monotor etc. But this does not seem to work. Is there any chance to get this working? Best regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] virtual tables may not be indexed
Hi list, I have a huge problem: A database with 2 HTML fragements should contain a fulltext index. For that reason I put all data into a virtual table: CREATE VIRTUAL TABLE topics USING fts3( topicID INTEGER, topic_title VARCHAR(200) COLLATE NOCASE, topic TEXT, TOKENIZE simple); topic contains the HTML fragments, topic_title the title and topicID is needed for locating a specific entry. Well, and that's actually the problem... Before (i.e. without FTS) I did: SELECT topic FROM topics WHERE topicID=9874; which was quite fast. Now this is very, very slow (a few seconds!). I guess this is because topicID is not a primary key any more and no index is defined. So I wanted to create an index but I got the error in the subject. Really big problem :-( But I really need a way to *quickly* locate an entry by its ID. I do NOT want to store the data twice :-( What I am doing wrong? Best regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] queries for a fulltext-engine
Igor Tandetnik schrieb: > "Lukas Haase" <lukasha...@gmx.at> wrote in > message news:gv9fcm$5r...@ger.gmane.org >> I have a database containing thousands of HTML pages ("topics"). There >> is a fulltext index for these topics. First there is a table >> containing all single words. Each word is identified by its >> "fulltextID": >> >> CREATE TABLE fulltext( >> fulltextID INTEGER PRIMARY KEY, >> word VARCHAR(100) COLLATE NOCASE >> ); >> >> Now there is a linking table between the words and the HTML pages >> (topics): >> >> CREATE TABLE topic_fulltext( >> topicID INTEGER, >> fulltextID INTEGER, >> PRIMARY KEY(topicID, fulltextID) >> ); >> >> Finding a topic containing a specific word is not too hard: >> >> SELECT topic_fulltext.topicID >> FROM fulltext >> JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID >> WHERE word LIKE 'Word%'; >> >> But now I want to be able to search with more complex queries. For >> example: >> >> * List all topics containing (word1 AND word2) > > You could do something like this: Oh, thank you, this seems more like what I am looking for :-) :-) > SELECT topic_fulltext.topicID FROM topic_fulltext > where exists (select 1 from fulltext > WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE > 'word1%') > and exists (select 1 from fulltext > WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE > 'word2%') Unfortunately this does not work completely :( In the first WHERE clause I restrict to entried containing only the "word1" (the resultset will contain only entries with "word1"). So the second WHERE clause will always fail as there are no rows with "word2" left. This seems to work only for my OR-requirement (topics containg either word1 OR word2) when replacing AND with OR. > SELECT topic_fulltext.topicID FROM topic_fulltext > where fulltextID in ( > select fulltextID from topic_fulltext where word LIKE 'word1%' > intersect > select fulltextID from topic_fulltext where word LIKE 'word2%'); Unfortunately this does not work either. The set is taken from the fullwords, i.e. the result of the inner SELECT clause will contain fulltextIDs. And they will obviosly never intersect. So I have the same problem as above: Replacing with UNION ALL yields my OR-requirement but I can't get working it with AND :-( > Test it, see which one works faster. The second one seems to be much faster. Though it's too slow (3s or so), but I hope I can tune up the query on the end... > [...] >> * List all topics containing (word1 AND word2 AND ... AND word10) >> * List all topics containing ((word1 OR word2) AND word3 OR word3) > > The approach above should work for any boolean combination. I have forgotten one third type: the NOT. E.g.: * List all topics containing (NOT(word1 OR word2) AND word3) But this is no hard requirement, but boolean AND and OR combinations are. Thank you for your approaches, I tried to get the AND and OR working with it but I still do not figure it out :-( By the way: If there is a better way to organize the index in the database: This would be no problem if the queries will get simpler and faster. (As long as the memeory requirement stays approx. the same) Thank you again Igor, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] queries for a fulltext-engine
Hi, Igor Tandetnik schrieb: > "Lukas Haase" <lukasha...@gmx.at> wrote in > message news:gv9fcm$5r...@ger.gmane.org >> I have a database containing thousands of HTML pages ("topics"). There >> is a fulltext index for these topics. First there is a table >> containing all single words. Each word is identified by its >> "fulltextID": > > You seem to be reinventing FTS: > > http://www.sqlite.org/cvstrac/wiki?p=FtsUsage Thank you, but unfortunately not. First, there are 2 topics, second, these topics are HTML code fragments and third, they are all compressed externally with gzip for saving memory. For that reason I built this index, listing all words for a specific topic... Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] queries for a fulltext-engine
Hi, At first, thank you for your answer! Simon Slavin schrieb: > On 23 May 2009, at 7:30pm, Lukas Haase wrote: >> SELECT topic_fulltext.topicID >> FROM fulltext >> JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID >> WHERE word LIKE 'Word%'; >> >> But now I want to be able to search with more complex queries. For >> example: >> >> * List all topics containing (word1 AND word2) >> * List all topics containing (word1 OR word2) > > It works perfectly to do things like > > WHERE (word LIKE 'Word1%') AND (word LIKE 'Word2%') > WHERE (word LIKE 'Word1%') OR (word LIKE 'Word2%') No, this is unfortunately not the case :-( I know the usage of LIKE, % and =. But the problem is here that SELECT topic_fulltext.topicID FROM fulltext JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID WHERE (word LIKE 'Word1%') AND (word LIKE 'Word2%'); would give me no results. And the one statement with WHERE (word LIKE 'Word1%') OR (word LIKE 'Word2%'); would work... And, I have forgotten some other search cirteria: * List all topics containing (word1 AND NOT word2) > [...] >> * List all topics containing (word1 AND word2 AND ... AND word10) >> * List all topics containing ((word1 OR word2) AND word3 OR word3) > > You could write something to transform those into the format I showed > above. And you could get the results directly using SELECT or use > CREATE VIEW to reflect them. This indeed no problem but the query does not work. The problem is that I have a list of words and then a table which topic contains which words. So if I want to know which *topics* are linked with * word1 OR word2 * word1 AND word2 * word1 AND NOT word2 I need a completely different query. > But there are other ways to do it that might be more efficient (i.e. > faster) or simpler to program and debug. For instance, if you have a > chain of conditions you could CREATE TEMPORARY a table to then use it > to accumulate (OR) or eliminate (AND) the pages you want using the > form of INSERT that takes a SELECT argument. Depending on the size > and shape of your database this may or may not be faster. Yes, the DB is very huge. There are 20k topics and the fulltext and topic_fulltext tables are approx. 50MB. > [...] > Simon. Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] queries for a fulltext-engine
Hi, Sorry for the subject - I just do not know for what to search of what to ask - I actually do not know where is exactly my problem :-( I have a database containing thousands of HTML pages ("topics"). There is a fulltext index for these topics. First there is a table containing all single words. Each word is identified by its "fulltextID": CREATE TABLE fulltext( fulltextID INTEGER PRIMARY KEY, word VARCHAR(100) COLLATE NOCASE ); Now there is a linking table between the words and the HTML pages (topics): CREATE TABLE topic_fulltext( topicID INTEGER, fulltextID INTEGER, PRIMARY KEY(topicID, fulltextID) ); Finding a topic containing a specific word is not too hard: SELECT topic_fulltext.topicID FROM fulltext JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID WHERE word LIKE 'Word%'; But now I want to be able to search with more complex queries. For example: * List all topics containing (word1 AND word2) * List all topics containing (word1 OR word2) * List all topics containing (word1 AND word2 AND ... AND word10) * List all topics containing ((word1 OR word2) AND word3 OR word3) * ... But now I have no clue how to accomplish this query. I just can't extend my single-word query from above :-( Can anyone give me a hint? Thank you very much, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Sort order with umlauts
Hello, For a database I also need a fulltext index. This is my table containing all the words: CREATE TABLE db_fulltext.fulltext( fulltextID INTEGER PRIMARY KEY, word VARCHAR(100) COLLATE NOCASE ); Now I have the problem that I have also words with umlauts. Now they are sorted this way: ua .. uz .. zz .. üa .. But I need the umlauts treated as their respective vovels, i.e.: ua üa .. uz .. zz Is this somehow possible? Greets, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Fast and simple database merge
Hi, For an application I use SQLite as datafile. I have written a "compiler" (script chain in Linux) for creating my database file. There are dependencies between tables and "compiling" my single database takes about 1-2 hours. When there is an error I have to restart the whole procedure. Very bad. In order to overcome this problem, I divided my script in small chunks and use "make". Each scripts takes now a few minutes and creates its own SQLite dat-file. When another script needs data from another file, it just uses "ATTACH DATABASE". Works fine. BUT: I end up with 10 files instead of one; all of them having their indices. But for my application I need one file. My question now is: Is there a simple, fast and efficient way to just merge these databases to a single file? The one solution I have is to recreate all tables (CREATE TABLE) in a new file and use INSERT INTO ... SELECT FROM (and again using ATTACH DATABASE) and after that to recreate each single index. But: a) This takes very long b) I have to write code for CREATE TABLE's twice c) I have to write code for CREATE INDEX's twice Really cool would be something like: cat db1.dat db2.dat db3.dat > final.dat ;-) Thank you in advance, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] onthefly Compression of SQLite database
Hi, I searched a little bit with google but I could not find any code or extensions to compress an SQLite database on the fly. Is this possible? I use SQLite to store a database engine with images and HTML code. I already use zlib to compress the data itself (i.e. the images and the HTML data). But my problem is that I also need a lot of meta tables and also a fulltext index. The fulltext index is about 50 MB itself! I can't compress these tables because the columns are all INTEGERS and small VARCHARs. In total, my SQLite database is about 100MB. Very, very huge. If I compress it using WinZIP or gzip, I get a size of just 50MB (a half!) although the html data/images were already compressed! It seems that compressing an SQLite database is very efficient. Is there any way to compress/decompress a whole database on the fly ? Thank you, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Stringcompare operator with COLLATE NOCASE
Igor Tandetnik schrieb: > "Lukas Haase" <lukasha...@gmx.at> wrote in > message news:gkd3f4$cl...@ger.gmane.org >> This is not what I want. So I use COLLATE NOCASE: >> >> This is the result I want. However, now I need a WHERE clause to work >> exactly the same. So I tried: >> >> SELECT keyword FROM keywords WHERE keyword < 'T*' ORDER BY keyword >> COLLATE NOCASE ASC; > > SELECT keyword FROM keywords > WHERE keyword < 'T*' COLLATE NOCASE > ORDER BY keyword COLLATE NOCASE ASC; > > Or, you could just specify the collation in the CREATE TABLE statement, > then it would be used by default both for comparisons and for sorting: > > CREATE TABLE keywords( > keywordID INTEGER PRIMARY KEY, > keyword VARCHAR(100) COLLATE NOCASE > ); Thank you very much! :-) Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Stringcompare operator with COLLATE NOCASE
Hello, I have a table containting keywords: CREATE TABLE keywords( keywordID INTEGER PRIMARY KEY, keyword VARCHAR(100) ); INSERT INTO keywords VALUES(1,'Apple'); INSERT INTO keywords VALUES(2,'apple'); INSERT INTO keywords VALUES(3,'Angle'); INSERT INTO keywords VALUES(4,'Tree'); INSERT INTO keywords VALUES(5,'tee'); Normally, they would not be case sensitive, i.e. they would sort: Angle Apple Tree apple tee This is not what I want. So I use COLLATE NOCASE: CREATE INDEX keyword ON keywords(keyword COLLATE NOCASE ASC); SELECT keyword FROM keywords ORDER BY keyword COLLATE NOCASE ASC; in order to obtain: Angle Apple apple tee Tree This is the result I want. However, now I need a WHERE clause to work exactly the same. So I tried: SELECT keyword FROM keywords WHERE keyword < 'T*' ORDER BY keyword COLLATE NOCASE ASC; Angle Apple But the output should be: Angle Apple apple tee (i.e. exactly as above but just all strings smaller). It seems to me that the string-compare operator (<) ignores the collation from my ORDER BY clause. Is there any way to obtain a valid result? Thank you, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Getting the "position" (like LIMIT) for a query
Lukas Haase schrieb: > Igor Tandetnik schrieb: >> "Lukas Haase" <lukasha...@gmx.at> wrote in >> message news:gkat07$n2...@ger.gmane.org >>> I use an SQLite database to fill a virtual list control in Windows. In >>> this control, I just tell the control the numer of my elements and the >>> control tells me for which range data is needed. > [...] > However, I solved this, even if not so performant, according to your link. > > The actual problem is the find-function: From my 1000-elements > resultset, I need to pass to Windows that index (between 1 and 1000!) > that fits best to the search criterion. Yeah, yeah, I think I had a flash of genius right now! Very simple, I don't know why I did not come to this earlier... SELECT COUNT(*) FROM keywords WHERE keyword < '%s*' ORDER BY keyword ASC; where %s is the string to be found :-) Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Getting the "position" (like LIMIT) for a query
Igor Tandetnik schrieb: > "Lukas Haase" <lukasha...@gmx.at> wrote in > message news:gkat07$n2...@ger.gmane.org >> I use an SQLite database to fill a virtual list control in Windows. In >> this control, I just tell the control the numer of my elements and the >> control tells me for which range data is needed. > > http://www.sqlite.org/cvstrac/wiki?p=ScrollingCursor Thank you, I did not know this. But unfortunately this does not help in my situation. I do not simply "page" up and down a fixed numer of rows. The problem is that *Windows* tells me which indexes need to be shown. This means: Windows just tells me: "From your 1000 elements, I want to have number 599". To get it a little bit better, Windows *additionally* tells me which elements I should cache (e.g. if possible, cache number 450 to 500). However, I solved this, even if not so performant, according to your link. The actual problem is the find-function: From my 1000-elements resultset, I need to pass to Windows that index (between 1 and 1000!) that fits best to the search criterion. And there I have exactly no clue how to do this. If I could add a kind of auto index from 1...1000 to my resultset, I could try something like this: SELECT AUTO_ID(), keyword, '%search' FROM keywords WHERE keyword = '%search' but the problem is the missing AUTO_ID() functionality. Is there any hope to successfully implement the search function? Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Getting the "position" (like LIMIT) for a query
Hello, I use an SQLite database to fill a virtual list control in Windows. In this control, I just tell the control the numer of my elements and the control tells me for which range data is needed. For example I set: SELECT COUNT(*) FROM keywords; to tell the control the numer of elements. Now, I get "caching messages" which tell me what range will be needed next, for example iFrom=500 and iTo=520. In this case I do just a: "SELECT keyword FROM keywords ORDER BY keyword LIMIT %d, %d", iFrom, iTo-iFrom. The result is stored into a std::map as a cache, and when requested, the appropriate element is displayed. So far, so good. But I have the problem that I want to be able to search in the list. In this case, Windows sends a message with "te" (if the string "te" should be found) and I need to hand back the *position* of the found data. I have no idea how to do this. In fact, I need to know the position (the same position that I would give the LIMIT statement) for a specific result set. I hope you can understand me. Is there a way to implement this? Thank you very much! Luke PS: The data may also be composed of a more complex statement (with JOINs). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 600ms for simple query: How to optimize it?
Kees Nuyt schrieb: > On Fri, 09 Jan 2009 21:16:03 +0100, Lukas Haase > <lukasha...@gmx.at> wrote in General Discussion of SQLite > Database <sqlite-users@sqlite.org>: > >> Hello Richard! >> >> Thank you very much!! It works! :-) >> >> >> Indeed. 0-10 milliseconds instead of 500-800 :-) >> >> But may you tell me why this works and where you have this information? >> I know the O-notation but I do not know /why/ this boosts down to log(n)... > > Use EXPLAIN SELECT . > to see the virtual machine instructions > and EXPLAIN SELECT QUERY PLAN . > to see which index is used. > > http://www.sqlite.org/lang_explain.html > > Each JOIN is implemented as nested loops. The virtual > machine code can tell a lot about what part of the database > has to be scanned. Thank you, I know this and I did try it already. But unfortunately I do not know how to interpret the results. Best regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 600ms for simple query: How to optimize it?
D. Richard Hipp schrieb: > On Jan 9, 2009, at 3:16 PM, Lukas Haase wrote: >> SELECT t.topic, t.length >> FROM printgroup AS pg1 >> LEFT JOIN printgroup AS pg2 ON pg1.printgroup = pg2.printgroup >> LEFT JOIN topics AS t ON t.topicID = pg2.topicID >> LEFT JOIN topic_ids AS ti ON ti.topicID = pg1.topicID >> WHERE ti.topic_textID = '' >> ORDER BY pg2.topicID ASC; > > You seem very fond of using LEFT JOINs in places where they do not > make good sense. Yes, I started with mySQL 3 many years ago. At the beginning I only knew about LEFT JOINs and used them. Now I think I also know the other types of JOINs but I still use LEFT JOINs very often, just by habit. And with mySQL I never had performance problems with them. > What is it that you think a LEFT JOIN does? (A LEFT JOIN B) joins together table A and B while all records are taken from A and only records that match both are takes from B. If a record from A has no corresponding data in B, the values are NULL. > How is > a LEFT JOIN different than an ordinary inner JOIN? INNER JOIN takes *all* records from both tables, A and B. Generally, the resultset will be larger. > I ask because I > suspect that your answer will reveal misconceptions about LEFT JOINs > which, when rectified, will cause most of your performance issues to > go away. Maybe my I think too much in "left joining" but I did not know that there is so much difference in performance. Best Regards, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] 600ms for simple query: How to optimize it?
Hello Richard! Thank you very much!! It works! :-) D. Richard Hipp schrieb: > On Jan 7, 2009, at 6:11 PM, Lukas Haase wrote: > >> Hello, >> >> Can somebody tell me why this (simple) query take so much time? This >> query does nothing more than querying a table and JOINing two other >> tables together. >> >> SELECT >> ti1.topicID AS topicID, >> ti2.topic_textID AS parent, >> n.level, >> n.level_order >> FROM navigation AS n >> LEFT JOIN topic_ids AS ti1 ON ti1.topicID = n.topicID >> LEFT JOIN topic_ids AS ti2 ON ti2.topicID = n.parent_topicID >> WHERE ti1.topic_textID = 'X'; > > SQLite should be running this query in O(NlogN). > > If you change the first LEFT JOIN to a plain old JOIN (which should > give equivalent results by virtue of the WHERE clause restricting > ti1.topic_textID to not be NULL) then it should run in O(logN) - much > faster. Try it and let me know. Indeed. 0-10 milliseconds instead of 500-800 :-) But may you tell me why this works and where you have this information? I know the O-notation but I do not know /why/ this boosts down to log(n)... I have other queries which worry me. But that trick did not help in these cases :-( Especially I have problems with a self-join. In a table I have defined groups of elements ("printgroup"): CREATE TABLE printgroup( topicID INTEGER, printgroup INTEGER, PRIMARY KEY(topicID, printgroup) ); I think these indices are not necessary because both fields are primary keys anyway. CREATE INDEX topicID ON printgroup(topicID); CREATE INDEX pprintgroup ON printgroup(printgroup); When I know one element of a group (given by topicID) I want to find all other elements in the same group: SELECT t.topic, t.length FROM printgroup AS pg1 LEFT JOIN printgroup AS pg2 ON pg1.printgroup = pg2.printgroup LEFT JOIN topics AS t ON t.topicID = pg2.topicID LEFT JOIN topic_ids AS ti ON ti.topicID = pg1.topicID WHERE ti.topic_textID = '' ORDER BY pg2.topicID ASC; The table "topics" just contains the actual data for each topicID (t.topic with length t.length). This query takes a few seconds (und to minutes) with "sqlite3.exe" and even much longer in my application (sqlite with CppSQlite3): Up to 15 minutes! Mimicking your magic above I tried to leave out the "LEFT" in the self-joins but it did not change anything :-( And unfortunately, the optimization FAQ [1] is very incomplete, at least at the interesting points (indices) :-( Thank you again and best regards, Luke [1] http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] 600ms for simple query: How to optimize it?
Hello, Can somebody tell me why this (simple) query take so much time? This query does nothing more than querying a table and JOINing two other tables together. SELECT ti1.topicID AS topicID, ti2.topic_textID AS parent, n.level, n.level_order FROM navigation AS n LEFT JOIN topic_ids AS ti1 ON ti1.topicID = n.topicID LEFT JOIN topic_ids AS ti2 ON ti2.topicID = n.parent_topicID WHERE ti1.topic_textID = 'X'; I thought I optimized the table good with indexes but one such a query takes 500 to 1000ms in my C++ program. Here are my table definitions and the indexes (unfortunately I need the VARCHAR(20) field because I get the "topicID" only as text: CREATE TABLE topic_ids( topicID INTEGER, topic_textID VARCHAR(20), PRIMARY KEY(topicID) ); CREATE INDEX topic_textID ON topic_ids(topic_textID); CREATE TABLE navigation( topicID INTEGER PRIMARY KEY, parent_topicID INTEGER, level VARCHAR(20), level_order INTEGER ); CREATE INDEX parent_topicID ON navigation(parent_topicID); CREATE INDEX level ON navigation(level); CREATE INDEX level_order ON navigation(level_order); I need to execute this query in a database application each time a new page is opened. So 500ms are really too much. A few ms would be great. And the tables itself are not really huge: SELECT COUNT(*) FROM navigation; 19469 SELECT COUNT(*) FROM topic_ids; 19469 Does anybody have an idea what's going wrong here? How can I speed up this query? Thank you very much in advance, Luke ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users