[sqlite] Parallel accessing sqlite files in Win

2013-06-22 Thread Lukas Haase
Hi,

I use sqlite3_open_v2 with flag SQLITE_OPEN_READONLY to open an SQLite
database. When different processes access the same file (read-only) in
Win, can I be sure that there won't be any problems?

Furthermore, I open another database with SQLITE_OPEN_READWRITE |
SQLITE_OPEN_CREATE. When different processes access the same database in
in Win, are write operations synchronized automatically or do I need to
care about synchronization myself?

Background: My Win32 app uses SQLite3 to store data/settings. Since the
beginning, I just restricted the app to one instance to avoid any
problems with parallel access to the files. Now I want to remove this
constraint and aks myself if I need to do anything else except removing
the single-instance check.

Thank you,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS3: Combining match-column, prefix query and phrase query

2010-09-04 Thread Lukas Haase
Hi,

Is it somehow possible to combine these three things?

(1) match column: ... MATCH 'column:foo'
(2) prefix query: ... MATCH 'foo*'
(3) phrase query: ... MATCH '"foo bar"'

I think (1) and (2) is no problem. For example if I want to search all 
documents containing words beginning OR ending with "foo" and "bar" [1]:

   (content:foo* OR reverse:oof*) OR (content:bar* OR reverse:rab*)

However, I did not manage to combine these with (3). Especially (1) and 
(3) would be useful. E.g.:

   (a)  content:"foo bar"
   (b)  "content:foo bar"
   (c)  content:"foo bar*"
   (d)  "content:foo bar*"

but neither of these work :(

The best would be if (c) would work ...

Regards, Luke





[1] I included a column "reverse" which contains the whole text in 
reversed order in order to emulate postfix search *foo via oof*


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS3 finds too much: Slash special meaning? Something else?

2010-09-04 Thread Lukas Haase
Am 03.09.2010 13:27, schrieb Dan Kennedy:
>
> On Sep 2, 2010, at 6:37 PM, Lukas Haase wrote:
>
>> Hi,
>>
>> I use FTS3 (SQLITE_ENABLE_FTS3) with enhanced query syntax
>> (SQLITE_ENABLE_FTS3_PARENTHESIS).
>>
>> Now if I search for a string like '2002/91/AH' there are lots of items
>> which do NOT contain this string. This is a query:
>>
>> SELECT rowid, content FROM fulltext WHERE content MATCH '2002/91/AH';
>>
>> In my case, there are only 10 items which actually contain the string
>> '2002/91/AH' but the query above gives me 162 (!) matches!
>>
>> I can not find any reason for this. Some of the topics contain
>> "similar"
>> strings like 2002/96/AH or even 94/31/EG. But in fact, these strings
>> must not be matched :-(
>>
>> Does the slash have a special meaning in the query syntax? Does a
>> query
>> like 2002/91/AH have a special meaning?
>
> The '/' characters are serving as token separators. So
> you are searching for (2002 OR 91 OR ah). If you enclose
> the date in double quotes:
>
> ... MATCH '"2002/91/AH"'
>
> you will be searching for the phrase "2002 91 ah", which
> is as close as you can get to what you want without writing
> a custom tokenizer:
>
> http://www.sqlite.org/fts3.html#section_5_1

Oh great! Thank you for your hint! With double quotes it works as I 
would expect it.

In general writing a custom tokenizer would not be a problem BUT in my 
case it is complicated because the database (including the fulltext 
table) is created on a different machine with SQLites packages from a 
distributor (Debian stable).

The data is queried using a custom C++ application, so on client side I 
may alter the code.

Until now I use the simple tokenizer. At first I thought I can make it 
work if I just remove the slash '/' from the separator list. Therefore I 
added the following line to simpleCreate:

t->delim['/'] = 0;

Now, nothing was found with content MATCH '2002/91/AH'; and first I did 
not understand why.

After your explanation it is clear why: 2002, 91 and AH are separate 
tokens in my database and MATCH '"2002/91/AH'" would normally search for 
the tokens 2002, 91, AH next to each other. But after the patch above, 
2002/91/AH is a single token which is not found.

So my questions is: Is there any other way to achieve my goal?

Regards,
Luke


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS3 finds too much: Slash special meaning? Something else?

2010-09-03 Thread Lukas Haase
Hi,

I use FTS3 (SQLITE_ENABLE_FTS3) with enhanced query syntax 
(SQLITE_ENABLE_FTS3_PARENTHESIS).

Now if I search for a string like '2002/91/AH' there are lots of items 
which do NOT contain this string. This is a query:

SELECT rowid, content FROM fulltext WHERE content MATCH '2002/91/AH';

In my case, there are only 10 items which actually contain the string 
'2002/91/AH' but the query above gives me 162 (!) matches!

I can not find any reason for this. Some of the topics contain "similar" 
strings like 2002/96/AH or even 94/31/EG. But in fact, these strings 
must not be matched :-(

Does the slash have a special meaning in the query syntax? Does a query 
like 2002/91/AH have a special meaning?

What else could be the reason and is there a way to prevent FTS to find 
this wrong entries?

Thank you very much in advance!

Regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sizeof tables

2010-08-20 Thread Lukas Haase
Am 18.08.2010 16:09, schrieb Max Vlasov:
> On Tue, Aug 17, 2010 at 4:28 PM, Lukas Haase<lukasha...@gmx.at>  wrote:
>
>> Hi,
>>
>> My sqlite database is about 65 MB. The data is split into serval tables.
>>
>> Is there a way to enumerate the space requirements for each table so
>> that I can see which tables are the memory consumers?
>>
>>
> Look at this discussion:
> http://www.mail-archive.com/sqlite-users@sqlite.org/msg53997.html
> at least two variants of the solution there.

Hi,

Thank you very much!

Do you have a different link?

This one is "Not Found" :-(

Regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sizeof tables

2010-08-18 Thread Lukas Haase
Hi,

My sqlite database is about 65 MB. The data is split into serval tables.

Is there a way to enumerate the space requirements for each table so 
that I can see which tables are the memory consumers?

Regards,
  Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Understanding ICU collations: Where needed?

2009-09-19 Thread Lukas Haase
Hi,

I have a database which is built on a Linux System with PHP and 
pdo_sqlite and used (read-only) on a Windows platform.

When I use the ICU module and I create a collation, where is the 
collation "stored"? Do I have to call icu_create_collation everytime I 
start up the database? Or just one time? Or only when inserting data? 
What happens if I omit the call?

This leads to the second question: If I only insert data on the Linux 
side and I have an index on the column using the ICU collation, do I 
need to have ICU support on client side?

Thank you,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Loading FTS3 as module: SQL error: error during initialization:

2009-09-05 Thread Lukas Haase
Hi,

I want to load FTS3 dynamically as a module. I compile it from the 
source with:

gcc -c fts3.c
gcc -c fts3_expr.c
gcc -c fts3_hash.c
gcc -c fts3_icu.c
gcc -c fts3_porter.c
gcc -c fts3_tokenizer1.c
gcc -c fts3_tokenizer.c
gcc -shared fts3_expr.o fts3_hash.o fts3_icu.o fts3.o fts3_porter.o \
   fts3_tokenizer1.o  fts3_tokenizer.o -o libSqliteFts3.so

But then loading fails:

sqlite> SELECT load_extension('./libSqliteFts3.so');
SQL error: error during initialization:
sqlite>

Does anybody have a hint?

Background: I use Debian with PHP and PDO (php5-sqlite, libsqlite3-0) 
where FTS is built into the core. But I need a modified FTS: The module 
should ONLY save the index and not the data. So I modified the source so 
that the data is not inserted. I can't compile the whole system myself 
because of the many dependencies. So I want just to compile my own 
modified FTS3 (see above) and load it dynamically.

Thanks in advance,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Compile sqlite with only ANSI support (no Unicode)

2009-08-07 Thread Lukas Haase
Singaravelu, Rajaram schrieb:
> Hi,
> 
> [...]
> 
> In short, can anyone tell me if I can compile sqlite3 with only ANSI
> support so that it works like the ANSI version of fopen().

If I understand you correctly you just care about the filename when you 
talk about UTF8?!

In this case everything should be very easy. You can find the whole 
win-specific stuff in os_win.c. Concerning fopen: This call is not used, 
instead the Windows API function CreateFile is used (see function 
winOpen!). As I can see in the source, the W-prefixed call is used which 
is the Unicode-aware function.

You could either try replacing CreateFileW with CreateFileA or convert 
the filename with something like MultiByteToWideChar.

> Thanks for your help.
> 
> -Rajaram

Regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Extending FTS

2009-08-07 Thread Lukas Haase
Hi,

Thank you anybody for your replies and ideas to "FTS and postfix 
search". I thought a lot about it and came to the conclusion: In general 
it is not necessary for a fulltext system not find subwords. If it would 
be, then I either need no index (search through whole data) or put 
subwords into the index too.

So if my documents would be English I would be perfectly finished, even 
better with the PORTER-Tokenizer.

But unfortunately the language is German and there are words consisting 
of of other words (e.g. Telefonkabel = telephone cable). It is still a 
requirement finding the "Telefonkabel" as well when searching for 
"Kabel". Does anybody have an idea what would be the best approach? In 
my opinion, I have no chance except to split these words with a 
predefined dictionary (e.g. {"Telefonkabel"} will become {"telefon", 
"kabel", "telefonkabel"}. Even this is a challenge (the index-generation 
should not take too long). My idea now would be to extend the FTS in 
some way to
a) Support splitting words with predefined dictonary
b) maybe support for non-english (german) versions of the Porter 
Stemming algorithm.

I have programming experience with C and C++ but no idea of SQLite. 
Where to begin? How easy would it be to implement this and how much time 
would it take?

I also found [1]. This indexer seems to be more powerful than the 
builtin FTS. However, I can't find support for word-splitting too. Does 
anybody have experience with that indexer? Would it be simpler to extent 
this indexer? Maybe someone have already tested both...on which should I 
concentrate, which one is faster?

Thank you again all,

Luke

[1] http://ft3.sourceforge.net/

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-06 Thread Lukas Haase
Hi Jim,

and thank you for the great idea. But I thought it would be possible to 
search '*word*' - but this is not possible with this method either.

Is there any chance for searching '*word*' quickly?

Regards,
Luke

Jim Showalter schrieb:
> You could store the words reversed (in addition to storing them in 
> forward order). Then like 'xxx%' would be fast.
> 
> This would double your disk footprint, but could give you the search 
> performance you're looking for.
> 
> If that's too goofy, you could create a table of all one, two, and 
> three-character word endings, and join to it from all of your words 
> (stored in forward order). Then search first for the primary key of 
> the word ending you want to search for, then search your words for 
> that key.
> 
> Index the join.
> 
> - Original Message - 
> From: "Lukas Haase" <lukasha...@gmx.at>
> To: <sqlite-users@sqlite.org>
> Sent: Wednesday, August 05, 2009 6:16 PM
> Subject: Re: [sqlite] FTS and postfix search
> 
> 
>> Wes Freeman schrieb:
>>> I clearly am not in the right mindset to be answering list emails.
>>> Please ignore my response (it's too late now)--back to my stressful
>>> deadline.
>> :-)
>>
>>> Strange that it's implemented for prefix and not postfix?
>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE
>> 'xxx%' can be performed easy because only the beginning of words 
>> need to
>> be compared.
>>
>> However, there /is/ a way to also do postfix searches. I have the 
>> *same*
>> database in *.hlp format and with WinHelp it's possible to search
>> '*otor' (and others) with almost zero CPU and time consumption. I'd 
>> be
>> curious how they did this.
>>
>> For a solution for SQLite I would accept a small performance penalty 
>> in
>> that case (but very few secs max); additionally I would also accept 
>> the
>> index being bigger.
>>
>> Regards,
>> Luke
>>
>>> Wes
>>>
>>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> 
>>> wrote:
>>>> Wes Freeman schrieb:
>>>>> Why not LIKE '%otor'?
>>>> SELECT topic_title FROM topics
>>>> WHERE topic LIKE '%otor%'
>>>> ORDER BY topic_title ASC;
>>>>
>>>> This is very, very slow, especially on my > 100 MB database. 
>>>> "Realtime"
>>>> search in the GUI is a requirement. This is exactly the reason why 
>>>> I
>>>> want to use FTS instead of LIKE...
>>>>
>>>> Regards,
>>>> Luke
>>>>
>>>>> Wes
>>>>>
>>>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> 
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> It's me again, sorry. The next big problem concerning FTS. I 
>>>>>> have the
>>>>>> requirement to do postfix searches, like:
>>>>>>
>>>>>> SELECT topic_title FROM topics
>>>>>> WHERE topic MATCH '*otor'
>>>>>> ORDER BY topic_title ASC;
>>>>>>
>>>>>> should find Motor, motor, Monotor etc. But this does not seem to 
>>>>>> work.
>>>>>> Is there any chance to get this working?
>>>>>>
>>>>>> Best regards,
>>>>>> Luke
>>>>>>
>>>>>> ___
>>>>>> sqlite-users mailing list
>>>>>> sqlite-users@sqlite.org
>>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>>
>>>>> ___
>>>>> sqlite-users mailing list
>>>>> sqlite-users@sqlite.org
>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-06 Thread Lukas Haase
Roger Binns schrieb:
> [...]
> 
> Lukas Haase wrote:
>> additionally I would also accept the index being bigger.
> 
> You could have a second FTS table where you store the keywords in reverse
> order :-)

Hi,

DAMN!! Damn, damn. Thank you for the great idea. But unfortunately I 
thought I could combine postfix and prefix. But with this method this is 
not possible either. For example, consider the german word 
"Telefonanschlusskabel" (consisting of "Telefon", "Anschluss", "Kabel").

In this case '*anschluss*' should also find "Telefonanschlusskabel".

Or, to be more specific, any '*partial*' query should work.

Actually I need to solve this problem somehow. One possibility would be 
to write a complete FTS system on my own but this is really not what I 
want...

Regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Wes Freeman schrieb:
> I clearly am not in the right mindset to be answering list emails.
> Please ignore my response (it's too late now)--back to my stressful
> deadline.

:-)

> Strange that it's implemented for prefix and not postfix?

Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE 
'xxx%' can be performed easy because only the beginning of words need to 
be compared.

However, there /is/ a way to also do postfix searches. I have the *same* 
database in *.hlp format and with WinHelp it's possible to search 
'*otor' (and others) with almost zero CPU and time consumption. I'd be 
curious how they did this.

For a solution for SQLite I would accept a small performance penalty in 
that case (but very few secs max); additionally I would also accept the 
index being bigger.

Regards,
Luke

> Wes
> 
> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> wrote:
>> Wes Freeman schrieb:
>>> Why not LIKE '%otor'?
>> SELECT topic_title FROM topics
>> WHERE topic LIKE '%otor%'
>> ORDER BY topic_title ASC;
>>
>> This is very, very slow, especially on my > 100 MB database. "Realtime"
>> search in the GUI is a requirement. This is exactly the reason why I
>> want to use FTS instead of LIKE...
>>
>> Regards,
>> Luke
>>
>>> Wes
>>>
>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> wrote:
>>>> Hi,
>>>>
>>>> It's me again, sorry. The next big problem concerning FTS. I have the
>>>> requirement to do postfix searches, like:
>>>>
>>>> SELECT topic_title FROM topics
>>>> WHERE topic MATCH '*otor'
>>>> ORDER BY topic_title ASC;
>>>>
>>>> should find Motor, motor, Monotor etc. But this does not seem to work.
>>>> Is there any chance to get this working?
>>>>
>>>> Best regards,
>>>> Luke
>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Wes Freeman schrieb:
> Why not LIKE '%otor'?

SELECT topic_title FROM topics
WHERE topic LIKE '%otor%'
ORDER BY topic_title ASC;

This is very, very slow, especially on my > 100 MB database. "Realtime" 
search in the GUI is a requirement. This is exactly the reason why I 
want to use FTS instead of LIKE...

Regards,
Luke

> 
> Wes
> 
> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> wrote:
>> Hi,
>>
>> It's me again, sorry. The next big problem concerning FTS. I have the
>> requirement to do postfix searches, like:
>>
>> SELECT topic_title FROM topics
>> WHERE topic MATCH '*otor'
>> ORDER BY topic_title ASC;
>>
>> should find Motor, motor, Monotor etc. But this does not seem to work.
>> Is there any chance to get this working?
>>
>> Best regards,
>> Luke
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Hi,

It's me again, sorry. The next big problem concerning FTS. I have the 
requirement to do postfix searches, like:

SELECT topic_title FROM topics
WHERE topic MATCH '*otor'
ORDER BY topic_title ASC;

should find Motor, motor, Monotor etc. But this does not seem to work. 
Is there any chance to get this working?

Best regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] virtual tables may not be indexed

2009-08-05 Thread Lukas Haase
Hi list,

I have a huge problem: A database with 2 HTML fragements should 
contain a fulltext index. For that reason I put all data into a virtual 
table:

CREATE VIRTUAL TABLE topics USING fts3(
topicID INTEGER,
topic_title VARCHAR(200) COLLATE NOCASE,
topic TEXT,
TOKENIZE simple);

topic contains the HTML fragments, topic_title the title and topicID is 
needed for locating a specific entry. Well, and that's actually the 
problem... Before (i.e. without FTS) I did:

SELECT topic FROM topics WHERE topicID=9874;

which was quite fast. Now this is very, very slow (a few seconds!). I 
guess this is because topicID is not a primary key any more and no index 
is defined. So I wanted to create an index but I got the error in the 
subject. Really big problem :-( But I really need a way to *quickly* 
locate an entry by its ID. I do NOT want to store the data twice :-(

What I am doing wrong?

Best regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] queries for a fulltext-engine

2009-05-25 Thread Lukas Haase
Igor Tandetnik schrieb:
> "Lukas Haase" <lukasha...@gmx.at> wrote in
> message news:gv9fcm$5r...@ger.gmane.org
>> I have a database containing thousands of HTML pages ("topics"). There
>> is a fulltext index for these topics. First there is a table
>> containing all single words. Each word is identified by its
>> "fulltextID":
>>
>> CREATE TABLE fulltext(
>> fulltextID INTEGER PRIMARY KEY,
>> word VARCHAR(100) COLLATE NOCASE
>> );
>>
>> Now there is a linking table between the words and the HTML pages
>> (topics):
>>
>> CREATE TABLE topic_fulltext(
>> topicID INTEGER,
>> fulltextID INTEGER,
>> PRIMARY KEY(topicID, fulltextID)
>> );
>>
>> Finding a topic containing a specific word is not too hard:
>>
>> SELECT topic_fulltext.topicID
>> FROM fulltext
>> JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID
>> WHERE word LIKE 'Word%';
>>
>> But now I want to be able to search with more complex queries. For
>> example:
>>
>> * List all topics containing (word1 AND word2)
> 
> You could do something like this:

Oh, thank you, this seems more like what I am looking for :-) :-)

> SELECT topic_fulltext.topicID FROM topic_fulltext
> where exists (select 1 from fulltext
> WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE 
> 'word1%')
> and exists (select 1 from fulltext
> WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE 
> 'word2%')

Unfortunately this does not work completely :(

In the first WHERE clause I restrict to entried containing only the 
"word1" (the resultset will contain only entries with "word1"). So the 
second WHERE clause will always fail as there are no rows with "word2" left.

This seems to work only for my OR-requirement (topics containg either 
word1 OR word2) when replacing AND with OR.

> SELECT topic_fulltext.topicID FROM topic_fulltext
> where fulltextID in (
> select fulltextID from topic_fulltext where word LIKE 'word1%'
> intersect
> select fulltextID from topic_fulltext where word LIKE 'word2%');

Unfortunately this does not work either.

The set is taken from the fullwords, i.e. the result of the inner SELECT 
clause will contain fulltextIDs. And they will obviosly never intersect.

So I have the same problem as above: Replacing with UNION ALL yields my 
OR-requirement but I can't get working it with AND :-(

> Test it, see which one works faster.

The second one seems to be much faster. Though it's too slow (3s or so), 
but I hope I can tune up the query on the end...

> [...]
>> * List all topics containing (word1 AND word2 AND ... AND word10)
>> * List all topics containing ((word1 OR word2) AND word3 OR word3)
> 
> The approach above should work for any boolean combination.

I have forgotten one third type: the NOT. E.g.:

* List all topics containing (NOT(word1 OR word2) AND word3)

But this is no hard requirement, but boolean AND and OR combinations are.

Thank you for your approaches, I tried to get the AND and OR working 
with it but I still do not figure it out :-(

By the way: If there is a better way to organize the index in the 
database: This would be no problem if the queries will get simpler and 
faster. (As long as the memeory requirement stays approx. the same)

Thank you again Igor,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] queries for a fulltext-engine

2009-05-24 Thread Lukas Haase
Hi,

Igor Tandetnik schrieb:
> "Lukas Haase" <lukasha...@gmx.at> wrote in
> message news:gv9fcm$5r...@ger.gmane.org
>> I have a database containing thousands of HTML pages ("topics"). There
>> is a fulltext index for these topics. First there is a table
>> containing all single words. Each word is identified by its
>> "fulltextID":
> 
> You seem to be reinventing FTS:
> 
> http://www.sqlite.org/cvstrac/wiki?p=FtsUsage

Thank you, but unfortunately not.

First, there are 2 topics, second, these topics are HTML code 
fragments and third, they are all compressed externally with gzip for 
saving memory.

For that reason I built this index, listing all words for a
specific topic...

Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] queries for a fulltext-engine

2009-05-24 Thread Lukas Haase
Hi,

At first, thank you for your answer!

Simon Slavin schrieb:
> On 23 May 2009, at 7:30pm, Lukas Haase wrote:
>> SELECT topic_fulltext.topicID
>> FROM fulltext
>> JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID
>> WHERE word LIKE 'Word%';
>>
>> But now I want to be able to search with more complex queries. For  
>> example:
>>
>> * List all topics containing (word1 AND word2)
>> * List all topics containing (word1 OR word2)
> 
> It works perfectly to do things like
> 
> WHERE (word LIKE 'Word1%') AND (word LIKE 'Word2%')
> WHERE (word LIKE 'Word1%') OR (word LIKE 'Word2%')

No, this is unfortunately not the case :-( I know the usage of LIKE, % 
and =.

But the problem is here that

SELECT topic_fulltext.topicID
FROM fulltext
JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID
WHERE (word LIKE 'Word1%') AND (word LIKE 'Word2%');

would give me no results.

And the one statement with

WHERE (word LIKE 'Word1%') OR (word LIKE 'Word2%');

would work...

And, I have forgotten some other search cirteria:

* List all topics containing (word1 AND NOT word2)

> [...]
>> * List all topics containing (word1 AND word2 AND ... AND word10)
>> * List all topics containing ((word1 OR word2) AND word3 OR word3)
> 
> You could write something to transform those into the format I showed  
> above.  And you could get the results directly using SELECT or use  
> CREATE VIEW to reflect them.

This indeed no problem but the query does not work. The problem is that 
I have a list of words and then a table which topic contains which 
words. So if I want to know which *topics* are linked with

* word1 OR word2
* word1 AND word2
* word1 AND NOT word2

I need a completely different query.

> But there are other ways to do it that might be more efficient (i.e.  
> faster) or simpler to program and debug.  For instance, if you have a  
> chain of conditions you could CREATE TEMPORARY a table to then use it  
> to accumulate (OR) or eliminate (AND) the pages you want using the  
> form of INSERT that takes a SELECT argument.  Depending on the size  
> and shape of your database this may or may not be faster.

Yes, the DB is very huge. There are 20k topics and the fulltext and 
topic_fulltext tables are approx. 50MB.

> [...]
> Simon.

Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] queries for a fulltext-engine

2009-05-23 Thread Lukas Haase
Hi,

Sorry for the subject - I just do not know for what to search of what to 
ask - I actually do not know where is exactly my problem :-(

I have a database containing thousands of HTML pages ("topics"). There 
is a fulltext index for these topics. First there is a table containing 
all single words. Each word is identified by its "fulltextID":

CREATE TABLE fulltext(
fulltextID INTEGER PRIMARY KEY,
word VARCHAR(100) COLLATE NOCASE
);

Now there is a linking table between the words and the HTML pages (topics):

CREATE TABLE topic_fulltext(
topicID INTEGER,
fulltextID INTEGER,
PRIMARY KEY(topicID, fulltextID)
);

Finding a topic containing a specific word is not too hard:

SELECT topic_fulltext.topicID
FROM fulltext
JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID
WHERE word LIKE 'Word%';

But now I want to be able to search with more complex queries. For example:

* List all topics containing (word1 AND word2)
* List all topics containing (word1 OR word2)
* List all topics containing (word1 AND word2 AND ... AND word10)
* List all topics containing ((word1 OR word2) AND word3 OR word3)
* ...

But now I have no clue how to accomplish this query. I just can't extend 
my single-word query from above :-(

Can anyone give me a hint?

Thank you very much,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sort order with umlauts

2009-04-25 Thread Lukas Haase
Hello,

For a database I also need a fulltext index. This is my table containing 
all the words:

CREATE TABLE db_fulltext.fulltext(
fulltextID INTEGER PRIMARY KEY,
word VARCHAR(100) COLLATE NOCASE
);

Now I have the problem that I have also words with umlauts. Now they are 
sorted this way:

ua
..
uz
..
zz
..
üa
..

But I need the umlauts treated as their respective vovels, i.e.:

ua
üa
..
uz
..
zz

Is this somehow possible?

Greets,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Fast and simple database merge

2009-02-02 Thread Lukas Haase
Hi,

For an application I use SQLite as datafile. I have written a "compiler" 
(script chain in Linux) for creating my database file. There are 
dependencies between tables and "compiling" my single database takes 
about 1-2 hours. When there is an error I have to restart the whole 
procedure. Very bad.

In order to overcome this problem, I divided my script in small chunks 
and use "make". Each scripts takes now a few minutes and creates its own 
SQLite dat-file. When another script needs data from another file, it 
just uses "ATTACH DATABASE". Works fine.

BUT: I end up with 10 files instead of one; all of them having their 
indices. But for my application I need one file.

My question now is: Is there a simple, fast and efficient way to just 
merge these databases to a single file?

The one solution I have is to recreate all tables (CREATE TABLE) in a 
new file and use INSERT INTO ... SELECT FROM (and again using ATTACH 
DATABASE) and after that to recreate each single index.

But:
a) This takes very long
b) I have to write code for CREATE TABLE's twice
c) I have to write code for CREATE INDEX's twice

Really cool would be something like:

cat db1.dat db2.dat db3.dat > final.dat

;-)

Thank you in advance,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] onthefly Compression of SQLite database

2009-02-01 Thread Lukas Haase
Hi,

I searched a little bit with google but I could not find any code or 
extensions to compress an SQLite database on the fly. Is this possible?

I use SQLite to store a database engine with images and HTML code. I 
already use zlib to compress the data itself (i.e. the images and the 
HTML data). But my problem is that I also need a lot of meta tables and 
also a fulltext index. The fulltext index is about 50 MB itself! I can't 
compress these tables because the columns are all INTEGERS and small 
VARCHARs. In total, my SQLite database is about 100MB. Very, very huge.

If I compress it using WinZIP or gzip, I get a size of just 50MB (a 
half!) although the html data/images were already compressed!

It seems that compressing an SQLite database is very efficient. Is there 
any way to compress/decompress a whole database on the fly ?

Thank you,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Stringcompare operator with COLLATE NOCASE

2009-01-11 Thread Lukas Haase
Igor Tandetnik schrieb:
> "Lukas Haase" <lukasha...@gmx.at> wrote in
> message news:gkd3f4$cl...@ger.gmane.org
>> This is not what I want. So I use COLLATE NOCASE:
>>
>> This is the result I want. However, now I need a WHERE clause to work
>> exactly the same. So I tried:
>>
>> SELECT keyword FROM keywords WHERE keyword < 'T*' ORDER BY keyword
>> COLLATE NOCASE ASC;
> 
> SELECT keyword FROM keywords
> WHERE keyword < 'T*' COLLATE NOCASE
> ORDER BY keyword COLLATE NOCASE ASC;
> 
> Or, you could just specify the collation in the CREATE TABLE statement, 
> then it would be used by default both for comparisons and for sorting:
> 
> CREATE TABLE keywords(
> keywordID INTEGER PRIMARY KEY,
> keyword VARCHAR(100) COLLATE NOCASE
> );

Thank you very much! :-)

Luke



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Stringcompare operator with COLLATE NOCASE

2009-01-11 Thread Lukas Haase
Hello,

I have a table containting keywords:

CREATE TABLE keywords(
keywordID INTEGER PRIMARY KEY,
keyword VARCHAR(100)
);
INSERT INTO keywords VALUES(1,'Apple');
INSERT INTO keywords VALUES(2,'apple');
INSERT INTO keywords VALUES(3,'Angle');
INSERT INTO keywords VALUES(4,'Tree');
INSERT INTO keywords VALUES(5,'tee');

Normally, they would not be case sensitive, i.e. they would sort:

Angle
Apple
Tree
apple
tee

This is not what I want. So I use COLLATE NOCASE:

CREATE INDEX keyword ON keywords(keyword COLLATE NOCASE ASC);
SELECT keyword FROM keywords ORDER BY keyword COLLATE NOCASE ASC;

in order to obtain:

Angle
Apple
apple
tee
Tree

This is the result I want. However, now I need a WHERE clause to work 
exactly the same. So I tried:

SELECT keyword FROM keywords WHERE keyword < 'T*' ORDER BY keyword 
COLLATE NOCASE ASC;

Angle
Apple

But the output should be:

Angle
Apple
apple
tee

(i.e. exactly as above but just all strings smaller). It seems to me 
that the string-compare operator (<) ignores the collation from my ORDER 
BY clause.

Is there any way to obtain a valid result?

Thank you,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting the "position" (like LIMIT) for a query

2009-01-11 Thread Lukas Haase
Lukas Haase schrieb:
> Igor Tandetnik schrieb:
>> "Lukas Haase" <lukasha...@gmx.at> wrote in
>> message news:gkat07$n2...@ger.gmane.org
>>> I use an SQLite database to fill a virtual list control in Windows. In
>>> this control, I just tell the control the numer of my elements and the
>>> control tells me for which range data is needed.
> [...]
> However, I solved this, even if not so performant, according to your link.
> 
> The actual problem is the find-function: From my 1000-elements 
> resultset, I need to pass to Windows that index (between 1 and 1000!) 
> that fits best to the search criterion.

Yeah, yeah, I think I had a flash of genius right now!

Very simple, I don't know why I did not come to this earlier...

SELECT COUNT(*)
FROM keywords
WHERE keyword < '%s*' ORDER BY keyword ASC;

where %s is the string to be found

:-)

Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Getting the "position" (like LIMIT) for a query

2009-01-10 Thread Lukas Haase
Igor Tandetnik schrieb:
> "Lukas Haase" <lukasha...@gmx.at> wrote in
> message news:gkat07$n2...@ger.gmane.org
>> I use an SQLite database to fill a virtual list control in Windows. In
>> this control, I just tell the control the numer of my elements and the
>> control tells me for which range data is needed.
> 
> http://www.sqlite.org/cvstrac/wiki?p=ScrollingCursor

Thank you, I did not know this.

But unfortunately this does not help in my situation. I do not simply 
"page" up and down a fixed numer of rows.

The problem is that *Windows* tells me which indexes need to be shown.

This means: Windows just tells me: "From your 1000 elements, I want to 
have number 599". To get it a little bit better, Windows *additionally* 
tells me which elements I should cache (e.g. if possible, cache number 
450 to 500).

However, I solved this, even if not so performant, according to your link.

The actual problem is the find-function: From my 1000-elements 
resultset, I need to pass to Windows that index (between 1 and 1000!) 
that fits best to the search criterion.

And there I have exactly no clue how to do this.

If I could add a kind of auto index from 1...1000 to my resultset, I 
could try something like this:

SELECT AUTO_ID(), keyword, '%search'
FROM keywords
WHERE keyword = '%search'

but the problem is the missing AUTO_ID() functionality.

Is there any hope to successfully implement the search function?

Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Getting the "position" (like LIMIT) for a query

2009-01-10 Thread Lukas Haase
Hello,

I use an SQLite database to fill a virtual list control in Windows. In 
this control, I just tell the control the numer of my elements and the 
control tells me for which range data is needed.

For example I set:

SELECT COUNT(*) FROM keywords;

to tell the control the numer of elements. Now, I get "caching messages" 
which tell me what range will be needed next, for example iFrom=500 and 
iTo=520.

In this case I do just a:

"SELECT keyword FROM keywords ORDER BY keyword LIMIT %d, %d", iFrom, 
iTo-iFrom.

The result is stored into a std::map as a cache, and when requested, the 
appropriate element is displayed.

So far, so good.

But I have the problem that I want to be able to search in the list. In 
this case, Windows sends a message with "te" (if the string "te" should 
be found) and I need to hand back the *position* of the found data.

I have no idea how to do this. In fact, I need to know the position (the 
same position that I would give the LIMIT statement) for a specific 
result set.

I hope you can understand me. Is there a way to implement this?

Thank you very much!
Luke

PS: The data may also be composed of a more complex statement (with JOINs).

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] 600ms for simple query: How to optimize it?

2009-01-10 Thread Lukas Haase
Kees Nuyt schrieb:
> On Fri, 09 Jan 2009 21:16:03 +0100, Lukas Haase
> <lukasha...@gmx.at> wrote in General Discussion of SQLite
> Database <sqlite-users@sqlite.org>:
> 
>> Hello Richard!
>>
>> Thank you very much!! It works! :-)
>>
>>
>> Indeed. 0-10 milliseconds instead of 500-800 :-)
>>
>> But may you tell me why this works and where you have this information? 
>> I know the O-notation but I do not know /why/ this boosts down to log(n)...
> 
> Use EXPLAIN SELECT .
>   to see the virtual machine instructions
> and EXPLAIN SELECT QUERY PLAN .
>   to see which index is used.
> 
> http://www.sqlite.org/lang_explain.html
> 
> Each JOIN is implemented as nested loops. The virtual
> machine code can tell a lot about what part of the database
> has to be scanned.

Thank you, I know this and I did try it already.

But unfortunately I do not know how to interpret the results.

Best regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] 600ms for simple query: How to optimize it?

2009-01-10 Thread Lukas Haase
D. Richard Hipp schrieb:
> On Jan 9, 2009, at 3:16 PM, Lukas Haase wrote:
>> SELECT t.topic, t.length
>> FROM printgroup AS pg1
>> LEFT JOIN printgroup AS pg2 ON pg1.printgroup = pg2.printgroup
>> LEFT JOIN topics AS t ON t.topicID = pg2.topicID
>> LEFT JOIN topic_ids AS ti ON ti.topicID = pg1.topicID
>> WHERE ti.topic_textID = ''
>> ORDER BY pg2.topicID ASC;
> 
> You seem very fond of using LEFT JOINs in places where they do not  
> make good sense.

Yes, I started with mySQL 3 many years ago. At the beginning I only knew 
about LEFT JOINs and used them. Now I think I also know the other types 
of JOINs but I still use LEFT JOINs very often, just by habit. And with 
mySQL I never had performance problems with them.

> What is it that you think a LEFT JOIN does?

(A LEFT JOIN B) joins together table A and B while all records are taken 
  from A and only records that match both are takes from B. If a record 
from A has no corresponding data in B, the values are NULL.

> How is  
> a LEFT JOIN different than an ordinary inner JOIN?

INNER JOIN takes *all* records from both tables, A and B. Generally, the 
resultset will be larger.

> I ask because I  
> suspect that your answer will reveal misconceptions about LEFT JOINs  
> which, when rectified, will cause most of your performance issues to  
> go away.

Maybe my I think too much in "left joining" but I did not know that 
there is so much difference in performance.

Best Regards,
Luke


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] 600ms for simple query: How to optimize it?

2009-01-09 Thread Lukas Haase
Hello Richard!

Thank you very much!! It works! :-)

D. Richard Hipp schrieb:
> On Jan 7, 2009, at 6:11 PM, Lukas Haase wrote:
> 
>> Hello,
>>
>> Can somebody tell me why this (simple) query take so much time? This
>> query does nothing more than querying a table and JOINing two other
>> tables together.
>>
>> SELECT
>>  ti1.topicID AS topicID,
>>  ti2.topic_textID AS parent,
>>  n.level,
>>  n.level_order
>> FROM navigation AS n
>> LEFT JOIN topic_ids AS ti1 ON ti1.topicID = n.topicID
>> LEFT JOIN topic_ids AS ti2 ON ti2.topicID = n.parent_topicID
>> WHERE ti1.topic_textID = 'X';
> 
> SQLite should be running this query in O(NlogN).
> 
> If you change the first LEFT JOIN to a plain old JOIN (which should  
> give equivalent results by virtue of the WHERE clause restricting  
> ti1.topic_textID to not be NULL) then it should run in O(logN) - much  
> faster.  Try it and let me know.

Indeed. 0-10 milliseconds instead of 500-800 :-)

But may you tell me why this works and where you have this information? 
I know the O-notation but I do not know /why/ this boosts down to log(n)...

I have other queries which worry me. But that trick did not help in 
these cases :-(

Especially I have problems with a self-join. In a table I have defined 
groups of elements ("printgroup"):

CREATE TABLE printgroup(
topicID INTEGER,
printgroup INTEGER,
PRIMARY KEY(topicID, printgroup)
);

I think these indices are not necessary because both fields are primary 
keys anyway.
CREATE INDEX topicID ON printgroup(topicID);
CREATE INDEX pprintgroup ON printgroup(printgroup);

When I know one element of a group (given by topicID) I want to find all 
other elements in the same group:

SELECT t.topic, t.length
FROM printgroup AS pg1
LEFT JOIN printgroup AS pg2 ON pg1.printgroup = pg2.printgroup
LEFT JOIN topics AS t ON t.topicID = pg2.topicID
LEFT JOIN topic_ids AS ti ON ti.topicID = pg1.topicID
WHERE ti.topic_textID = ''
ORDER BY pg2.topicID ASC;

The table "topics" just contains the actual data for each topicID 
(t.topic with length t.length).

This query takes a few seconds (und to minutes) with "sqlite3.exe" and 
even much longer in my application (sqlite with CppSQlite3): Up to 15 
minutes!

Mimicking your magic above I tried to leave out the "LEFT" in the 
self-joins but it did not change anything :-(

And unfortunately, the optimization FAQ [1] is very incomplete, at least 
at the interesting points (indices) :-(


Thank you again and best regards,
Luke


[1] http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] 600ms for simple query: How to optimize it?

2009-01-07 Thread Lukas Haase
Hello,

Can somebody tell me why this (simple) query take so much time? This 
query does nothing more than querying a table and JOINing two other 
tables together.

SELECT
ti1.topicID AS topicID,
ti2.topic_textID AS parent,
n.level,
n.level_order
FROM navigation AS n
LEFT JOIN topic_ids AS ti1 ON ti1.topicID = n.topicID
LEFT JOIN topic_ids AS ti2 ON ti2.topicID = n.parent_topicID
WHERE ti1.topic_textID = 'X';

I thought I optimized the table good with indexes but one such a query 
takes 500 to 1000ms in my C++ program.

Here are my table definitions and the indexes (unfortunately I need the 
VARCHAR(20) field because I get the "topicID" only as text:

CREATE TABLE topic_ids(
topicID INTEGER,
topic_textID VARCHAR(20),
PRIMARY KEY(topicID)
);
CREATE INDEX topic_textID ON topic_ids(topic_textID);

CREATE TABLE navigation(
topicID INTEGER PRIMARY KEY,
parent_topicID INTEGER,
level VARCHAR(20),
level_order INTEGER
);
CREATE INDEX parent_topicID ON navigation(parent_topicID);
CREATE INDEX level ON navigation(level);
CREATE INDEX level_order ON navigation(level_order);

I need to execute this query in a database application each time a new 
page is opened. So 500ms are really too much. A few ms would be great.

And the tables itself are not really huge:

SELECT COUNT(*) FROM navigation;
19469
SELECT COUNT(*) FROM topic_ids;
19469

Does anybody have an idea what's going wrong here? How can I speed up 
this query?

Thank you very much in advance,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users