[sqlite] Re : [sqlite] Soft search in database

2007-03-07 Thread Pierre Aubert
Hello John,
A PageRank-like algorithm does not make sense with only a bunch of text files.
Its power comes from its ability to take into account the matrix of links
between documents on the web.
In this case, a classic TF-IDF algorithm (http://en.wikipedia.org/wiki/Tf-idf)
should be sufficient.
Pierre
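
As an illustration of the TF-IDF suggestion above, here is a minimal sketch in Python. It is not taken from any library mentioned in this thread; the function names (tokenize, tfidf_rank), the sample documents and the particular weighting (term frequency normalised by document length, idf = log(N / df)) are assumptions chosen for brevity. Other TF-IDF variants exist and may suit real data better.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_rank(docs, query):
    """Return (doc_id, score) pairs sorted by descending TF-IDF score."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, tf in counts.items():
        total = sum(tf.values()) or 1  # avoid dividing by zero on empty text
        score = 0.0
        for term in tokenize(query):
            # Document frequency: how many documents contain the term.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            idf = math.log(n_docs / df)  # rare terms weigh more
            score += (tf[term] / total) * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data, for illustration only.
docs = {
    "a": "MP3 player review: this MP3 player is small",
    "b": "CD player review",
    "c": "Magazine index for March",
}
ranking = tfidf_rank(docs, "mp3 player")
print([doc_id for doc_id, _ in ranking])
```

With this weighting, a term that occurs in every document gets an idf of zero and so contributes nothing to the score.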

- Original message
From: John Stanton <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 17:22:08
Subject: Re: [sqlite] Soft search in database

Look up "page rank algorithm", in particular the papers by Brin and 
Page, the Google founders.

Henrik Ræder wrote:
>   Hi
> 
>   (First post - hope it's an appropriate place)
> 
>   I've been implementing a database of a few MB of text (indexing
> magazines) in SQLite, and so far have found it to work really well.
> 
>   Now my boss, who has a wonderfully creative mind, asks me to implement a
> full-text search function which is not the usual simplistic 'found' /
> 'not found', but more Google-style where a graded list of results is 
> returned.
> 
>   For example, in a search for "MP3 Player", results with the phrases next
> to each other would get a high rating, as would records with a high
> occurrence of the keywords.
> 
>   This falls outside the usual scope of SQL, but would still seem a
> relatively common problem to tackle.
> 
>   Any ideas (pointers) how to tackle this?
> 
>   Best regards
> 
> Henrik Ræder Clausen
> CD-rom editor
> Komputer for alle
> 
> Jidoka Development   Hougårdsvej 29   8220 Brabrand   Denmark
> Tlf +45 2611 5842
> 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-

[sqlite] Re : [sqlite] Re : [sqlite] Soft search in database

2007-03-07 Thread Pierre Aubert
Hello Jos,
Not as is. You would need to modify the library slightly. Half a day of work, I guess.
Pierre

- Original message
From: Jos van den Oever <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 16:33:15
Subject: Re: [sqlite] Re : [sqlite] Soft search in database

2007/3/6, Pierre Aubert <[EMAIL PROTECTED]>:
> You can also use ft3.sourceforge.net

Does this also allow having an inverted index without actually storing
the files in the database?

Cheers,
Jos


Re: AW: [sqlite] Soft search in database

2007-03-06 Thread John Stanton
I built something like that: each word was translated into a token, and a
key built from the token and the position of the word was used to build a
tree.  The tree access was fast and could probably be adapted to produce
strict ranking by position.  The complexity of the method lies in the need
for a dictionary to convert each word to its token.
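
The approach described above can be sketched roughly as follows. This is not John's implementation; it is an assumption-laden illustration that uses an SQLite table whose primary key (token, doc_id, pos) stands in for the tree, since SQLite keeps primary-key indexes in a B-tree. The table names (dictionary, posting), the doc_id column and the helper functions are all hypothetical.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dictionary used to convert each word to an integer token.
    CREATE TABLE dictionary (token INTEGER PRIMARY KEY, word TEXT UNIQUE);
    -- One key per word occurrence; the primary key index is a B-tree.
    CREATE TABLE posting (
        token  INTEGER NOT NULL,
        doc_id INTEGER NOT NULL,
        pos    INTEGER NOT NULL,
        PRIMARY KEY (token, doc_id, pos)
    );
""")


def token_for(word):
    """Return the integer token for a word, adding the word if it is new."""
    conn.execute("INSERT OR IGNORE INTO dictionary (word) VALUES (?)", (word,))
    row = conn.execute(
        "SELECT token FROM dictionary WHERE word = ?", (word,)
    ).fetchone()
    return row[0]


def index_document(doc_id, text):
    """Store one (token, doc_id, position) key per word in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    keys = [(token_for(w), doc_id, pos) for pos, w in enumerate(words)]
    conn.executemany("INSERT INTO posting VALUES (?, ?, ?)", keys)


def adjacent_hits(first, second):
    """Return (doc_id, pos) pairs where `second` directly follows `first`."""
    return conn.execute(
        """SELECT a.doc_id, a.pos
           FROM posting a JOIN posting b
             ON b.doc_id = a.doc_id AND b.pos = a.pos + 1
           WHERE a.token = ? AND b.token = ?
           ORDER BY a.doc_id, a.pos""",
        (token_for(first), token_for(second)),
    ).fetchall()


# Hypothetical sample data, for illustration only.
index_document(1, "The new MP3 player plays MP3 files")
index_document(2, "A player for MP3 and CD")
hits = adjacent_hits("mp3", "player")
print(hits)
```

Because positions are part of the key, finding words that sit next to each other becomes a join on pos + 1, which is one way ranking by position could be built on top of such an index.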


Martin Pfeifle wrote:

Unfortunately, the fts module of sqlite does not support "fuzzy text search = google 
search".
What you first need is a similarity measure between strings, e.g. the 
Edit-distance.
Based on such a similarity measure, you could build up an appropriate index 
structure,
e.g.  a Relational M-tree (cf. 
deposit.ddb.de/cgi-bin/dokserv?idn=972667849&dok_var=d1&dok_ext=pdf&filename=972667849.pdf
 Chapter 10.3)
Such a module should not only support range queries, e.g. give me all strings 
which have a distance smaller than eps to my query string, but also ranked 
nearest neighbor queries.
 
We also urgently need such a module, and think about implementing it on our own. I would appreciate if efforts could be synchronized. 
Best Martin



- Original message
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database


Henrik Ræder wrote:


 Hi

 (First post - hope it's an appropriate place)

 I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.

 Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is 
returned.


 For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.

 This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.

 Any ideas (pointers) how to tackle this?


You have come to the right place.

Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael







Re: [sqlite] Soft search in database

2007-03-06 Thread John Stanton
Look up "page rank algorithm", in particular the papers by Brin and 
Page, the Google founders.


Henrik Ræder wrote:

  Hi

  (First post - hope it's an appropriate place)

  I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.

  Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is 
returned.


  For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.

  This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.

  Any ideas (pointers) how to tackle this?

  Best regards

Henrik Ræder Clausen
CD-rom editor
Komputer for alle

Jidoka Development   Hougårdsvej 29   8220 Brabrand   Denmark
Tlf +45 2611 5842







Re: [sqlite] Re : [sqlite] Soft search in database

2007-03-06 Thread Jos van den Oever

2007/3/6, Pierre Aubert <[EMAIL PROTECTED]>:

You can also use ft3.sourceforge.net


Does this also allow having an inverted index without actually storing
the files in the database?

Cheers,
Jos




[sqlite] Re : [sqlite] Soft search in database

2007-03-06 Thread Pierre Aubert
You can also use ft3.sourceforge.net
Pierre

- Original message
From: Henrik Ræder <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:22:33
Subject: [sqlite] Soft search in database

   Hi

   (First post - hope it's an appropriate place)

   I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.

   Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is returned.

   For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.

   This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.

   Any ideas (pointers) how to tackle this?

   Best regards

Henrik Ræder Clausen
CD-rom editor
Komputer for alle

Jidoka Development   Hougårdsvej 29   8220 Brabrand   Denmark
Tlf +45 2611 5842












Re: [sqlite] Soft search in database

2007-03-06 Thread Henrik Ræder

  Hi Martin

2007/3/6, Martin Pfeifle <[EMAIL PROTECTED]>:


Unfortunately, the fts module of sqlite does not support "fuzzy text
search = google search".



  Yes, I realize this. I'll have to add some additional logic to
achieve this. Looks doable.

Such a module should not only support range queries, e.g. give me all

strings which have a distance smaller than eps to my query string, but also
ranked nearest neighbor queries.

We also urgently need such a module, and think about implementing it on
our own. I would appreciate if efforts could be synchronized.



  While it would be cool to work on solving the general case, I'm
afraid the time pressure here doesn't really permit me doing anything
major in that direction. We have a controlled environment for our
searches and will be able to solve the job sufficiently well by
simpler means.

  Good luck!

  -Henrik


Re: [sqlite] Soft search in database

2007-03-06 Thread Henrik Ræder

  Hi

2007/3/6, Michael Schlenker <[EMAIL PROTECTED]>:


>   Now my boss, who has a wonderfully creative mind, asks me to implement a
> full-text search function which is not the usual simplistic 'found' /
> 'not found', but more Google-style where a graded list of results is
> returned.
You have come to the right place.

Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex



  Thanks a bunch. These are good building blocks for what I want to do.

  -H


AW: [sqlite] Soft search in database

2007-03-06 Thread Martin Pfeifle
Unfortunately, the fts module of sqlite does not support "fuzzy text search = 
google search".
What you first need is a similarity measure between strings, e.g. the 
Edit-distance.
Based on such a similarity measure, you could build up an appropriate index 
structure,
e.g.  a Relational M-tree (cf. 
deposit.ddb.de/cgi-bin/dokserv?idn=972667849&dok_var=d1&dok_ext=pdf&filename=972667849.pdf
 Chapter 10.3)
Such a module should not only support range queries, e.g. give me all strings 
which have a distance smaller than eps to my query string, but also ranked 
nearest neighbor queries.
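
To make the two query types above concrete, here is a minimal sketch in Python. It uses the Levenshtein edit distance as the similarity measure and a plain linear scan over the strings; it does not implement an M-tree, whose purpose would be to answer the same queries without comparing against every string. The function names and the sample word list are assumptions made for illustration.

```python
import heapq


def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character edits."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def range_query(strings, query, eps):
    """All strings whose distance to the query is smaller than eps."""
    return [s for s in strings if edit_distance(s, query) < eps]


def nearest_neighbors(strings, query, k):
    """The k strings closest to the query, nearest first."""
    return heapq.nsmallest(k, strings, key=lambda s: edit_distance(s, query))


# Hypothetical sample data, for illustration only.
words = ["player", "played", "prayer", "plate", "sqlite"]
close = range_query(words, "plaier", 2)
ranked = nearest_neighbors(words, "plaier", 2)
print(close, ranked)
```

The linear scan costs one distance computation per stored string, which is acceptable for a few thousand short strings but is exactly what a metric index is meant to avoid on larger collections.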
 
We also urgently need such a module, and think about implementing it on our 
own. I would appreciate if efforts could be synchronized. 
Best Martin


- Original message
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database


Henrik Ræder wrote:
>   Hi
> 
>   (First post - hope it's an appropriate place)
> 
>   I've been implementing a database of a few MB of text (indexing
> magazines) in SQLite, and so far have found it to work really well.
> 
>   Now my boss, who has a wonderfully creative mind, asks me to implement a
> full-text search function which is not the usual simplistic 'found' /
> 'not found', but more Google-style where a graded list of results is 
> returned.
> 
>   For example, in a search for "MP3 Player", results with the phrases next
> to each other would get a high rating, as would records with a high
> occurrence of the keywords.
> 
>   This falls outside the usual scope of SQL, but would still seem a
> relatively common problem to tackle.
> 
>   Any ideas (pointers) how to tackle this?
You have come to the right place.

Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael

-- 
Michael Schlenker
Software Engineer

CONTACT Software GmbH   Tel.:   +49 (421) 20153-80
Wiener Straße 1-3   Fax:+49 (421) 20153-41
28359 Bremen
http://www.contact.de/  E-Mail: [EMAIL PROTECTED]

Sitz der Gesellschaft: Bremen | Geschäftsführer: Karl Heinz Zachries
Eingetragen im Handelsregister des Amtsgerichts Bremen unter HRB 13215


Re: [sqlite] Soft search in database

2007-03-06 Thread Michael Schlenker

Henrik Ræder wrote:

  Hi

  (First post - hope it's an appropriate place)

  I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.

  Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is 
returned.


  For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.

  This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.

  Any ideas (pointers) how to tackle this?

You have come to the right place.

Take a closer look at:
http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael

--
Michael Schlenker
Software Engineer

CONTACT Software GmbH   Tel.:   +49 (421) 20153-80
Wiener Straße 1-3   Fax:+49 (421) 20153-41
28359 Bremen
http://www.contact.de/  E-Mail: [EMAIL PROTECTED]

Sitz der Gesellschaft: Bremen | Geschäftsführer: Karl Heinz Zachries
Eingetragen im Handelsregister des Amtsgerichts Bremen unter HRB 13215




[sqlite] Soft search in database

2007-03-06 Thread Henrik Ræder

  Hi

  (First post - hope it's an appropriate place)

  I've been implementing a database of a few MB of text (indexing
magazines) in SQLite, and so far have found it to work really well.

  Now my boss, who has a wonderfully creative mind, asks me to implement a
full-text search function which is not the usual simplistic 'found' /
'not found', but more Google-style where a graded list of results is returned.

  For example, in a search for "MP3 Player", results with the phrases next
to each other would get a high rating, as would records with a high
occurrence of the keywords.
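
One simple way to get a graded list of this kind, sketched here under stated assumptions, is to register a scoring function with SQLite from the host language and sort on it. The example uses Python's sqlite3 module; the table name (article), its columns, the function name (score) and the PHRASE_BONUS weight are all hypothetical, and the scoring rule (one point per keyword occurrence plus a bonus per exact phrase match) is only one of many possible choices.

```python
import re
import sqlite3

PHRASE_BONUS = 5  # arbitrary weight for keywords found next to each other


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", (text or "").lower())


def score(text, query):
    """Count keyword occurrences, plus a bonus for each exact phrase match."""
    words = tokenize(text)
    terms = tokenize(query)
    if not words or not terms:
        return 0
    frequency = sum(words.count(t) for t in set(terms))
    n = len(terms)
    phrases = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == terms
    )
    return frequency + PHRASE_BONUS * phrases


conn = sqlite3.connect(":memory:")
conn.create_function("score", 2, score)  # callable from SQL as score(body, ?)
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO article (id, body) VALUES (?, ?)",
    [
        (1, "The MP3 player market: every MP3 player reviewed"),
        (2, "A CD player that can also play MP3 files"),
        (3, "Subscription prices for 2007"),
    ],
)

results = conn.execute(
    """SELECT id, rating
       FROM (SELECT id, score(body, ?) AS rating FROM article)
       WHERE rating > 0
       ORDER BY rating DESC, id""",
    ("MP3 Player",),
).fetchall()
print(results)
```

This scans every row on each search, which should be tolerable for a few MB of text; a full-text index would be the next step if it becomes too slow.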

  This falls outside the usual scope of SQL, but would still seem a
relatively common problem to tackle.

  Any ideas (pointers) how to tackle this?

  Best regards

Henrik Ræder Clausen
CD-rom editor
Komputer for alle

Jidoka Development   Hougårdsvej 29   8220 Brabrand   Denmark
Tlf +45 2611 5842