[sqlite] Re : [sqlite] Soft search in database
Hello John,

A PageRank-like algorithm does not make sense with only a bunch of text files. Its power comes from its ability to take into account the matrix of links between documents on the web. In this case, a classic TF-IDF algorithm (http://en.wikipedia.org/wiki/Tf-idf) should be sufficient.

Pierre

----- Original message -----
From: John Stanton <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 17:22:08
Subject: Re: [sqlite] Soft search in database

Look up "page rank algorithm", in particular the papers by Brin and Page, the Google founders.

Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.
>
> For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?
>
> Best regards
>
> Henrik Ræder Clausen
> CD-ROM editor
> Komputer for alle
>
> Jidoka Development
> Hougårdsvej 29
> 8220 Brabrand
> Denmark
> Tel. +45 2611 5842
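As a rough illustration of the TF-IDF ranking suggested above, here is a minimal Python sketch. The whitespace tokenizer, the function name `tfidf_rank`, and the sample documents are assumptions made for illustration only; none of them come from the thread or from any SQLite API.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption); real text would
    # also need punctuation handling and possibly stemming.
    return text.lower().split()


def tfidf_rank(docs, query):
    """Return (doc_id, score) pairs sorted by descending TF-IDF score."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for words in tokenized.values():
        df.update(set(words))

    scores = {}
    for doc_id, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in tokenize(query):
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(words)     # term frequency
            idf = math.log(n_docs / df[term])  # inverse document frequency
            score += tf * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data.
docs = {
    "a": "mp3 player review mp3 player test",
    "b": "dvd player review",
    "c": "digital camera test",
}
ranking = tfidf_rank(docs, "mp3 player")
```

With this sample data, document "a" ranks first because it contains both terms and "mp3" is rare across the collection; document "c" matches nothing and scores zero. A term that occurs in every document gets an IDF of zero under this plain formula, so smoothed variants are often used in practice.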
[sqlite] Re : [sqlite] Re : [sqlite] Soft search in database
Hello Jos,

Not as is. You need to modify the library slightly. Half a day of work, I guess.

Pierre

----- Original message -----
From: Jos van den Oever <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 16:33:15
Subject: Re: [sqlite] Re : [sqlite] Soft search in database

2007/3/6, Pierre Aubert <[EMAIL PROTECTED]>:
> You can also use ft3.sourceforge.net

Does this also allow having an inverted index without actually storing the files in the database?

Cheers,
Jos
Re: AW: [sqlite] Soft search in database
I built something like that, where each word was translated into a token, and a key built from the token and the position of the word was used to build a tree. The tree access was fast and could probably be adapted to produce strict ranking by position. The complexity of the method lies in the need for a dictionary to convert words to tokens.

Martin Pfeifle wrote:

Unfortunately, the fts module of SQLite does not support "fuzzy text search = Google search". What you first need is a similarity measure between strings, e.g. the edit distance. Based on such a similarity measure, you could build an appropriate index structure, e.g. a Relational M-tree (cf. deposit.ddb.de/cgi-bin/dokserv?idn=972667849&dok_var=d1&dok_ext=pdf&filename=972667849.pdf, Chapter 10.3).

Such a module should support not only range queries (e.g. give me all strings whose distance to my query string is smaller than eps), but also ranked nearest-neighbor queries. We also urgently need such a module and are thinking about implementing it on our own. I would appreciate it if efforts could be synchronized.

Best
Martin

----- Original message -----
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database

Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.
>
> For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?

You have come to the right place. Take a closer look at: http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael
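The token-and-position key scheme described at the top of this message could look roughly like the sketch below. This is only one possible reading of that description, not the poster's actual code: the class name `PositionalIndex`, the inclusion of a document id in the key, and the use of a sorted Python list with `bisect` as a stand-in for the tree are all assumptions.

```python
import bisect


class PositionalIndex:
    """Word -> token dictionary plus sorted (token, doc_id, position) keys."""

    def __init__(self):
        self.dictionary = {}  # word -> integer token
        self.keys = []        # sorted list standing in for the tree

    def token_for(self, word, create=False):
        word = word.lower()
        if create and word not in self.dictionary:
            # Assign the next free integer as the token for a new word.
            self.dictionary[word] = len(self.dictionary)
        return self.dictionary.get(word)

    def add_document(self, doc_id, text):
        for position, word in enumerate(text.split()):
            key = (self.token_for(word, create=True), doc_id, position)
            bisect.insort(self.keys, key)  # keep keys in sorted order

    def positions(self, word):
        """Return {doc_id: [positions]} for one word via a range scan."""
        token = self.token_for(word)
        if token is None:
            return {}
        # (token,) sorts before every (token, doc_id, position) key.
        start = bisect.bisect_left(self.keys, (token,))
        result = {}
        for tok, doc_id, position in self.keys[start:]:
            if tok != token:
                break  # left the contiguous run of keys for this token
            result.setdefault(doc_id, []).append(position)
        return result


# Hypothetical sample data.
index = PositionalIndex()
index.add_document(1, "cheap MP3 player with radio")
index.add_document(2, "player for MP3 and video")
mp3 = index.positions("mp3")        # {1: [1], 2: [2]}
player = index.positions("player")  # {1: [2], 2: [0]}
```

Because the positions are part of the key, a caller can compare them across query words: in document 1, "mp3" at position 1 is immediately followed by "player" at position 2, which is the kind of adjacency a position-based ranking would reward.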
Re: [sqlite] Soft search in database
Look up "page rank algorithm", in particular the papers by Brin and Page, the Google founders.

Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.
>
> For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?
>
> Best regards
>
> Henrik Ræder Clausen
> CD-ROM editor
> Komputer for alle
>
> Jidoka Development
> Hougårdsvej 29
> 8220 Brabrand
> Denmark
> Tel. +45 2611 5842
Re: [sqlite] Re : [sqlite] Soft search in database
2007/3/6, Pierre Aubert <[EMAIL PROTECTED]>:
> You can also use ft3.sourceforge.net

Does this also allow having an inverted index without actually storing the files in the database?

Cheers,
Jos
[sqlite] Re : [sqlite] Soft search in database
You can also use ft3.sourceforge.net

Pierre

----- Original message -----
From: Henrik Ræder <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:22:33
Subject: [sqlite] Soft search in database

Hi

(First post - hope it's an appropriate place)

I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.

Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.

For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.

This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.

Any ideas (pointers) how to tackle this?

Best regards

Henrik Ræder Clausen
CD-ROM editor
Komputer for alle

Jidoka Development
Hougårdsvej 29
8220 Brabrand
Denmark
Tel. +45 2611 5842
Re: [sqlite] Soft search in database
Hi Martin

2007/3/6, Martin Pfeifle <[EMAIL PROTECTED]>:
> Unfortunately, the fts module of sqlite does not support "fuzzy text search = google search".

Yes, I realize this. I'll have to add some additional logic to achieve this. Looks doable.

> Such a module should not only support range queries, e.g. give me all strings which have a distance smaller than eps to my query string, but also ranked nearest neighbor queries. We also urgently need such a module, and think about implementing it on our own. I would appreciate if efforts could be synchronized.

While it would be cool to work on solving the general case, I'm afraid the time pressure here doesn't really permit me to do anything major in that direction. We have a controlled environment for our searches and will be able to solve the job sufficiently well by simpler means.

Good luck!

-Henrik
Re: [sqlite] Soft search in database
Hi

2007/3/6, Michael Schlenker <[EMAIL PROTECTED]>:
> > Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style where a graded list of results is returned.
>
> You have come to the right place. Take a closer look at: http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Thanks a bunch. These are good building blocks for what I want to do.

-H
AW: [sqlite] Soft search in database
Unfortunately, the fts module of SQLite does not support "fuzzy text search = Google search". What you first need is a similarity measure between strings, e.g. the edit distance. Based on such a similarity measure, you could build an appropriate index structure, e.g. a Relational M-tree (cf. deposit.ddb.de/cgi-bin/dokserv?idn=972667849&dok_var=d1&dok_ext=pdf&filename=972667849.pdf, Chapter 10.3).

Such a module should support not only range queries (e.g. give me all strings whose distance to my query string is smaller than eps), but also ranked nearest-neighbor queries. We also urgently need such a module and are thinking about implementing it on our own. I would appreciate it if efforts could be synchronized.

Best
Martin

----- Original message -----
From: Michael Schlenker <[EMAIL PROTECTED]>
To: sqlite-users@sqlite.org
Sent: Tuesday, 6 March 2007, 09:46:52
Subject: Re: [sqlite] Soft search in database

Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.
>
> For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?

You have come to the right place. Take a closer look at: http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael

--
Michael Schlenker
Software Engineer
CONTACT Software GmbH
Wiener Straße 1-3
28359 Bremen
Tel.: +49 (421) 20153-80
Fax: +49 (421) 20153-41
http://www.contact.de/
E-Mail: [EMAIL PROTECTED]
Registered office: Bremen | Managing director: Karl Heinz Zachries
Registered in the commercial register of the Bremen District Court under HRB 13215
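To make the two query types mentioned at the top of this message concrete, here is a minimal Python sketch of edit distance with a range query and a ranked nearest-neighbor query. It scans the whole word list linearly, so it does not implement the M-tree index itself; the function names, the tie-breaking rule, and the sample vocabulary are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


def range_query(words, query, eps):
    """All words whose distance to the query is smaller than eps."""
    return [w for w in words if edit_distance(w, query) < eps]


def nearest_neighbors(words, query, k):
    """The k closest words, ranked by distance, ties broken alphabetically."""
    ranked = sorted(words, key=lambda w: (edit_distance(w, query), w))
    return ranked[:k]


# Hypothetical sample data: a small vocabulary and a misspelled query.
vocabulary = ["player", "played", "prayer", "plate", "mp3"]
close = range_query(vocabulary, "plaier", 2)
nearest = nearest_neighbors(vocabulary, "plaier", 2)
```

A linear scan like this is fine for a small vocabulary. A metric index such as the M-tree mentioned above exists to avoid computing the distance to every string once the collection grows.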
Re: [sqlite] Soft search in database
Henrik Ræder wrote:
> Hi
>
> (First post - hope it's an appropriate place)
>
> I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.
>
> Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.
>
> For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.
>
> This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.
>
> Any ideas (pointers) how to tackle this?

You have come to the right place. Take a closer look at: http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Michael

--
Michael Schlenker
Software Engineer
CONTACT Software GmbH
Wiener Straße 1-3
28359 Bremen
Tel.: +49 (421) 20153-80
Fax: +49 (421) 20153-41
http://www.contact.de/
E-Mail: [EMAIL PROTECTED]
Registered office: Bremen | Managing director: Karl Heinz Zachries
Registered in the commercial register of the Bremen District Court under HRB 13215
[sqlite] Soft search in database
Hi

(First post - hope it's an appropriate place)

I've been implementing a database of a few MB of text (indexing magazines) in SQLite, and so far have found it to work really well.

Now my boss, who has a wonderfully creative mind, asks me to implement a full-text search function which is not the usual simplistic 'found' / 'not found', but more Google-style, where a graded list of results is returned.

For example, in a search for "MP3 Player", results with the phrases next to each other would get a high rating, as would records with a high occurrence of the keywords.

This falls outside the usual scope of SQL, but would still seem a relatively common problem to tackle.

Any ideas (pointers) how to tackle this?

Best regards

Henrik Ræder Clausen
CD-ROM editor
Komputer for alle

Jidoka Development
Hougårdsvej 29
8220 Brabrand
Denmark
Tel. +45 2611 5842
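The graded ranking asked for above (a high rating when the query words are adjacent, plus credit for frequent keywords) can be illustrated with a very small Python sketch. The weighting (`phrase_bonus`), the function names, and the sample records are assumptions chosen for illustration; the thread does not say which scoring was eventually used.

```python
def score(text, query, phrase_bonus=5.0):
    """Grade one record: keyword frequency plus a bonus per exact phrase hit."""
    words = text.lower().split()
    terms = query.lower().split()
    if not terms:
        return 0.0
    # Frequency part: one point per occurrence of any query term.
    frequency = sum(words.count(term) for term in terms)
    # Proximity part: count places where all query terms appear side by side.
    n = len(terms)
    phrases = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == terms
    )
    return frequency + phrase_bonus * phrases


def search(records, query):
    """Return (record_id, score) for matching records, best first."""
    scored = [(score(text, query), rec_id) for rec_id, text in records.items()]
    return [(rec_id, s) for s, rec_id in sorted(scored, reverse=True) if s > 0]


# Hypothetical sample records.
records = {
    1: "new mp3 player with a built-in radio",
    2: "the player handles mp3 and ogg, and the player is cheap",
    3: "digital camera roundup",
}
results = search(records, "MP3 Player")
```

Here record 1 outranks record 2 even though record 2 contains more keyword occurrences, because only record 1 has the two words next to each other. Adjusting `phrase_bonus` changes how strongly adjacency is favored over raw frequency.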