Re: [Zope-dev] ZCatalog and 'fuzzy logic'
Morten W. Petersen writes:
> It seems I misunderstood the term fuzzy logic myself. Fuzzy logic means if I search for a word, for example 'programmer', it will return matches to the words 'program', 'programming', 'programmable' etc.

This is usually called "stemming", though your examples indicate quite a strong form of it.

If you have some tool, maybe LinguistX, that maps from a word to its stem and then from the stem to all words with that stem (or directly gives the stem equivalence class of a word), then it is quite easy to incorporate that into Zope's catalog.

However, to do that cleanly, you will need good algorithms and/or large dictionaries. These are usually not free of charge.

Dieter

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )
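To make the "stem equivalence class" idea concrete, here is a minimal sketch in Python. It is not LinguistX and not part of Zope: the suffix list, `crude_stem`, `build_stem_classes` and `expand_query` are all hypothetical names invented for illustration, and the suffix stripping is deliberately crude. A real stemmer needs the better algorithms or dictionaries mentioned above.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny suffix list (an assumption, not a real stemmer).
SUFFIXES = ("able", "ing", "er", "s")


def crude_stem(word):
    """Strip one known suffix and undo consonant doubling."""
    word = word.lower()
    for suffix in SUFFIXES:
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and len(stem) >= 3:
            # "programm" -> "program": drop a doubled final letter.
            if stem[-1] == stem[-2]:
                stem = stem[:-1]
            return stem
    return word


def build_stem_classes(words):
    """Map each stem to the set of indexed words that share it."""
    classes = defaultdict(set)
    for word in words:
        classes[crude_stem(word)].add(word)
    return dict(classes)


def expand_query(word, classes):
    """Return every indexed word in the same stem class as `word`."""
    return sorted(classes.get(crude_stem(word), {word}))


vocabulary = ["program", "programmer", "programming", "programmable", "catalog"]
stem_classes = build_stem_classes(vocabulary)
print(expand_query("programmer", stem_classes))
```

The expanded word list could then be used as an OR query against the text index. The crude rules misfire on many ordinary words, which is exactly why good algorithms or large dictionaries are needed.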
Re: [Zope-dev] ZCatalog and 'fuzzy logic'
--- "Morten W. Petersen" [EMAIL PROTECTED] wrote:
> [snip]
> It seems I misunderstood the term fuzzy logic myself. Fuzzy logic means if I search for a word, for example 'programmer', it will return matches to the words 'program', 'programming', 'programmable' etc.
>
> I.e., it will somewhat intelligently return words that are similar in what they mean, using grammar rules (chopping off endings of words and making them match others). Hmm.
>
> Cheers,
> Morten

ZCatalog TextIndexes support this type of "wildcard" searching. I posted a message a couple of weeks ago that describes the query syntax. Search the mailing list archives for it.

| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
Re: [Zope-dev] ZCatalog and 'fuzzy logic'
On Wed, 10 Jan 2001, Morten W. Petersen wrote:
> > I do not think that "fuzzy logic" is strongly related to "regexp-like". Anyway. Fuzzy searching often means "finding matches with characters omitted, replaced or inserted".
>
> It seems I misunderstood the term fuzzy logic myself. Fuzzy logic means if I search for a word, for example 'programmer', it will return matches to the words 'program', 'programming', 'programmable' etc.

I think you're talking about something else. Last I checked, "fuzzy logic" was a logical algebra based on the existence of intermediate truth states between "true" and "false". It has little or nothing to do with approximate searching, though I guess you could use it to make assertions about the approximations. I think what you all are talking about is "fuzzy matching".

> I.e., it will somewhat intelligently return words that are similar in what they mean, using grammar rules (chopping off endings of words and making them match others).

There are also matching mechanisms like soundex, which account for misspellings by translating words to phonetically equivalent normalized codes and comparing on that basis.

Ken
[EMAIL PROTECTED]
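For reference, a minimal sketch of the classic American Soundex coding in Python. Nothing here is Zope code, and the function name `soundex` is just an illustrative choice. It keeps the first letter, maps the remaining consonants to digits, collapses adjacent letters with the same code (treating 'h' and 'w' as transparent) and pads the result to four characters.

```python
# Letter groups of the classic Soundex scheme.
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {ch: digit for digit, letters in GROUPS.items() for ch in letters}


def soundex(word):
    """Return the four-character Soundex code of `word` ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # 'h' and 'w' do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and 'y' get no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


# A misspelling maps to the same code as the intended word.
print(soundex("programmer"), soundex("programer"))
```

Indexing each word under its code, in addition to the word itself, would let a misspelled query find the correctly spelled entries, at the price of some false matches.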
Re: [Zope-dev] ZCatalog and 'fuzzy logic'
Morten W. Petersen wrote:
> Is there anyone who could try to give an estimate of how long it would take to add fuzzy logic (regexp-like) searching capability to the ZCatalog? And reasoning as to why would be appreciated. ;)

Right now, you could use an External Method to apply a regex match to each unique value in a field index in a Catalog, and return the appropriate Catalog Brains for each match.

This is as easy as calling uniqueValues() on the catalog, iterating through the unique values to filter them, and then searching the catalog with the results of the filter as the constraint for that fieldindex.

This would take minutes to hours to implement and test, and would execute in O(number of unique field values) time. In many cases that should remain acceptably fast, namely where you have a catalog with many items, most of which have fields drawn from the same (small) set.

If you want to search a TextIndex using a regex, or you want to search for a pattern among a number of fields of the same item, then you're into an algorithm that would execute in O(number of cataloged items) time. That could get very slow for any sizable catalog.

The other option for searching a TextIndex is to use extensions to the NEAR, AND and OR operators that are currently supported.

I guess it all depends on what you mean by "fuzzy matching".

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net
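The approach above (filter the unique values with a regex, then search on the survivors) can be sketched without Zope as follows. `ToyFieldIndex` and `regex_search` are hypothetical stand-ins written for this illustration; they are not the ZCatalog API, and record ids stand in for Catalog Brains.

```python
import re
from collections import defaultdict


class ToyFieldIndex:
    """Hypothetical stand-in for a field index: value -> set of record ids."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def index(self, record_id, value):
        self._by_value[value].add(record_id)

    def unique_values(self):
        return list(self._by_value)

    def search(self, values):
        """Return the ids of all records whose field equals one of `values`."""
        hits = set()
        for value in values:
            hits |= self._by_value.get(value, set())
        return hits


def regex_search(index, pattern):
    """Regex-filter the unique values, then look up the records for the matches."""
    rx = re.compile(pattern)
    # This loop is the O(number of unique field values) part.
    matching = [value for value in index.unique_values() if rx.search(value)]
    return index.search(matching)


idx = ToyFieldIndex()
for record_id, job in enumerate(["programmer", "program", "designer", "programmer"], 1):
    idx.index(record_id, job)

print(sorted(regex_search(idx, r"^program")))
```

The regex runs once per distinct value rather than once per cataloged item, which is why the cost depends on how many distinct values the field has.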
Re: [Zope-dev] ZCatalog and 'fuzzy logic'
Morten W. Petersen writes:
> Is there anyone who could try to give an estimate of how long it would take to add fuzzy logic (regexp-like) searching capability to the ZCatalog?

I do not think that "fuzzy logic" is strongly related to "regexp-like". Anyway. Fuzzy searching often means "finding matches with characters omitted, replaced or inserted".

Zope's globbing vocabularies support the wildcards '*' and '?'. To implement wildcard-based searches efficiently, they index words under their two-letter constituents. When you get a pattern, you derive from it which two-letter constituents the matching words must have and retrieve the words indexed under them. This defines a candidate word set. Then you check whether the retrieved words really match the expression.

You can extend this algorithm to get fuzzy searches.

Dieter
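A simplified sketch of the two-letter-constituent scheme described above, in Python. This is not Zope's actual vocabulary code: `ToyGlobLexicon` and its methods are hypothetical names, and the sketch assumes patterns use only '*' and '?' (the final check uses `fnmatch.fnmatchcase`, which would also interpret '[').

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def bigrams(text):
    """All two-letter constituents of `text`."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


class ToyGlobLexicon:
    """Hypothetical lexicon that indexes each word under its bigrams."""

    def __init__(self, words):
        self.words = set(words)
        self.by_bigram = defaultdict(set)
        for word in self.words:
            for bg in bigrams(word):
                self.by_bigram[bg].add(word)

    def candidates(self, pattern):
        """Words containing every bigram the pattern's literal runs require."""
        required = set()
        for run in re.split(r"[*?]", pattern):
            required |= bigrams(run)
        if not required:
            return set(self.words)  # nothing to prune on, so check every word
        return set.intersection(*(self.by_bigram.get(bg, set()) for bg in required))

    def glob(self, pattern):
        """Verify each candidate against the full pattern."""
        return sorted(w for w in self.candidates(pattern) if fnmatchcase(w, pattern))


lexicon = ToyGlobLexicon(["program", "programmer", "programming", "progress", "catalog"])
print(lexicon.glob("program*"))
```

The candidate set can contain false positives: for the pattern 'gram*' the three 'program' words all contain the required bigrams but none of them starts with 'gram', so the verification step is what rejects them.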
[Zope-dev] ZCatalog and 'fuzzy logic'
Is there anyone who could try to give an estimate of how long it would take to add fuzzy logic (regexp-like) searching capability to the ZCatalog? And reasoning as to why would be appreciated. ;)

-Morten