Re: [Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-10 Thread Dieter Maurer

Morten W. Petersen writes:
  It seems I misunderstood the term fuzzy logic myself.  Fuzzy logic means
  if I search for a word, for example 'programmer', it will return matches
  to the words 'program', 'programming','programmable' etc.
This is usually called "stemming",
though your examples indicate quite an aggressive form of it.

If you have some tool, maybe LinguistX, that maps from a word
to its stem and then from the stem to all words with this as
stem (or directly gives the stem equivalence class of a word),
then it is quite easy to incorporate that in Zope's catalog.

However, to do that cleanly, you will need good algorithms
and/or large dictionaries. These are usually not free of
charge.
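
A minimal sketch of the word -> stem -> equivalence class mapping
described above. The suffix list and the names crude_stem,
build_stem_classes and expand_query are hypothetical; the stemmer is
deliberately crude and is no substitute for a real stemming tool. It
only shows how a query word could be expanded into its stem class
before being handed to the catalog.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny suffix list (an assumption for illustration).
SUFFIXES = ("able", "ing", "er", "s")


def crude_stem(word):
    """Strip one known suffix; not a linguistically sound stemmer."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undo consonant doubling: "programm" -> "program".
            if len(word) >= 2 and word[-1] == word[-2]:
                word = word[:-1]
            break
    return word


def build_stem_classes(vocabulary):
    """Map each stem to the set of vocabulary words sharing it."""
    classes = defaultdict(set)
    for word in vocabulary:
        classes[crude_stem(word)].add(word.lower())
    return classes


def expand_query(word, classes):
    """Return every known word in the same stem class as `word`."""
    return sorted(classes.get(crude_stem(word), {word.lower()}))


vocabulary = ["program", "programming", "programmable", "programmer", "zope"]
classes = build_stem_classes(vocabulary)
print(expand_query("programmer", classes))
```

The expanded word list could then be used as an OR query against a
text index; how well that works depends entirely on the quality of
the stemmer.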



Dieter

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )




Re: [Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-10 Thread Casey Duncan

--- "Morten W. Petersen" [EMAIL PROTECTED] wrote:
[snip]
 
 It seems I misunderstood the term fuzzy logic myself.  Fuzzy logic means
 if I search for a word, for example 'programmer', it will return matches
 to the words 'program', 'programming','programmable' etc.
 
 I.e., it will somewhat intelligently return words that are similar in
 what they mean, using grammar rules (chopping off endings of words and
 making them match others).
 
 Hmm.
 
 Cheers,
 
 Morten
 

ZCatalog TextIndexes support this type of "wildcard"
searching. I posted a message a couple of weeks ago
that describes the query syntax. Search the mailing
list archives for it.


=
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`-





Re: [Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-10 Thread Ken Manheimer

On Wed, 10 Jan 2001, Morten W. Petersen wrote:

  I do not think that "fuzzy logic" is strongly related to "regexp-like".
  Anyway.
  
  Fuzzy searching often means "finding matches with characters omitted,
  replaced or inserted".
 
 It seems I misunderstood the term fuzzy logic myself.  Fuzzy logic means
 if I search for a word, for example 'programmer', it will return matches
 to the words 'program', 'programming','programmable' etc.

I think you're talking about something else.  Last i checked, "fuzzy logic"
was a logical algebra based on the existence of intermediate truth states,
between "true" and "false".  It has little or nothing to do with
approximate searching, though i guess you could use it to make assertions
about the approximations.  I think what you all are talking about is "fuzzy
matching".

 I.e., it will somewhat intelligently return words that are similar in
 what they mean, using grammar rules (chopping off endings of words and
 making them match others).

There are also matching mechanisms like Soundex, which account for
misspelling by translating words to phonetically equivalent normalized
codes, and comparing on that basis.
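
To make that concrete, here is a small sketch of the classic American
Soundex coding, written from the commonly published rules rather than
taken from any Zope code. The function name soundex and the CODES
table are just names chosen for this example.

```python
# Letter groups and their Soundex digits; vowels, 'y', 'h' and 'w' have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ('' if it has no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
            if len(result) == 4:
                break
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            previous = code
    return (result + "000")[:4]


# A misspelling maps to the same code as the intended word.
print(soundex("programmer"), soundex("programer"))
```

Words would be indexed under their codes, and a query word would be
coded the same way before lookup.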

Ken
[EMAIL PROTECTED]






Re: [Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-09 Thread Steve Alexander

Morten W. Petersen wrote:

 Is there anyone who could try to give an estimate of how long it would
 take to add fuzzy logic (regexp-like) searching capability to the
 ZCatalog?
 
 And reasoning as to why would be appreciated. ;)


Right now, you could use an External Method to apply a regex match to 
each unique value in a field index in a Catalog, and return the 
appropriate Catalog Brains for each match.

This is as easy as calling uniqueValues() on the catalog, iterating
through the unique values to filter them, and then searching the catalog
with the results of the filter as the constraint for that fieldindex.
This would take somewhere between minutes and hours to implement and
test, and would execute in O(number of unique field values) time. That
should remain acceptably fast even where you have a catalog with many
items, provided most of them have fields drawn from the same (small) set.
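
A rough sketch of that filter-then-search approach. It assumes the
catalog object offers uniqueValuesFor(name) and searchResults(**query),
and that a field index accepts a list of values as an OR constraint;
check those against your Zope version. FakeCatalog is a hypothetical
stand-in so the function can be tried outside Zope, and regex_search
is a name made up for this example.

```python
import re


def regex_search(catalog, index_name, pattern):
    """Return catalog results whose `index_name` value matches `pattern`."""
    regex = re.compile(pattern)
    # Cost is proportional to the number of unique values, not of cataloged items.
    matching = [value for value in catalog.uniqueValuesFor(index_name)
                if isinstance(value, str) and regex.search(value)]
    if not matching:
        return []
    # Pass the surviving values as the constraint for that field index.
    return catalog.searchResults(**{index_name: matching})


class FakeCatalog:
    """Hypothetical stand-in mimicking the two catalog calls used above."""

    def __init__(self, records):
        self._records = records

    def uniqueValuesFor(self, name):
        return sorted({record[name] for record in self._records})

    def searchResults(self, **query):
        return [record for record in self._records
                if all(record[key] in values for key, values in query.items())]


catalog = FakeCatalog([
    {"id": 1, "language": "python"},
    {"id": 2, "language": "perl"},
    {"id": 3, "language": "java"},
])
print([record["id"] for record in regex_search(catalog, "language", r"^p")])
```

In Zope itself the function body would live in an External Method and
receive the real catalog instead of the stand-in.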

If you want to search a TextIndex using a regex, or you want to search 
for a pattern among a number of fields of the same item, then you're 
into an algorithm that would execute in O(number of cataloged items) 
time. That could get very slow for any sizable catalog.

The other option for searching a TextIndex is to use extensions to the 
NEAR and AND and OR operators that are currently supported. I guess it 
all depends on what you mean by "fuzzy matching".

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net






Re: [Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-09 Thread Dieter Maurer

Morten W. Petersen writes:
  Is there anyone who could try to give an estimate of how long it would
  take to add fuzzy logic (regexp-like) searching capability to the
  ZCatalog?
I do not think that "fuzzy logic" is strongly related to "regexp-like".
Anyway.

Fuzzy searching often means "finding matches with characters omitted,
replaced or inserted".
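
One common way to measure "characters omitted, replaced or inserted"
is the Levenshtein edit distance. The sketch below is a plain
dynamic-programming version; edit_distance, fuzzy_matches and the
max_edits parameter are names assumed for illustration, and nothing
here is taken from the catalog code.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and replacements turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # char_a omitted
                               current[j - 1] + 1,       # char_b inserted
                               previous[j - 1] + cost))  # replaced or kept
        previous = current
    return previous[-1]


def fuzzy_matches(query, words, max_edits=1):
    """Return the words within `max_edits` edits of `query`, in input order."""
    return [word for word in words if edit_distance(query, word) <= max_edits]


print(fuzzy_matches("zope", ["zope", "zopa", "hope", "zoo", "python"]))
```

Comparing the query against every word is linear in the vocabulary
size, so it only suits small word lists unless candidates are
narrowed down first.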

Zope's globbing vocabularies support wildcards '*' and '?'.
To implement wildcard based searches efficiently, they
index words under their two letter constituents.
When you now get a pattern, you derive from the pattern
what two letter constituents the matching words must
have and retrieve the words indexed under them. This defines
a candidate word set. Then you check whether the retrieved
words really match the expression.

You can extend this algorithm to get fuzzy searches.
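
The two-letter indexing scheme can be sketched roughly as follows.
This is an illustration of the idea only, not Zope's actual globbing
vocabulary code; DigramLexicon, digrams, candidates and search are
names invented for the example, and the final check borrows Python's
fnmatch for the '*' and '?' semantics.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def digrams(text):
    """Return the set of two-letter constituents of `text`."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


class DigramLexicon:
    """Hypothetical word list indexed under two-letter constituents."""

    def __init__(self, words):
        self.words = set(words)
        self.index = defaultdict(set)
        for word in self.words:
            for digram in digrams(word):
                self.index[digram].add(word)

    def candidates(self, pattern):
        """Words containing every digram the pattern's literal parts require."""
        required = set()
        for chunk in re.split(r"[*?]+", pattern):
            required |= digrams(chunk)
        if not required:
            return set(self.words)  # pattern too short to narrow anything
        return set.intersection(*(self.index.get(d, set()) for d in required))

    def search(self, pattern):
        """Verify each candidate against the pattern; only '*' and '?' are expected."""
        return sorted(word for word in self.candidates(pattern)
                      if fnmatchcase(word, pattern))


lexicon = DigramLexicon(["program", "programmer", "programming", "progress", "zope"])
print(lexicon.search("prog*ing"))
```

A fuzzy variant could relax the candidate step, for example by
accepting words that contain most rather than all of the required
digrams, and then rank or filter them with an edit-distance check.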



Dieter





[Zope-dev] ZCatalog and 'fuzzy logic'

2001-01-09 Thread Morten W. Petersen

Is there anyone who could try to give an estimate of how long it would
take to add fuzzy logic (regexp-like) searching capability to the
ZCatalog?

And reasoning as to why would be appreciated. ;)

-Morten

