I'll try with the one you suggested, thanks for the clarifications!

/Henrik

On Tue, Nov 3, 2009 at 8:38 AM, Alexander Burger <a...@software-lab.de> wrote:
> Hi Henrik,
>
>> I took a look at the pilog file, I already get what same and range are
>> doing but what are part, head and fold doing?
>
> You are on the right track. You used 'tolr', but this actually makes
> sense only in combination with the '+Sn' (Soundex) prefix. The whole
> matter is rather complicated, because there are so many combinations of
> index types and Pilog comparison functions possible.
>
>
> I would say that we have the following typical use cases for string
> searches (I'll leave out numerical searches, which usually combine with
> 'same' or 'range').
>
> 1. "Exact" searches. You have either a unique index
>
>      (rel key (+Key +String))
>
>   or a non-unique index
>
>      (rel key (+Ref +String))
>
>   and you can compare results in Pilog with
>
>      (same @Str @Cls key)
>
>   for exact matches, or with
>
>      (head @Str @Cls key)
>
>   for "dictionary" searches (searching only for the beginning of
>   strings). These are case-sensitive searches.
>
>
> 2. "Folded" searches. They make use of the 'fold' function which keeps
>   only letters, converted to lower case, and digits.
>
>      (rel key (+Fold +Ref +String))
>      ...
>      (fold @Str @Cls key)
>
>   This searches only for the beginning of strings. We use it typically
>   for telephone numbers.
>
>
>   If a search for individual words in a key is desired, we can use
>
>      (rel key (+List +Fold +Ref +String))
>      ...
>      (fold @Str @Cls key)
>
>   This stores only the strings in the list (not the substrings) in
>   'fold'ed representation. So each word can be found by "dictionary"
>   search. This requires changes to the GUI and import functions,
>   though, as 'key' is not a string but a list of strings.
>
>
>   Finally, we can also index folded substrings:
>
>      (rel key (+Fold +Idx +String))
>      ...
>      (part @Str @Cls key)
>
>   This is perhaps what you need. If you go for it, I'd recommend
>   downloading the latest testing release once more, as the 'part'
>   function was changed recently.
>
>
> 3. "Tolerant" searches. They first return all exact (case-sensitive)
>   matches of partial strings, and then the matches according to the
>   soundex algorithm: the first letter is compared exactly
>   (case-sensitive), and the rest is checked for similarity. This mainly
>   makes sense for personal names.
>
>      (rel key (+Sn +Idx +String))
>      ...
>      (tolr @Str @Cls key)
>
>
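As a small illustration of the 'fold' normalization described in case 2 above, the following PicoLisp expressions show what a folded key looks like. Only the built-in 'fold' function is used; the phone number is a made-up sample value.

```picolisp
# 'fold' keeps only letters (converted to lower case) and digits
(fold " 1A 2-b/3")           # -> "1a2b3"
(fold "+49 (821) 123-45")    # -> "4982112345" (made-up sample number)
```

This is why a folded index suits telephone numbers: spacing and punctuation in the input no longer matter for the comparison.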
> Concerning space consumption, the '+Key' and '+Ref' indexes are the most
> economical ones. They create only a single entry in the index tree per
> key.
>
> Then follow the '+List +Ref +String' indexes, which create an entry per
> word.
>
> Most space-hungry are the '+Idx' indexes, as they create an entry for
> each substring down to a length of three, and '+Sn' adds one more for
> the soundex key.
>
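To get a feel for why '+Idx' is the most space-hungry, here is a rough sketch that lists the tails of a string down to a length of three. 'tails3' is a hypothetical helper, not part of PicoLisp, and it is only an assumption that the index entries resemble these tails; the real '+Idx' code in the library may split and store keys differently.

```picolisp
# Hypothetical helper: collect all tails of Str that have
# at least three characters (assumed shape of '+Idx' entries)
(de tails3 (Str)
   (let L (chop Str)
      (make
         (while (>= (length L) 3)
            (link (pack L))
            (pop 'L) ) ) ) )

(tails3 "Burger")  # -> ("Burger" "urger" "rger" "ger")
```

Under this assumption a six-letter name would produce four entries, compared to a single one for '+Key' or '+Ref'.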
> Cheers,
> - Alex
> --
> UNSUBSCRIBE: mailto:picol...@software-lab.de?subject=unsubscribe
>