[htdig] Spelling Help

2001-01-18 Thread David Adams

I am trying to do what I can to help people with spelling difficulties search
our web pages.
This was triggered by seeing in the htsearch log that searches for
"accomodation" were finding some pages, but not the important ones (where it
is spelt correctly)!

Also, this University has a commitment to supporting disabled students,
including those with dyslexia.

I would like to ask:

1) What have other sites done to address this problem?  (Spell checking and
   correcting our own pages is not possible at present, and may never be.)

2) Can anybody recommend a _good_ (UK English) spell checker for IRIX 6.5?
   (The IRIX spell command does not know a lot of important words, such as
   "midwifery", and I can't figure out how to add to the dictionary.)
   A spell checker that could suggest words (as do the spell checkers in
   word processors, etc.) would be wonderful.

--
David Adams
Computing Services
Southampton University




To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  http://www.htdig.org/mail/menu.html
FAQ:            http://www.htdig.org/FAQ.html




Re: [htdig] Spelling Help

2001-01-18 Thread Geoff Hutchison

At 1:34 PM + 1/18/01, David Adams wrote:
> 1) What have other sites done to address this problem?  (Spell checking
> and correcting our own

Use good fuzzy methods, including the synonym file. We are working on 
additional fuzzy matching code, but of course if anyone can come up 
with sample code that produces a list of suggestion words from an 
input, we can probably port it.
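
As one possible starting point for such sample code, here is a minimal Python
sketch that turns an input word into a list of suggestions. It uses the
standard library's difflib similarity matching, which is only one of several
plausible techniques and is not ht://Dig's own fuzzy code. The word list and
the names KNOWN_WORDS and suggest are hypothetical; a real version would draw
its words from the indexed pages.

```python
import difflib

# Hypothetical word list; in practice this would come from the search index.
KNOWN_WORDS = ["accommodation", "accomplish", "midwifery", "university", "dyslexia"]


def suggest(word, known_words=KNOWN_WORDS, limit=3, cutoff=0.8):
    """Return up to `limit` known words that closely resemble `word`."""
    lowered = word.lower()
    if lowered in known_words:
        return []  # already matches an indexed spelling; nothing to suggest
    # get_close_matches ranks candidates by similarity ratio, best first,
    # and drops anything scoring below `cutoff`.
    return difflib.get_close_matches(lowered, known_words, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    print(suggest("accomodation"))
```

The cutoff of 0.8 is an arbitrary choice for illustration; a lower value
returns more (and looser) suggestions.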

> 2) Can anybody recommend a _good_ (UK English) spell checker for IRIX
> 6.5?

Yes. Try ispell with the UK dictionaries.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






Re: [htdig] Spelling Help

2001-01-18 Thread Gilles Detillieux

According to Geoff Hutchison:
> At 1:34 PM + 1/18/01, David Adams wrote:
> > 1) What have other sites done to address this problem?  (Spell checking
> > and correcting our own
>
> Use good fuzzy methods, including the synonym file. We are working on
> additional fuzzy matching code, but of course if anyone can come up
> with sample code that produces a list of suggestion words from an
> input, we can probably port it.
>
> > 2) Can anybody recommend a _good_ (UK English) spell checker for IRIX
> > 6.5?
>
> Yes. Try ispell with the UK dictionaries.

Back in October, Greg Holmes posted a Python wrapper script for htsearch,
which used ispell to suggest alternative spellings.  The thread that
ensued is at http://www.htdig.org/mail/2000/10/index.html#295
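
For readers who want a feel for the approach, the following is a rough Python
sketch of asking ispell for alternative spellings. It is not Greg's script,
just an illustration under some assumptions: that ispell is installed and on
the PATH, and that a dictionary named "british" is available (the dictionary
name and both function names are hypothetical). It relies on ispell's -a
mode, where lines beginning with "&" or "?" list suggestions after a colon.

```python
import subprocess


def parse_ispell_line(line):
    """Extract suggestions from one line of `ispell -a` output.

    Lines starting with '&' or '?' look like
    '& word count offset: guess1, guess2'. Other lines (the banner,
    '*', '+', '-', '#') carry no suggestions.
    """
    if line[:1] not in ("&", "?") or ":" not in line:
        return []
    _, _, guesses = line.partition(":")
    return [g.strip() for g in guesses.split(",") if g.strip()]


def ispell_suggestions(word, dictionary="british"):
    """Run ispell on a single word (assumes ispell is on the PATH)."""
    result = subprocess.run(
        ["ispell", "-a", "-d", dictionary],
        input=word + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    suggestions = []
    for line in result.stdout.splitlines():
        suggestions.extend(parse_ispell_line(line))
    return suggestions
```

A wrapper around htsearch could call ispell_suggestions() on each query word
and offer the results as "did you mean" links.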

The ispell package is GNU software, so it should port to IRIX easily
enough, I'd think, and the dictionaries are very customisable.

-- 
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930






Re: [htdig] Spelling Help

2001-01-18 Thread Dave Salisbury


Okay, here's one for the gurus.

I'd like to be able to preserve user state, which is held in the query string.
So my idea is to return just the URLs from a search that match the state
of the user.  Basically, we have a ?lang=en or ?lang=fr, and since many
of our pages are not translated yet, it's the same page regardless
of the language they ask for.  So any search will return 2 pages
(the same page, but with URLs that differ in the query string),
one for English and one for French.  I would like the search to return
only one or the other, even though both should be indexed.

Something like a bad_querystr attribute would help perhaps,
but that is only for indexing, not for searching, and I would also need
to set this unknown attribute dynamically.

Filtering the search's output with my own parsing could be gross.
There must be a better way.
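
To make the "own parsing" fallback concrete, this is roughly what I mean, as
a small Python sketch. It assumes the result URLs have already been pulled
out of the search output into a list, and the function names and example.com
URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def matches_language(url, lang):
    """Return True if the URL's query string carries exactly lang=<lang>."""
    query = parse_qs(urlsplit(url).query)
    return query.get("lang", []) == [lang]


def filter_results(urls, lang):
    """Keep only the result URLs that match the user's language."""
    return [url for url in urls if matches_language(url, lang)]


if __name__ == "__main__":
    results = [
        "http://example.com/page.html?lang=en",
        "http://example.com/page.html?lang=fr",
    ]
    print(filter_results(results, "fr"))
```

It works, but it means every search returns twice as many matches as it
shows, which throws off the match counts and paging.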

any ideas?

dave


