Interesting outline of how the Australian Dictionary is being built.

Kelvin is always looking for extra help :-)

-----Forwarded Message-----

> From: Kelvin <[EMAIL PROTECTED]>
> To: Ken Foskey <[EMAIL PROTECTED]>
> Subject: Re: Re: [SLUG] OpenOffice Australian Dictionary -  work inprogress]
> Date: 18 Mar 2003 09:00:21 +1100
> 
> Hi Ken,
> 
> I'm not quite certain exactly what you meant in your statement as it is a
> bit concise.
> 
> However, I do think that what I am doing is actually what you are thinking.
> 
> I have actually taken the GB dictionary and had Kevin Hendricks unmunch it
> to give me the complete word list.
> 
> Then I've used Microsoft spell check to check all the words.  Some 12,000
> words, or around 10%, were shown as incorrect, and these were removed.
> (Quite time consuming as I could not work out a way to automate this
> activity.)
> 
> From this I created a Valid list and an Invalid list of words.  The Valid
> list was used to create the current alpha Australian dictionary.
> 
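The Valid/Invalid split described above could be sketched roughly as follows. This is only an illustration: it assumes the words accepted by the reference spell checker are already available as a plain collection (the email says that step was done by hand), and the function name `split_by_reference` and the sample words are hypothetical.

```python
def split_by_reference(words, reference):
    """Partition words into (valid, invalid) by membership in a reference set.

    `reference` stands in for "words the reference spell checker accepts";
    how that collection is obtained is outside this sketch.
    """
    reference = set(reference)
    valid, invalid = [], []
    for word in words:
        # Keep the original order within each list.
        (valid if word in reference else invalid).append(word)
    return valid, invalid


# Hypothetical sample data, not taken from the real dictionary.
unmunched = ["colour", "color", "organise", "teh"]
accepted = {"colour", "organise"}
valid, invalid = split_by_reference(unmunched, accepted)
```

Keeping the Invalid list rather than discarding it matters here, since it is revisited later in the process.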
> The stage we are now at is getting people to contribute as that will spread
> the work load.  I suspect that not many people will actually do anything,
> which is why I've set a one-week time schedule.  (No point waiting around.)
> 
> In the meantime Jean Hollis Weber has provided a list of Australian place
> names which I will incorporate into the beta.  I have removed all duplicates
> from the place names list that are already in the alpha dictionary (keeping
> in mind that case sensitivity is important in place names).
> 
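A minimal sketch of the place-name step, assuming both lists are plain sequences of strings. The comparison is deliberately case-sensitive, so a place name is only dropped when the dictionary already holds exactly the same spelling. The function name `new_place_names` and the sample data are hypothetical.

```python
def new_place_names(place_names, dictionary_words):
    """Return place names not already in the dictionary, without repeats.

    Matching is case-sensitive: "Orange" (the town) is kept even if
    "orange" (the fruit) is already a dictionary word.
    """
    existing = set(dictionary_words)
    seen = set()
    result = []
    for name in place_names:
        name = name.strip()
        if name and name not in existing and name not in seen:
            seen.add(name)
            result.append(name)
    return result


# Hypothetical sample data.
alpha = ["orange", "Perth", "young"]
places = ["Orange", "Perth", "Young", "Dubbo", "Orange"]
fresh = new_place_names(places, alpha)
print(fresh)  # ['Orange', 'Young', 'Dubbo']
```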
> I have gone back through the Invalid Word list and let the Microsoft spell
> check suggest variations. This has produced around another 3000 unique words
> from the 12000 invalid words.
> 
> This week as I read online articles in the media I am copying and pasting
> articles into Writer to perform spell checks against the alpha dictionary.
> Any words marked as incorrect (excluding people's names and words I am not
> certain about) I am then collecting in a text file.
> 
> At the end of the week I will be combining everyone's found words, removing
> duplicates, testing against the alpha release of the dictionary and also
> testing against the Microsoft spell checker.
> 
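The end-of-week merge could look something like the sketch below. It assumes each contributor's words arrive as a list of strings and covers only the combining, de-duplicating, and checking against the alpha word list; the check against the Microsoft spell checker would remain a separate step. The function name `merge_contributions` and the sample words are hypothetical.

```python
def merge_contributions(contributions, alpha_words):
    """Combine contributed word lists into sorted, unique candidate words.

    Words already present in the alpha dictionary are skipped, so the
    result contains only candidates that still need reviewing.
    """
    alpha = set(alpha_words)
    merged = set()
    for words in contributions:
        for word in words:
            word = word.strip()
            if word and word not in alpha:
                merged.add(word)
    return sorted(merged)


# Hypothetical sample data: two contributors, one overlapping word.
contributions = [["ute", "arvo", "colour"], ["arvo", "bushwalk "]]
candidates = merge_contributions(contributions, ["colour"])
print(candidates)  # ['arvo', 'bushwalk', 'ute']
```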
> One test I would like to perform is to test the word list against another
> dictionary which is not the Microsoft one, as there is no guarantee that
> their dictionary is 100% correct either.  This, however, is probably a bit
> of overkill.
> 
> False hits are still possible because, as I have said, I don't believe that
> the Microsoft dictionary is 100% correct.  My experience is there are not
> that many things that are perfect in this world ;-)
> 
> I would like to think that once we have the Australian dictionary live, it
> can evolve over time. That is, incorrect words removed and new ones added.
> It is better to have a good starting point than to have no starting point at
> all.
> 
> The good thing about a dictionary is that people can always add words to
> their own custom dictionary if they are not happy.  The bad thing about a
> dictionary is if it has words which are not correctly spelt.
> 
> If you haven't already given the Australian alpha dictionary a go, then you
> may wish to do so.  You will be surprised at how comprehensive it already is
> as a result of being based on the GB dictionary.
> 
> Kelvin
> 
> 
> ----- Original Message -----
> From: "Ken Foskey" <[EMAIL PROTECTED]>
> To: "Kelvin" <[EMAIL PROTECTED]>
> Sent: Tuesday, March 18, 2003 8:15 AM
> Subject: Re: Re: [SLUG] OpenOffice Australian Dictionary - work inprogress]
> 

-- 
Thanks
KenF
OpenOffice.org developer

-- 
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug
