[lingucomponent-issues] [Issue 92383] submit new en_US.dic without the errors

2011-02-16 Thread aardvark12
To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=92383





--- Additional comments from aardvar...@openoffice.org Wed Feb 16 17:22:20 + 2011 ---
Created an attachment (id=75850)
autocorrect hyphenates, compound words, grammar errors; word list


-
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

-
To unsubscribe, e-mail: issues-unsubscr...@lingucomponent.openoffice.org
For additional commands, e-mail: issues-h...@lingucomponent.openoffice.org


-
To unsubscribe, e-mail: allbugs-unsubscr...@openoffice.org
For additional commands, e-mail: allbugs-h...@openoffice.org



[lingucomponent-issues] [Issue 92383] submit new en_US.dic without the errors

2011-02-16 Thread aardvark12

--- Additional comments from aardvar...@openoffice.org Wed Feb 16 17:34:02 + 2011 ---
PROBLEM WORDS, submit autocorrect suggestions in plain text

A number of problem words were created by Microsoft's spelling mistakes. Because
of Microsoft's total domination in the United States, these errors have been
compounded a billion times in the past dozen years. Some words are entering the
language. The surprising thing is that more words haven't been subverted, and
that most intelligent people still write point-blank instead of the ugly
Microsoft pointblank. Microsoft handled things by just removing essential
hyphens. After all, no sense paying for programmers, or for anyone who knew even
a smattering of the English language. These problem words have to be checked
every year. The Oxford English Dictionary (OED) monitors ten thousand
transitional words on a daily basis.

The OED now accepts the following nouns as one word, and they may be included in
en_US.dic (they are missing from my version):

airbag
airbase
lifebuoy
waterhole - one word in the OED; two words in all other dictionaries

Note the following key words:

all right - the only correct usage; alright is not acceptable
point-blank - Microsoft's word, pointblank, is not acceptable

The preferred choice for cafe is now without an accented e.

The OED has razor blade. Everyone else uses razorblade.

The American Heritage Dictionary has mockup, mahjong, and housepainter.
OED and others use mock-up. Almost everyone uses mah-jongg and house
painter. (May want to put mah-jongg in the autocorrect list.)

The OED has hot plate and hot pot, but Collins English Dictionary has these
as one word.

Here are some words I really hate to look up every year. Real dictionaries list
them as two words. This isn't a complete list, but these words are noted in the
attached autocorrect suggestions file.

bean sprouts
black light
coal mine
con man, con men
drift net
drop kick
fire truck
floor show
fly swatter [one dictionary lists flyswatter as a second choice]
gun battle
hair dryer
ice pack
land mine
love child
milk shake
nose cone
school day
six-shooter [hyphenation is correct, as here]
staff room [in British usage, a teachers' lounge, written as one word]
tea bag
tea leaves
trash can
water mill

Hunspell does not handle hyphenated words, but these can be substituted using
the autocorrect feature of OpenOffice.org. Some accented words aren't found by
Hunspell either; they are in the list too, along with recommendations for
various word substitutions. The fifteen-page plain text file is attached.
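
As a rough illustration of how substitution pairs of this kind could be
generated, the sketch below takes a few of the two-word and hyphenated entries
from this comment and derives the run-together spelling a writer might type.
The function name closed_form_pairs and the idea of building the pairs
mechanically are assumptions made for illustration only; the attached file's
actual format is not reproduced here.

```python
def closed_form_pairs(entries):
    """Map each run-together spelling to its preferred spaced or hyphenated form."""
    pairs = {}
    for entry in entries:
        # Drop spaces and hyphens to get the spelling a writer might type.
        closed = entry.replace(" ", "").replace("-", "")
        if closed != entry:
            pairs[closed] = entry
    return pairs


preferred = ["fire truck", "tea bag", "six-shooter", "point-blank"]
for wrong, right in closed_form_pairs(preferred).items():
    print(f"{wrong} -> {right}")
```

A generated list like this would still need checking by hand, since some
run-together forms are accepted words in their own right.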




[lingucomponent-issues] [Issue 92383] submit new en_US.dic without the errors

2011-02-11 Thread aardvark12

--- Additional comments from aardvar...@openoffice.org Fri Feb 11 17:14:36 + 2011 ---
Created an attachment (id=75820)
February 11, 2011 update of en_US.dic;  146,540 words





[lingucomponent-issues] [Issue 92383] submit new en_US.dic without the errors

2011-02-11 Thread aardvark12

--- Additional comments from aardvar...@openoffice.org Fri Feb 11 17:18:53 + 2011 ---
The word list was pruned of specialized or obscure words, particularly if those
might interfere with finding more common words. As an example, 'chough' and
'scoter' are birds, but most people will be interested in typing 'cough' or
'scooter.' Sometimes choices aren't clear. 'Whicker' is a horse's whinny, but
perhaps there is a conflict with 'wicker.' 'Whicker' was removed. Often a
dictionary will list plurals for words ending in 'o' as either -os or -oes, or
words ending in 'a' as -as or -ae. If a dictionary separates the choices with
'or' then both plurals have equal weight, but a spellchecker may help a writer's
consistency by only listing the first choice.
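
The kind of conflict described above could be spotted mechanically. The sketch
below is only an illustration, not the method used for this word list: it
flags an obscure word when a common word differs from it by one inserted or
deleted letter. The function names and the two small sample lists are
hypothetical.

```python
def one_deletion_apart(longer, shorter):
    """Return True if deleting exactly one letter from `longer` gives `shorter`."""
    if len(longer) != len(shorter) + 1:
        return False
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def near_collisions(obscure_words, common_words):
    """Pair each obscure word with common words one inserted or deleted letter away."""
    hits = []
    for rare in obscure_words:
        for common in common_words:
            if one_deletion_apart(rare, common) or one_deletion_apart(common, rare):
                hits.append((rare, common))
    return hits


print(near_collisions(["chough", "scoter", "whicker"], ["cough", "scooter", "wicker"]))
```

Whether a flagged word should actually be removed remains a judgment call, as
the 'whicker' case shows.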

It has not escaped my attention that removing words helps to make room for later
additions, as a number of new words and proper nouns need to be added to keep
the word list current. (May 6, 2009 version, 150,240 words. Current version
146,540 words.) The words 'shalt' and 'spake' are now in the list, but have been
marked with an exclamation point for NO SUGGEST.
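
For readers unfamiliar with the marking, here is a minimal sketch of how such
entries could be picked out of a .dic-style word list. It assumes the affix
file declares the flag with a line such as "NOSUGGEST !" and that flags are
single characters following a slash; the sample entries and their other flags
are made up for illustration, and morphological fields are ignored.

```python
def no_suggest_words(dic_text, flag="!"):
    """Return words in .dic-style text whose flag field contains `flag`."""
    lines = dic_text.splitlines()
    words = []
    for line in lines[1:]:  # the first line holds the approximate entry count
        if not line.strip():
            continue
        word, _, flags = line.strip().partition("/")
        if flag in flags:
            words.append(word)
    return words


sample = "4\nshalt/!\nspake/!\ncough/SDG\nscooter/MS\n"
print(no_suggest_words(sample))
```

Words marked this way are still accepted as correctly spelled; they are simply
not offered as suggestions.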

Hunspell is good at dividing long words into two and checking each portion,
which is useful for a Hungarian spellchecker. It is unable to handle hyphenated
words. For this spellchecker to function properly, users need to install an
autocorrect word list in OpenOffice.org, so that when 'paperclipped' is typed,
'paper-clipped' is automatically substituted. This is also true for some
accented words, so that typing 'elan' produces 'élan.' Unfortunately,
OpenOffice.org doesn't use the autocorrect feature this way.
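
The substitution behaviour described here can be sketched in a few lines. This
is an illustration under stated assumptions, not how the office suite
implements autocorrect: the REPLACEMENTS table and the autocorrect function
are hypothetical, and matching is whole-word and case-sensitive.

```python
import re

# Hypothetical table using the two examples given in this comment.
REPLACEMENTS = {"paperclipped": "paper-clipped", "elan": "élan"}


def autocorrect(text, replacements=REPLACEMENTS):
    """Replace whole words found in `replacements`, leaving other text alone."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, replacements)) + r")\b")
    return pattern.sub(lambda m: replacements[m.group(1)], text)


print(autocorrect("She paperclipped the pages with elan."))
```

Matching on word boundaries keeps longer words such as 'elands' from being
altered.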

Sources, listed in order of preference.

1) http://www.thefreedictionary.com/
American Heritage Dictionary, and Collins English Dictionary

2) http://dictionary.reference.com/
Random House Dictionary, Collins English Dictionary, Webster's Unabridged 
Dictionary

3) http://www.merriam-webster.com/
Merriam-Webster Dictionary

4) http://oxforddictionaries.com/?attempted=true
The Oxford English Dictionary

Dictionaries often disagree on compound words, or on spelling. Generally the
Random House Dictionary is very good, but it gives a spelling of 'mujahedin.'
Going to an Arabic source to clarify matters only adds to the confusion, as that
site gives seven possible spellings. Other dictionaries use the word
'mujahideen,' so that seems preferable. Because of past problems with WordNet, I
don't accept words with only this single source. WordNet gathers words from the
web. This says nothing about the way people write, only that people are blindly
reproducing the questionable Microsoft spellchecker, which has total dominance
in the U.S.

February 11, 2011
