Matthew Strawbridge wrote:
I agree that determining the ends of sentences is non-trivial. However, I think that this is a good reason to do it once in OOo instead of each grammar checker having to figure it out manually. OOo already maintains a list of abbreviations (ending with .), so presumably this could be used. If the user adds custom abbreviations, these would then automatically be picked up by the sentence splitter, which wouldn't happen if each grammar checker implemented its own.

Maybe this is true for English, but not for Polish. I believe that most languages are not covered as of now. Moreover, in some languages, segmentation cannot simply be punctuation-based (Asian languages like Japanese are very hard to segment meaningfully). In the future, it would be ideal to implement SRX, which is an emerging segmentation standard in the translation industry. For the specification, see http://www.lisa.org/standards/srx/

Now, SRX could be implemented at the grammar checker level or at the OOo level. Either way, using SRX would help grammar checker developers get exactly the segmentation they want. So if you want OOo-level segmentation, the only option is to start implementing SRX, which would include the abbreviations we already have. Otherwise, this mechanism would still be bad for languages with quite different punctuation schemes.
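
As a rough sketch of the idea behind SRX (this is not a parser for the real SRX XML format, and it is not OOo code): segmentation is driven by an ordered list of rules, each with a "before break" and an "after break" regular expression, and each marked as either a break or a no-break (exception) rule. At every position the first matching rule decides. The names Rule, segment and the tiny rule list below are invented for this example; a real rule set would be per language and much longer.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    is_break: bool  # True = break here, False = exception (do not break)
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


def segment(text, rules):
    """Split text using ordered break / no-break rules (SRX-like sketch)."""
    compiled = [
        (r.is_break, re.compile(f"(?:{r.before})\\Z"), re.compile(r.after))
        for r in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule decides; later rules are ignored
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


# Hypothetical rule list: exceptions come first, the generic break rule last.
RULES = [
    Rule(False, r"\b(?:Prof|Dr)\.", r"\s"),  # abbreviations never end a sentence
    Rule(True, r"[.?!]", r"\s"),             # otherwise break after . ? !
]

segments = segment("Prof. Kowalski came. Did Dr. Nowak call? Yes!", RULES)
for s in segments:
    print(s)
```

The point of this design is that user-defined abbreviations become just more no-break rules, and a language with different punctuation gets a different rule list without any change to the splitting code.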

I haven't seen this mentioned explicitly, but I think there should be a menu option Tools, Grammar Check to launch the grammar checker to check through the whole document from beginning to end (as the spell checker can do).

+1.

It might be a good idea to have two levels of comments -- one brief and one detailed. The view of the detailed portion could then be toggled on and off in the UI.

+1.

I think that the inflexibility of the MS Office grammar checker is one of the reasons that lots of people turn it off. They simply don't trust what it says. Many grammar rules (such as using 'that' for restrictive clauses and 'which' for non-restrictive ones) are really recommendations, particularly in British English. The lack of a description in the UI certainly doesn't help either -- people are told that things are wrong, but it is not explained _why_ they are wrong.

Definitely, yes. The MS Office Polish grammar checker produces almost nothing but false alarms.

My suggestion would be to create the absolute simplest API to start with. Use full stops to determine sentence breaks, even though some of these will be wrong.

Not some; most will be wrong. In most European languages you would also have to treat "?" and "!" as sentence ends. But without SRX it's all like trying to reinvent the wheel. Implementing SRX is not trivial, but I think keeping to standards is the best way to make open source projects even better ;)
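
To illustrate the kind of errors a punctuation-only approach gives, here is a minimal Python sketch. The function name naive_split and the sample sentence are made up for this example; it only shows the "simplest API" idea, extended to "?" and "!", with no abbreviation handling at all.

```python
import re


def naive_split(text):
    """Split after '.', '?' or '!' whenever whitespace follows."""
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [p for p in parts if p]


sample = "Dr. Smith arrived at 5 p.m. yesterday. Was he late? No!"
segments = naive_split(sample)
for s in segments:
    print(repr(s))
```

The three real sentences come out as five pieces, because "Dr." and "p.m." are treated as sentence ends.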

Best,
Marcin
