Hello Bruno,

Well, first things first: congratulations on being accepted as one of
the projects for the Google Summer of Code! :-)

->Lacci:
Hi, Lacci. I'm not sure if you have already noticed that we have started
a discussion about grammar checking, an API and, last but not least, the
integration of grammar checkers in OOo.
The focus should currently be on the integration (i.e. how will it look
to the user in the end?), especially if more than one grammar checker is
available. I think this should be the first topic because we need to
make clear where we want to go and identify the problems on the way
before deciding on an API.
So if you have time I would be glad if you could share your thoughts.

> 1. Grammar Checker API, now:
>
>    1. It makes sense working with just one language now; so, foreign
>       words in the text should be ignored.

From the API view: agreed!
From the UI view I'm a bit unsure here. Since spell checking different
languages within one sentence currently works, it would look a bit like
a regression from the user's point of view if such text were just
skipped.

>    2. The grammar checker should run in a different thread to not block
>       OpenOffice.

You mean only when grammar checking is done automatically (in the
background, like automatic spell checking)?

>    3. The grammar checker should be able to check inside table cells,
>       text headers and footers, enumerations and text boxes (Drawing
>       Objects).

Sure. The question is: should it be able to do so because it knows of
the existence of such objects and is able to retrieve/modify them on its
own? Or should the existence of such objects be completely hidden from
the grammar checker? For example by means of an abstract API to iterate
through and modify the text of a document.
And pushing that question one step further: is it the grammar checker's
implementation that iterates through the text, or should there be a
different object that iterates through the text and calls the grammar
checker to process it?

>    4. The grammar checker should determine end of the sentences, because
>       it is not so trivial (e.g., abbreviations). So, OpenOffice should
>       just provide to the grammar checker an entire block of text, like
>       a paragraph.

Doing it this way would of course be easiest from the application's
view. First, it does not need to determine the end of a sentence, and
second, paragraphs are the easiest units to access.
But I somewhat doubt the ability of a grammar checker to identify the
end of a sentence in a mixed-language text. For example, if an English
grammar checker encounters the upside-down question mark following the
Spanish word at the end.
Thus I'm wondering if the API should allow for a
suggested-end-of-sentence when calling the grammar checker. Then, if the
implementation encounters unknown characters, it has at least a hint.
BTW: The I18N break-iterator is not that bad with abbreviations. I think
it has a list of those. But citations and similar things might pose a
huge problem to it.

And another question would be: if the grammar checker is called with
paragraphs, does that mean that when an error is found the whole
paragraph is presented to the user (could be really large!), or does the
UI only display the sentence where the error occurred?
Displaying less than a sentence seems somewhat bad to me because
sometimes the user will want to solve an error by rearranging the
sentence. Having to quit the UI because only the wrong word was
displayed seems annoying. And allowing the original document to be
modified in parallel while the dialog is displayed may be somewhat
troublesome to implement.

>    5. OpenOffice should be able to replace the wrong sentences.

;-)

>    6. I think we should create an unified User Interface, for any
>       grammar checker use it.

+1.
Of course this will not prevent someone's grammar checker from coming
along with its own UI. It only makes the implementation easier if the UI
is already there, and to the user all the grammar checkers will look the
same, thus avoiding a possible source of confusion.

>    7. Automatic checking should run in background and marking the wrong
>       sentences with a wavy line. It could be enabled and disabled, like
>       Spell Checker.

+1.
Someone once mentioned the idea of at least two different kinds of
lines. One for what the grammar checker knows for sure is wrong, and the
other one for "this is probably wrong" (e.g. outdated words like "thy"
or "thee" in English). This would of course go along with an option that
allows the user to specify whether he likes to have both types displayed
or only the I'm-100%-sure-it-is-wrong parts.
The reasoning was, AFAIR, that it is most annoying to the user to get
errors reported that are no errors.
I found that idea quite compelling...

>    8. The API should provide a paragraph (for example) to grammar
>       checker and this one should return a list. If there is no mistake
>       in this paragraph, the list should be empty, else the list should
>       contain:

A list of what? Suggestions on how to correct the first encountered
error? Or did you mean a list of all errors? Or even sth else?

>       1. Where is the mistake in the paragraph (initial index + final
>          index).
>       2. A list of suggestions to correct that mistake (this list can
>          be empty if checker is not prepared to guess).
>       3. A comment about mistake, e.g. what a grammar book should say
>          about it.

Having point 1. listed here as part of the list seems to suggest that a
list of all errors was meant to be returned...
When I talked about this to people implementing grammar checkers last
year, all of them said to stop at the first error, since once that error
has been corrected the whole sentence will have to be checked again
anyway. Thus there would be no need for further errors.
Also (as sometimes happens with compilers) consider one single error
triggering reports of several errors following it. If that one gets
fixed, all the other ones will vanish as well. Thus the list may already
be obsolete once the first error has been fixed.

> 2. Grammar Checker API, future:
>
>    1. Let's suppose it's possible to manage several languages in a text
>       and there is a Language Guessing API. Then, when OpenOffice
>       discover language of a sentence, it automatically loads grammar
>       checker to correspondent language.

Here it is a bit like the snake biting its tail:
How is the language guessing to be presented with a sentence to operate
on (in order to decide which grammar checker is to be used), when the
grammar checker is already required to identify the end of the sentence?
Either it only guesses the language of the paragraph, which may consist
of several complete sentences in various languages, or we still need the
I18N break-iterator (or sth similar) to identify the sentence.

Regards,
Thomas
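
For illustration only: the result described in point 8, combined with the
stop-at-the-first-error behaviour and the two kinds of wavy lines from
point 7, could be sketched roughly as below. All names (GrammarError,
Certainty, check_first_error, correct_all, should_underline) are
hypothetical and not part of any existing OOo API, and the repeated-word
rule is only a toy stand-in for a real grammar checker.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Certainty(Enum):
    """Hypothetical two-level rating, one per kind of wavy line."""
    SURE = "sure"          # the checker knows for sure this is wrong
    PROBABLE = "probable"  # "this is probably wrong"


@dataclass
class GrammarError:
    """One reported mistake, following the three sub-items of point 8."""
    start: int                  # initial index within the checked text
    end: int                    # final index (exclusive)
    suggestions: List[str] = field(default_factory=list)  # may be empty
    comment: str = ""           # explanation shown to the user
    certainty: Certainty = Certainty.SURE


def check_first_error(text: str) -> Optional[GrammarError]:
    """Toy checker (assumption): report only the first repeated word."""
    match = re.search(r"\b(\w+)\s+\1\b", text, flags=re.IGNORECASE)
    if match is None:
        return None
    return GrammarError(match.start(), match.end(),
                        [match.group(1)], "Repeated word.")


def correct_all(text: str, max_rounds: int = 100) -> str:
    """Fix the first error, then recheck the whole text, and repeat."""
    for _ in range(max_rounds):
        error = check_first_error(text)
        if error is None or not error.suggestions:
            break
        # Apply the first suggestion; later errors are recomputed anyway.
        text = text[:error.start] + error.suggestions[0] + text[error.end:]
    return text


def should_underline(error: GrammarError, show_probable: bool) -> bool:
    """User option: show both kinds of lines, or only the certain ones."""
    return error.certainty is Certainty.SURE or show_probable
```

With this sketch, correct_all("the the the cat") returns "the cat" after
two check-and-fix rounds, which is the recheck-after-each-fix behaviour
the grammar checker implementers asked for. Whether a real API should
return one error or a list is exactly the open question above; this only
shows the single-error variant.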
