Marcin Miłkowski wrote:

> There is at least one, IMHO very useful, rule that checks if brackets, 
> quotation marks etc. come in pairs in the text. Obviously, you want to 
> check this in a whole paragraph, as quotations often contain many 
> sentences. Now, the problem is that if I tokenize the text on the 
> sentence level, I get next bits of paragraph text with every call, and 
> that makes it very hard to track the number of unmatched quotation 
> marks.

IMHO checking whole paragraphs is only marginally better than checking
single sentences. Quotations may even span several paragraphs, and the
only way to catch a missing matching quotation mark in that case is to
always check the whole text, which is obviously not a useful solution.

Basically there is nothing wrong with maintaining information about a
paragraph (here: "opening quote found") and postponing judgement until
more information is available (here: waiting for a matching closing
quote, maybe in a later paragraph). What we will then need is a way
(= an API) to report possible errors found this way.
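
To illustrate the idea of keeping state between calls, here is a rough
sketch in Python. It is not an existing API: PairChecker, PendingMark,
feed() and finish() are hypothetical names, and the set of paired
characters is an assumption. Straight double quotes are left out because
the opening and closing mark are the same character and would need extra
handling.

```python
from dataclasses import dataclass, field

# Assumed set of paired marks; a real rule would make this language dependent.
PAIRS = {"(": ")", "[": "]", "\u201c": "\u201d"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


@dataclass
class PendingMark:
    """An opening mark that has not been matched yet."""
    char: str
    paragraph: int
    offset: int


@dataclass
class PairChecker:
    """Hypothetical checker that is fed one paragraph per call."""
    pending: list = field(default_factory=list)
    paragraph: int = 0

    def feed(self, text):
        """Scan one paragraph; return only errors that can be decided now."""
        errors = []
        for offset, ch in enumerate(text):
            if ch in PAIRS:
                # Judgement is postponed: remember where the mark was opened.
                self.pending.append(PendingMark(ch, self.paragraph, offset))
            elif ch in CLOSERS:
                if self.pending and self.pending[-1].char == CLOSERS[ch]:
                    self.pending.pop()
                else:
                    errors.append((self.paragraph, offset, f"unmatched {ch!r}"))
        self.paragraph += 1
        return errors

    def finish(self):
        """Report marks still open once no more text will arrive."""
        errors = [(m.paragraph, m.offset, f"unclosed {m.char!r}")
                  for m in self.pending]
        self.pending.clear()
        return errors
```

A caller would invoke feed() once per paragraph and finish() at the end
of the text (or whenever it decides to stop waiting). The errors returned
by finish() point back to earlier paragraphs, which is exactly the case
the reporting API would have to support.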

Regards,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.

