Marcin Miłkowski wrote:
> There is at least one, IMHO very useful, rule that checks if brackets,
> quotation marks etc. come in pairs in the text. Obviously, you want to
> check this in a whole paragraph, as quotations often contain many
> sentences. Now, the problem is that if I tokenize the text on the
> sentence level, I get next bits of paragraph text with every call, and
> that makes it very hard to track the number of unmatched quotation
> marks.

IMHO, checking whole paragraphs is only marginally better than checking single sentences. Quotations may even span several paragraphs, and the only way to catch a missing closing quotation mark in that case would be to check the whole text every time, which is obviously not a useful solution.

Basically, there is nothing wrong with keeping information about a paragraph (here: "opening quote found") and postponing the judgement until more information is available (here: waiting for a matching closing quote, possibly in a later paragraph). What we will then need is a way (i.e. an API) to report possible errors found this way.

Regards,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.
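
For illustration only, here is a minimal Python sketch of the "keep state and postpone the judgement" idea from the reply above. It is not an existing OpenOffice.org or LanguageTool API: the names `PairTracker`, `PendingOpen`, `feed` and `finish` are hypothetical, and the sketch assumes typographic quotes and brackets, where the opening and closing characters differ (a straight `"` would need extra heuristics).

```python
from dataclasses import dataclass, field

# Hypothetical set of pairs; a real checker would make this language-specific.
PAIRS = {"(": ")", "[": "]", "\u201c": "\u201d"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


@dataclass
class PendingOpen:
    """An opening character that has not been matched yet."""
    char: str
    paragraph: int
    offset: int


@dataclass
class PairTracker:
    """Carries unmatched openers from one paragraph to the next."""
    pending: list = field(default_factory=list)
    errors: list = field(default_factory=list)
    paragraph: int = -1

    def feed(self, text):
        """Scan one paragraph; openers left unmatched stay pending."""
        self.paragraph += 1
        for offset, ch in enumerate(text):
            if ch in PAIRS:
                self.pending.append(PendingOpen(ch, self.paragraph, offset))
            elif ch in CLOSERS:
                if self.pending and self.pending[-1].char == CLOSERS[ch]:
                    self.pending.pop()  # matched, possibly across paragraphs
                else:
                    # A closer with no matching opener can be reported at once.
                    self.errors.append(
                        (self.paragraph, offset, f"unmatched {ch!r}"))

    def finish(self):
        """At end of text, report every opener that never got closed."""
        for p in self.pending:
            self.errors.append((p.paragraph, p.offset, f"unclosed {p.char!r}"))
        self.pending.clear()
        return self.errors


tracker = PairTracker()
tracker.feed("He said: \u201cThis quote starts here.")
tracker.feed("It ends in the next paragraph.\u201d (But this bracket never closes.")
for paragraph, offset, message in tracker.finish():
    print(paragraph, offset, message)
```

The quote opened in the first paragraph is only judged once the second paragraph has been seen, so it is not reported; the unclosed bracket is reported by `finish()`. Each error carries the paragraph index and offset of the offending character, which is the kind of information a reporting API would have to accept for paragraphs other than the one currently being checked.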