Hans Aberg <[EMAIL PROTECTED]> writes:

   On 14 Jun 2007, at 14:46, Alessandro Di Marco wrote:

   > In American English, sentences like the following are
   > quite common:
   >
   > 1) "blah blah blah". Some more blah...
   > 2) "blah blah blah." Some more blah...
   > 3) "blah blah blah.  Some more blah...
   >
   > Now, the third form causes the problem. For example, here is an excerpt
   > that fools my parser:
   >
   > Party chairwoman Hazel Blears was accused by the Conservatives of
   > scapegoating immigrants after saying in an Independent on Sunday
   > newspaper interview: "We have got areas in Salford where private
   > landlords are letting properties with 10 and 12 people in there.  "Now,
   > the community doesn't object to the people - they object to the
   > exploitation and the fact that that leads to people being on the street
   > drinking, anti-social behaviour."  Welsh Secretary Peter Hain, meanwhile
   > accused Home Secretary John Reid of "fanning up" last week's row over
   > stop-and-question powers possibly being rolled out across the UK.

   One way around this is to feed a UTF-8 .ly file to Flex and require that
   the proper Unicode quotation marks “...” be used, i.e. U+201C and U+201D.
   When U+201C arrives in the lexer, start parsing a quotation string. If the
   closing U+201D has not arrived by the time the paragraph (or whatever
   block the quotation cannot extend beyond) closes, issue an error.
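
A minimal sketch of the scheme described above, written as plain C rather
than as a Flex specification. It assumes UTF-8 input and assumes that a
paragraph ends at a blank line; the function name count_unclosed_quotes is
hypothetical. In Flex the in_quote flag would roughly correspond to a start
condition entered on U+201C and left on U+201D.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 encodings of U+201C (opening) and U+201D (closing). */
static const char OPEN_Q[]  = "\xE2\x80\x9C";
static const char CLOSE_Q[] = "\xE2\x80\x9D";

/* Hypothetical helper: returns how many paragraphs end while a quotation
   is still open. Assumption: paragraphs are separated by "\n\n". */
int count_unclosed_quotes(const char *text)
{
    int in_quote = 0;
    int errors = 0;
    const char *p = text;

    assert(text != NULL);

    while (*p != '\0') {
        if (strncmp(p, OPEN_Q, 3) == 0) {
            in_quote = 1;              /* start of a quotation string */
            p += 3;
        } else if (strncmp(p, CLOSE_Q, 3) == 0) {
            in_quote = 0;              /* quotation properly closed */
            p += 3;
        } else if (p[0] == '\n' && p[1] == '\n') {
            if (in_quote) {            /* paragraph closed too early */
                errors++;
                in_quote = 0;
            }
            p += 2;
        } else {
            p++;
        }
    }
    if (in_quote)
        errors++;                      /* end of input counts as a close */
    return errors;
}
```

A real lexer would report the position of the offending quotation instead
of only counting; the counter just keeps the sketch short.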

Thanks for the suggestion; unfortunately it is not viable because the text is
plain ASCII. By looking at the spaces around the quotes I could get a similar
effect, but there should be something better... shouldn't there?
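
To make the idea of looking at the spaces around the quotes concrete, here
is a rough sketch in plain C (not Flex). Everything in it is an assumption
for illustration: the names classify_quote and count_reopened_quotes, and
the rule that a quote is "opening" when it follows whitespace or '(' and
"closing" when it precedes whitespace, end of text, or one of ".,;:!?)".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum quote_kind { QUOTE_OPEN, QUOTE_CLOSE, QUOTE_AMBIGUOUS };

/* Guess the role of the ASCII quote at text[i] from its neighbours. */
enum quote_kind classify_quote(const char *text, size_t i)
{
    assert(text != NULL && text[i] == '"');

    char next = text[i + 1];
    int open_ctx  = (i == 0) || isspace((unsigned char)text[i - 1])
                    || text[i - 1] == '(';
    int close_ctx = next == '\0' || isspace((unsigned char)next)
                    || strchr(".,;:!?)", next) != NULL;

    if (open_ctx && !close_ctx)
        return QUOTE_OPEN;
    if (!open_ctx && close_ctx)
        return QUOTE_CLOSE;
    return QUOTE_AMBIGUOUS;
}

/* Count opening quotes that appear while a quotation is already open,
   i.e. the third form from the list of examples. */
int count_reopened_quotes(const char *text)
{
    int in_quote = 0;
    int reopened = 0;

    assert(text != NULL);

    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] != '"')
            continue;
        switch (classify_quote(text, i)) {
        case QUOTE_OPEN:
            if (in_quote)
                reopened++;        /* previous quotation never closed */
            in_quote = 1;
            break;
        case QUOTE_CLOSE:
            in_quote = 0;
            break;
        case QUOTE_AMBIGUOUS:
            in_quote = !in_quote;  /* fall back to simple toggling */
            break;
        }
    }
    return reopened;
}
```

On the excerpt quoted above, the quote before "Now" is classified as opening
while the one before "We" is still open, so it is the only one counted. The
heuristic will misfire on text that puts spaces on both sides of a quote, so
it is only a starting point.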

Thanks again.
Alessandro

-- 
The best inheritance a parent can give his children is a few minutes of his
time each day. - O. A. Battista



_______________________________________________
[email protected] http://lists.gnu.org/mailman/listinfo/help-bison
