Dear list,

Sorry for asking this, but I cannot get my grammar to work
correctly, and I think it is because I do not know enough. :-/

What I need is to parse a sequence of n strings, separated
by a | character (if n > 1).

* If the string is enclosed in matching ' characters, the string itself
may contain any character from the full Unicode range (encoded as
UTF-8), including the ' character.
* If it is not enclosed in ' characters, the string may contain only
ASCII letters and digits (this is the easy part, of course. :-) )

So,
- a|b|c should be three (ASCII) strings
- 'affffɫɱ'|'ɠð' should be two (Unicode) strings
- as should this: 'affffɫɱ'|'ɠð''''''''''''''''''''''''''''''''

So a full Unicode (OK, UTF-8) string should be considered terminated
when a ' character is found immediately before a |, a space, or the
end of the line/string.
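
To make the intended behaviour concrete, here is a rough hand-written
sketch in Python (not a PEG grammar, just the splitting rule spelled
out). The function name split_items is made up for illustration, and
skipping spaces after an item is only an assumption, since what should
follow such a space is not specified above.

```python
def split_items(text):
    """Split text into items following the rules described above."""
    items = []
    pos = 0
    while True:
        if text.startswith("'", pos):
            # Quoted item: look for a ' that is directly followed by
            # |, a space, or the end of the input. Any other ' is content.
            end = pos + 1
            while True:
                end = text.find("'", end)
                if end == -1:
                    raise ValueError("unterminated quoted string")
                if end + 1 == len(text) or text[end + 1] in "| ":
                    break
                end += 1
            items.append(text[pos + 1:end])
            pos = end + 1
        else:
            # Bare item: one or more ASCII letters or digits.
            start = pos
            while pos < len(text) and text[pos].isascii() and text[pos].isalnum():
                pos += 1
            if pos == start:
                raise ValueError(f"expected an item at offset {pos}")
            items.append(text[start:pos])
        # Assumption: spaces after an item are ignored.
        while pos < len(text) and text[pos] == " ":
            pos += 1
        if pos == len(text):
            return items
        if text[pos] != "|":
            raise ValueError(f"expected | at offset {pos}")
        pos += 1
```

With this sketch, split_items("a|b|c") gives three items, and in the
last example above every ' after ɠð except the final one ends up inside
the second string.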

How do I do this?

/Fredrik

-- 
"Life is like a trumpet - if you don't put anything into it, you don't
get anything out of it."

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg
