Re: Private Use areas

Mark E. Shoulson via Unicode Fri, 31 Aug 2018 13:05:01 -0700

On 08/28/2018 11:58 AM, William_J_G Overington via Unicode wrote:

Asmus Freytag wrote:

There are situations where an ad-hoc markup language seems to fulfill a need that is not 
well served by the existing full-fledged markup languages. You find them in internet 
"bulletin boards" or services like GitHub, where pure plain text is too 
restrictive but the required text styles purposefully limited - which makes the syntactic 
overhead of a full-featured mark-up language burdensome.

I am thinking of such an ad-hoc special purpose markup language.

I am thinking of something like a special purpose version of the FORTH computer 
language being used but with no user definitions, no comparison operations and 
no loops and no compiler. Just a straight run through as if someone were typing 
commands into FORTH in interactive mode at a keyboard. Maybe no need for spaces 
between commands. For example, circled R might mean use Right-to-left text 
display.

That starts to sound no longer "ad-hoc", but that is not a well-defined term anyway. You're essentially describing a special-purpose markup language or protocol, or perhaps even programming language. Which is quite reasonable; you should (find some other interested people and) work out some of the details and start writing up parsers and such.

I am thinking that there could be three stacks, one for code points and one for 
numbers and one for external reference strings such as for accessing a web page 
or a PDF (Portable Document Format) document or listing an International 
Standard Book Number and so on. Code points could be entered by circled H 
followed by circled hexadecimal characters followed by a circled character to 
indicate Push onto the code point stack. Numbers could be entered in base 10, 
followed by a circled character to mean Push onto the number stack. A later 
circled character could mean to take a certain number of code points (maybe 
just 1, or maybe 0) from the character stack and a certain number of numbers 
(maybe just 1, or maybe just 0) from the number stack and use them to set some 
property.

It could all be very lightweight software-wise, just reading the characters of 
the sequence of circled characters and obeying them one by one just one time 
only on a single run through, with just a few, such as the circled digits, each 
having its meaning dependent upon a state variable such as, for a circled 
digit, whether data entry is currently hexadecimal or base 10.
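
To make the single-pass idea concrete, here is a minimal sketch in Python of such an interpreter. The command assignments are assumptions made up for illustration only (circled H starts hexadecimal entry, circled P pushes onto the code point stack, circled N pushes onto the number stack, circled R sets right-to-left display); nothing in the proposal fixes them, and the third stack for external reference strings is left out. It relies on NFKC normalization mapping a circled letter or digit to its plain form.

```python
import unicodedata

HEX_LETTERS = set("ABCDEF")

def run(program):
    """Obey a string of circled characters once, left to right."""
    code_points, numbers = [], []   # two of the three proposed stacks
    props = {}                      # properties set by commands
    hex_mode = False                # state variable: how digits are read
    digits = ""                     # digits entered since the last push
    for ch in program:
        # NFKC turns a circled letter or digit into its plain form.
        plain = unicodedata.normalize("NFKC", ch)
        if plain == ch:
            raise ValueError(f"not a circled character: {ch!r}")
        if plain.isdigit() or (hex_mode and plain in HEX_LETTERS):
            digits += plain
        elif plain == "H":          # hypothetical: begin hexadecimal entry
            hex_mode, digits = True, ""
        elif plain == "P":          # hypothetical: push a code point
            code_points.append(int(digits, 16))
            hex_mode, digits = False, ""
        elif plain == "N":          # hypothetical: push a base-10 number
            numbers.append(int(digits, 10))
            digits = ""
        elif plain == "R":          # hypothetical: right-to-left display
            props["direction"] = "rtl"
        else:
            raise ValueError(f"unknown command: {ch!r}")
    return code_points, numbers, props

# Circled H 5 D 0 P, then circled 1 2 N, then circled R.
program = "\u24BD\u2464\u24B9\u24EA\u24C5\u2460\u2461\u24C3\u24C7"
print(run(program))  # ([1488], [12], {'direction': 'rtl'})
```

Note that circled D is read as a digit only while hexadecimal entry is active, which is the state dependence described above.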

I still don't see why you're fixated on using circled characters. You're already dealing with a markup-language type setup, why not do what other markup schemes do? You reserve three or four characters and use them to designate when other characters are not being used in their normal sense but are being used as markup. In XML, when characters are inside '<>' tags, they are not "plain text" of the document, but they mean other things, perhaps things like "right-to-left" or "reference this web page" and so forth, which are exactly the kinds of things you're talking about here. If you don't want to use plain ASCII characters because then you couldn't express plain ASCII in your text, you're left with exactly the same problem with circled characters: you can't express circled characters in your text. While that is a smaller problem, it can be eliminated altogether by various schemes used by XML or RTF or lightweight markup languages. Reserve a few special characters to give meanings to the others, and arrange for ways to escape your handful of reserved characters so you can express them. It is more straightforward to say "you have to escape <, >, and & characters" than to say "you have to escape all circled characters."
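
A small sketch of the reserve-and-escape approach, in Python. It uses the standard library's html.escape and html.unescape for the three reserved characters; the helper names wrap and split_markup and the tag name "rtl" are made up for illustration and are not part of any existing markup language.

```python
import re
from html import escape, unescape

def wrap(tag, text):
    """Mark up text with a tag, escaping only <, > and &."""
    return f"<{tag}>{escape(text, quote=False)}</{tag}>"

def split_markup(s):
    """Split a string into ('tag', name) and ('text', plain text) pieces."""
    pieces = []
    for m in re.finditer(r"<([^<>]*)>|([^<>]+)", s):
        if m.group(1) is not None:
            pieces.append(("tag", m.group(1)))
        else:
            # Undo the escaping to recover the original plain text.
            pieces.append(("text", unescape(m.group(2))))
    return pieces

marked = wrap("rtl", "a < b & \u24C7")   # \u24C7 is a circled R
print(marked)                # <rtl>a &lt; b &amp; Ⓡ</rtl>
print(split_markup(marked))  # tag 'rtl', the original text, tag '/rtl'
```

The circled R passes through untouched as ordinary text, and the three reserved characters survive the round trip because they are escaped.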

Anyway, this is clearly a whole new high-level protocol you need (or want) to work out, which would *use* Unicode (just like XML and JSON do), but doesn't really affect or involve it (Unicode is all about the "plain text"). Kind of getting off-topic, but get some people interested and start a mailing list to discuss it. Good luck!


~mark
