The original message was received at Tue, 20 Mar 2001 13:07:11 -0800 (PST)
from localhost [127.0.0.1]

   ----- The following addresses had permanent fatal errors -----
<[EMAIL PROTECTED]>

   ----- Transcript of session follows -----
... while talking to [172.20.4.1]:
>>> RCPT To:<[EMAIL PROTECTED]>
<<< 550 <[EMAIL PROTECTED]>... User unknown
550 <[EMAIL PROTECTED]>... User unknown


-- Attached file included as plaintext by Listar --

Reporting-MTA: dns; mailgw.remedy.com
Received-From-MTA: DNS; localhost
Arrival-Date: Tue, 20 Mar 2001 13:07:11 -0800 (PST)

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.1.1
Remote-MTA: DNS; [172.20.4.1]
Diagnostic-Code: SMTP; 550 <[EMAIL PROTECTED]>... User unknown
Last-Attempt-Date: Tue, 20 Mar 2001 13:07:12 -0800 (PST)


-- Attached file included as plaintext by Listar --

Return-Path: <[EMAIL PROTECTED]>
Received: from mailgw.remedy.com (localhost [127.0.0.1])
        by mailgw.remedy.com (8.9.3/8.9.3) with ESMTP id NAA03941
        for <[EMAIL PROTECTED]>; Tue, 20 Mar 2001 13:07:11 -0800 (PST)
Received: from bz2.apple.com (bz2.apple.com [17.254.0.82])
        by mailgw.remedy.com (8.9.3/8.9.3) with ESMTP id NAA03935
        for <[EMAIL PROTECTED]>; Tue, 20 Mar 2001 13:07:10 -0800 (PST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
        by bz2.apple.com (8.9.3/8.9.3) with ESMTP id LAA15632;
        Tue, 20 Mar 2001 11:43:20 -0800 (PST)
Received: (from agent@localhost)
        by unicode.org (8.9.3/8.9.3) id LAA25402;
        Tue, 20 Mar 2001 11:08:42 -0800 (GMT-0800)
Message-Id: <[EMAIL PROTECTED]>
Errors-To: [EMAIL PROTECTED]
Mime-Version: 1.0
Content-Type: text/plain;
        charset="iso-8859-1"
X-UML-Sequence: 18861 (2001-03-20 19:07:02 GMT)
From: Marco Cimarosti <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Cc: Unicode List <[EMAIL PROTECTED]>
Date: Tue, 20 Mar 2001 11:06:53 -0800 (GMT-0800)
Subject: RE: Unicode editing (RE: Unicode complaints)

> If it were me, I would keep two copies who update each other 
> when that's needed.

I am not sure what you mean, but it sounds very similar to what you wanted
to avoid.

> Or perhaps we should only keep the active paragraph in your
> WYSIWYG format?

Makes sense. Also, all lines (or paragraphs) that have never been edited
could remain in "proper Unicode", so that later changes are minimized.

> I was talking about something else: we are not allowed to 
> remove anything from a file we're playing with, or we
> will become non-conformant.

I am not sure, but my interpretation of conformance is very different: we
are not allowed to remove or change anything in a file if:
a) we claim that we do not change the text or
b) we don't know what we are changing.

Condition (a) clearly doesn't apply to applications whose purpose *is* to
change the text such as editors, word processors, etc. (BTW, I think that it
can be further relaxed: according to conformance rule C10 (section 3.1 of
the book), even a read-only application is allowed to change the encoding
form or to do canonical normalizations).
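
A minimal sketch of the two kinds of change mentioned in the parenthesis
above, assuming Python's standard unicodedata module. The helper names
canonically_equal and reencode are hypothetical, chosen only for this
illustration; the point is that both operations alter the stored code units
without altering the text up to canonical equivalence.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """True if a and b are canonically equivalent (compared after NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def reencode(data: bytes, src: str, dst: str) -> bytes:
    """Change the encoding form only: same code points, different bytes."""
    return data.decode(src).encode(dst)


precomposed = "caf\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# Different code point sequences, canonically equivalent text.
print(precomposed == decomposed)
print(canonically_equal(precomposed, decomposed))

# UTF-8 and UTF-16 bytes differ, but decode to the same code points.
utf16 = reencode(precomposed.encode("utf-8"), "utf-8", "utf-16-le")
print(utf16.decode("utf-16-le") == precomposed)
```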

Condition (b) is what really needs to be discussed here, I think.

> An old editor that changes Lam+ZWJ+ZWNJ+ZWJ+Alef to Lam+Alef, simply
> because it sees that they don't make any sense here, is non-conformant.
> And standards people would say that they've warned us: You should keep the
> code points intact, unless specifically asked by the user.

This is where I differ.

Imagine a text editor that removes blank spaces at the end of lines, or that
converts tabs to spaces (or vice versa), or that converts all occurrences of
U+000D to the pair U+000D U+000A (or vice versa), or that automatically
expands keywords and identifiers for a certain programming language, or that
automatically capitalizes the first letter of a sentence, or that removes or
adds blanks when cutting and pasting...
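
For concreteness, here is a rough sketch of three of the automatic edits
listed above, written in Python. The function names and the tab size of 4
are assumptions made for the example, and cr_to_crlf assumes that only a
bare U+000D (one not already followed by U+000A) should be expanded. Each
function changes code points without the user asking for that specific
change.

```python
import re


def strip_trailing_blanks(text: str) -> str:
    """Remove spaces and tabs at the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def tabs_to_spaces(text: str, tabsize: int = 4) -> str:
    """Replace each tab with spaces up to the next tab stop."""
    return text.expandtabs(tabsize)


def cr_to_crlf(text: str) -> str:
    """Turn each bare U+000D into the pair U+000D U+000A."""
    return re.sub(r"\r(?!\n)", "\r\n", text)


print(repr(strip_trailing_blanks("a  \nb\t\n")))
print(repr(tabs_to_spaces("a\tb")))
print(repr(cr_to_crlf("x\ry\r\nz")))
```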

These are all things that many programs actually do. Did I specifically ask
the program to do each one of these actions? Of course not -- they are
designed to be automatic -- so are all these kinds of things non-conformant?

(Please, don't open a debate about whether these automatisms are good or
evil: this is not the point. We all know that some people hate this kind of
feature, and some other people cannot live without them, while most people
simply want to be able to activate or deactivate them at will.)

> you should keep everything you don't know about, including the
> information for adjacent bidi runs with the same level: the
> next version may assign some meaning to them, and you will
> become non-conformant as soon as that version is out.

I don't know. If I strip controls that embed LTR text in other LTR text, you
cannot say that I don't know what I am touching. I do know what that thing
is: "LTR text embedded inside other LTR text", according to the current
rules.
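
The kind of stripping described above could look roughly like the following
Python sketch. It is only an illustration under stated assumptions: the
paragraph direction is assumed to be LTR unless told otherwise, only a plain
LRE (U+202A) nested directly inside plain LTR context is treated as
redundant, and overrides (LRO, RLO) are never touched. The function name is
hypothetical. Whether dropping such an embedding is harmless in every case
is exactly the open question here, so this shows the mechanics, not a
verdict.

```python
LRE, RLE, PDF, LRO, RLO = "\u202a", "\u202b", "\u202c", "\u202d", "\u202e"

# Context kind introduced by each explicit control:
# "L"/"R" are plain embeddings, "LO"/"RO" are overrides.
KIND = {LRE: "L", RLE: "R", LRO: "LO", RLO: "RO"}


def strip_redundant_ltr_embeddings(text: str, base: str = "L") -> str:
    """Drop LRE...PDF pairs that embed LTR text directly in plain LTR text."""
    out = []
    stack = []  # one (kind, kept) entry per open embedding or override
    for ch in text:
        if ch in KIND:
            current = stack[-1][0] if stack else base
            redundant = ch == LRE and current == "L"
            stack.append((KIND[ch], not redundant))
            if not redundant:
                out.append(ch)
        elif ch == PDF:
            if not stack:
                out.append(ch)      # unmatched PDF: leave it alone
            else:
                _, kept = stack.pop()
                if kept:            # keep the PDF only if its opener was kept
                    out.append(ch)
        else:
            out.append(ch)
    return "".join(out)


sample = "abc" + LRE + "def" + PDF + "ghi"
print(strip_redundant_ltr_embeddings(sample) == "abcdefghi")
```

An LRE nested inside an RLE, or inside an override, is left alone, because
there the control does change the context it sits in.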

Now, we can dispute whether removing such a kind of embedding is a brilliant
optimization or a totally silly idea.

But I don't see how interpreting something according to the rules can
possibly be against the rules.

Of course, if the rules change, the program will not be conformant to the
*new* rules -- but still conformant to the *old* ones that it was designed
for. (Sort of like a Japanese soldier abandoned on a desert island who
doesn't know that the war is over :-)

This would be different than, say, removing an unassigned character from the
string, because there is a specific conformance rule *now* (C10) that
specifies that this is forbidden.

BTW, conformance rule C13 is specifically about bidi, and it sounds very
elastic to me... Perhaps even too much!

It basically says that, *if* an application chooses to support
bidirectional text *or* bidi embedding levels, it shall *either* present the
text "*as* if the bidirectional algorithm had been applied", *or* use a
"higher-level protocol".

> (Or possibly I haven't got the conformance idea yet...)

Neither do I. Perhaps some assistance from the adults is needed here.

_ Marco


