On Sat Jun 3 23:09:01 CDT 2017 Markus Scherer wrote:
> I suggest you submit a write-up via http://www.unicode.org/reporting.html
>
> and make the case there that you think the UTC should retract
>
> http://www.unicode.org/L2/L2017/17103.htm#151-C19
The submission has
On Fri, Aug 4, 2017 at 3:34 AM, Mark Davis ☕️ via Unicode
wrote:
> FYI, the UTC retracted the following.
>
> [151-C19] Consensus: Modify the section on "Best Practices for Using FFFD"
> in section "3.9 Encoding Forms" of TUS per the recommendation in L2/17-168,
> for Unicode
On Mon, Aug 7, 2017 at 9:53 AM, Martin J. Dürst wrote:
> I just had a look at http://www.unicode.org/L2/L2017/17197-utf8-retract.pdf
> to use the test data in there for Ruby.
> I was under the impression from previous looks at it that it contained a lot
> of test data.
It
On Mon, May 15, 2017 at 6:37 PM, Alastair Houghton
<alast...@alastairs-place.net> wrote:
> On 15 May 2017, at 11:21, Henri Sivonen via Unicode <unicode@unicode.org>
> wrote:
>>
>> In reference to:
>> http://www.unicode.org/L2/L2017/17168-utf-8-recommend.pdf
>
On Thu, May 18, 2017 at 2:41 AM, Asmus Freytag via Unicode
wrote:
> On 5/17/2017 2:31 PM, Richard Wordingham via Unicode wrote:
>
> There's some sort of rule that proposals should be made seven days in
> advance of the meeting. I can't find it now, so I'm not sure whether
>
On Tue, May 16, 2017 at 9:36 PM, Markus Scherer wrote:
> Let me try to address some of the issues raised here.
Thank you.
> The proposal changes a recommendation, not a requirement.
This is a very bad reason in favor of the change. If anything, this
should be a reason why
On Tue, May 16, 2017 at 10:22 AM, Asmus Freytag wrote:
> but I think the way he raises this point is needlessly antagonistic.
I apologize. My level of dismay at the proposal's ICU-centricity overcame me.
On Tue, May 16, 2017 at 10:42 AM, Alastair Houghton
On Tue, May 16, 2017 at 1:16 AM, Shawn Steele via Unicode
wrote:
> I’m not sure how the discussion of “which is better” relates to the
> discussion of ill-formed UTF-8 at all.
Clearly, the "which is better" issue is distracting from the
underlying issue. I'll clarify what I
On Tue, May 16, 2017 at 6:23 AM, Karl Williamson
<pub...@khwilliamson.com> wrote:
> On 05/15/2017 04:21 AM, Henri Sivonen via Unicode wrote:
>>
>> In reference to:
>> http://www.unicode.org/L2/L2017/17168-utf-8-recommend.pdf
>>
>> I think Unico
On Tue, May 16, 2017 at 1:09 PM, Alastair Houghton
<alast...@alastairs-place.net> wrote:
> On 16 May 2017, at 09:31, Henri Sivonen via Unicode <unicode@unicode.org>
> wrote:
>>
>> On Tue, May 16, 2017 at 10:42 AM, Alastair Houghton
>> <alast...@alastairs-p
On Tue, May 16, 2017 at 9:50 AM, Henri Sivonen wrote:
> Consider https://hsivonen.com/test/moz/broken-utf-8.html . A quick
> test with three major browsers that use UTF-16 internally and have
> independent (of each other) implementations of UTF-8 decoding
> (Firefox, Edge
In reference to:
http://www.unicode.org/L2/L2017/17168-utf-8-recommend.pdf
I think Unicode should not adopt the proposed change.
The proposal is to make ICU's spec violation conforming. I think there
is both a technical and a political reason why the proposal is a bad
idea.
First, the technical
On Wed, May 31, 2017 at 8:11 PM, Richard Wordingham via Unicode
<unicode@unicode.org> wrote:
> On Wed, 31 May 2017 15:12:12 +0300
> Henri Sivonen via Unicode <unicode@unicode.org> wrote:
>> I am not claiming it's too difficult to implement. I think it
>> inappropriat
I've researched this more. While the old advice dominates the handling
of non-shortest forms, there is more variation than I previously
thought when it comes to truncated sequences and CESU-8-style
surrogates. Still, the ICU behavior is an outlier considering the set
of implementations that I
On Mon, Jun 4, 2018 at 10:49 PM, Manish Goregaokar via Unicode
wrote:
> The Rust community is considering adding non-ascii identifiers, which follow
> UAX #31 (XID_Start XID_Continue*, with tweaks).
UAX #31 is rather light on documenting its rationale.
I realize that XML is a different case
On Wed, Jun 6, 2018 at 2:55 PM, Henri Sivonen wrote:
> Considering that ruling out too much can be a problem later, but just
> treating anything above ASCII as opaque hasn't caused trouble (that I
> know of) for HTML other than compatibility issues with XML's stricter
> stance, why should a
I was reading
https://www.unicode.org/versions/Unicode10.0.0/UnicodeStandard-10.0.pdf
on a Sony Digital Paper device and tried to scribble some notes and
make highlights but I couldn't. I still couldn't after ensuring that
the pen was charged and could write on other PDFs.
Since Evince told me
On Sat, Sep 8, 2018 at 7:36 PM Mark Davis ☕️ via Unicode
wrote:
>
> I recently did some extensive revisions of a paper on Unicode string models
> (APIs). Comments are welcome.
>
> https://docs.google.com/document/d/1wuzzMOvKOJw93SWZAqoim1VUl9mloUxE0W6Ki_G23tw/edit#
* The Grapheme Cluster Model
On Tue, Sep 11, 2018 at 2:13 PM Eli Zaretskii wrote:
>
> > Date: Tue, 11 Sep 2018 13:12:40 +0300
> > From: Henri Sivonen via Unicode
> >
> > * I suggest splitting the "UTF-8 model" into three substantially
> > different models:
> >
> >
Is the Editor's Draft of the Unicode Standard visible publicly?
Use case: Checking if things that I might send feedback about have
already been addressed since the publication of Unicode 10.0.
--
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
On Fri, Apr 20, 2018 at 12:16 PM, Martin J. Dürst
wrote:
> On 2018/04/20 18:12, Martin J. Dürst wrote:
>
>> There was an announcement for a public review period just recently. The
>> review period is up to the 23rd of April. I'm not sure whether the
>> announcement is up
We're about to remove the U+FFFD generation for the case where there
is no content between two ISO-2022-JP escape sequences from the WHATWG
Encoding Standard.
Is there anything wrong with my analysis that U+FFFD generation in
that case is not a useful security measure when unnecessary
transitions
On Tue, Oct 2, 2018 at 3:04 PM Mark Davis ☕️ wrote:
>
> * The Python 3.3 model mentions the disadvantages of memory usage
>> cliffs but doesn't mention the associated performance cliffs. It would
>> be good to also mention that when a string manipulation causes the
>> storage to expand or
reply. Why is excluding junk important?
> On Fri, Jun 8, 2018 at 11:07 AM, Henri Sivonen via Unicode wrote:
>>
>> On Wed, Jun 6, 2018 at 2:55 PM, Henri Sivonen wrote:
>> > Considering that ruling out too much can be a problem later, but just
>> > treating anythin
On Wed, Sep 12, 2018 at 11:37 AM Hans Åberg via Unicode
wrote:
> The idea is to extend Unicode itself, so that those bytes can be represented
> by legal codepoints.
Extending Unicode itself would likely create more problems than it
would solve. Extending the value space of Unicode scalar values
On Thu, Sep 12, 2019, 15:53 Christoph Päper via Unicode
wrote:
> ISHY/SIHY is especially useful for encoding (German) noun compounds in
> wrapped titles, e.g. on product labeling, where hyphens are often
> suppressed for stylistic reasons, e.g. orthographically correct
> _Spargelsuppe_,
/tr36/?
>
> Mark
>
>
> On Mon, Dec 10, 2018 at 11:10 AM Henri Sivonen via Unicode
> wrote:
>>
>> We're about to remove the U+FFFD generation for the case where there
>> is no content between two ISO-2022-JP escape sequences from the WHATWG
>> Encoding