Re: Re: Re: NBSP supposed to stretch, right?

2020-01-06 Thread Jörg Knappen
Festival season is over ...

I checked it out: LaTeX does the same for the input of an explicit no-break space character.

--Jörg Knappen

 
 

Sent: Sunday, 22 December 2019 at 22:54
From: "Shriramana Sharma via Unicode"
To: "Jörg Knappen"
Cc: "Asmus Freytag", "UnicoDe List"
Subject: Re: Re: NBSP supposed to stretch, right?


So I was wondering whether TeX only does this to the ~ input character or the actual NBSP Unicode character too?






Re: Re: NBSP supposed to stretch, right?

2019-12-22 Thread Shriramana Sharma via Unicode
So I was wondering whether TeX only does this to the ~ input character or
the actual NBSP Unicode character too?


Re: Re: NBSP supposed to stretch, right?

2019-12-22 Thread Jörg Knappen
 


Well,

in TeX and LaTeX, the no-break space (indicated by the active character ~ in TeX input files) is stretchable: it stretches like a normal inter-word space, so that all inter-word spaces in a line are equal. But multiple no-break spaces still add up to wider spaces in the output, unlike ordinary space tokens, which are collapsed into one space token.
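
A rough way to picture the difference, as a Python sketch rather than anything TeX actually runs. The function name interword_glue_count is hypothetical, and the model is deliberately crude: it assumes only what is described above, namely that a run of ordinary spaces yields one space token while every ~ contributes its own stretchable, non-breaking space.

```python
import re


def interword_glue_count(source: str) -> int:
    """Count the inter-word spaces a line of TeX-like input would produce.

    Simplified model (an assumption for illustration only): a run of
    ordinary spaces collapses to one space token, while each '~' yields
    its own non-breaking, stretchable space.
    """
    # Each run of one or more U+0020 characters counts once.
    ordinary = len(re.findall(r" +", source.strip()))
    # Every tie counts separately, even when several are adjacent.
    ties = source.count("~")
    return ordinary + ties


print(interword_glue_count("a   b"))   # three typed spaces, one space in the output
print(interword_glue_count("a~~~b"))   # three ties, three spaces in the output
```

Under this model the three typed spaces in the first call count once, while the three ties in the second call count three times.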

 

-- Jörg Knappen

 

Sent: Tuesday, 17 December 2019 at 17:20
From: "Asmus Freytag via Unicode"
To: unicode@unicode.org
Subject: Re: NBSP supposed to stretch, right?



On 12/17/2019 2:41 AM, Shriramana Sharma via Unicode wrote:





On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode, <unicode@unicode.org> wrote:

Agree.
By the way, it is common practice to use multiple nbsp in a row to
create a larger span. In my opinion, it is wrong to replace fixed
width spaces with non-breaking spaces.
Quote from Microsoft Typography Character design standards:
«The no-break space is not the same character as the figure space. The
figure space is not a character defined in most computer system's
current code pages. In some fonts this character's width has been
defined as equal to the figure width. This is an incorrect usage of
the character no-break space.»



 

Sorry but I don't understand how this addresses the issue I raised.


You don't?

In principle it may be true that NBSP is not fixed width, but show me software that doesn't treat it that way.

In HTML, NBSP isn't subject to space collapse, therefore it's the go-to space character when you need some extra spacing that doesn't disappear.
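
To illustrate the collapsing point, here is a minimal Python sketch. The function name collapse_html_whitespace is hypothetical, and the set of collapsible characters is assumed to be HTML's ASCII whitespace (space, tab, LF, FF, CR). It is not a full implementation of browser white-space processing.

```python
import re

# Assumed set of collapsible characters: space, tab, LF, FF, CR.
# Note: Python's r"\s" would also match U+00A0 in a str pattern,
# so the character class is spelled out explicitly.
_COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")


def collapse_html_whitespace(text: str) -> str:
    """Replace each run of collapsible whitespace with a single U+0020.

    U+00A0 NO-BREAK SPACE is not in the collapsible set, so runs of
    NBSP pass through unchanged.
    """
    return _COLLAPSIBLE.sub(" ", text)


print(repr(collapse_html_whitespace("a    b")))
print(repr(collapse_html_whitespace("a\u00a0\u00a0\u00a0b")))
```

The first call collapses to a single space; the second keeps all three NBSPs, which is why authors reach for NBSP when they want spacing that survives.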

I bet, in many other environments it was typically the only "other" space character, so it ended up overloaded.

My hunch is that it is too late at this point to try to promulgate a "clean" implementation of NBSP, because it would effectively change untold documents retroactively. So it would be a massively breaking change.

If you have a situation where you need really poor layout (wide inter-word spaces) to justify, the fact that an honorific in front of a name works more like it's part of the same word (because the NBSP doesn't stretch) would be the least of my worries. (Although, on lines where inter-word spaces are reduced a bit, I can see that becoming counter-intuitive).

If you only fix this in software for high-end typography, you'd still have the issue that things will behave differently if you export your (plain) text. And you would have the issue of what to do when you want fixed spaces to be non-breaking as well (is that ever needed?).

A./







Re: NBSP supposed to stretch, right?

2019-12-21 Thread Shriramana Sharma via Unicode
On 12/19/19, James Kass via Unicode  wrote:
>
> There's a bug report for the LibreOffice application here...
> https://bugs.documentfoundation.org/show_bug.cgi?id=41652
> ...which shows an interesting history of the situation.

LOL two years ago almost to the date Shriramana Sharma seems to have
already *quoted* the Unicode Standard on this
(https://bugs.documentfoundation.org/show_bug.cgi?id=41652#c30):


The Unicode standard document http://unicode.org/reports/tr14/ clearly
states that:

When expanding or compressing interword space according to common
typographical practice, only the spaces marked by U+0020 SPACE and
U+00A0 NO-BREAK SPACE are subject to compression, and only spaces
marked by U+0020 SPACE, U+00A0 NO-BREAK SPACE, and occasionally spaces
marked by U+2009 THIN SPACE are subject to expansion. All other space
characters normally have fixed width.
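
As a sketch of what that passage asks of a justifier, here is a small Python illustration. The function name space_adjustments and the per-character "delta" representation are assumptions made for this example, and U+2009 THIN SPACE is left out of the expandable set because the text only says "occasionally".

```python
from __future__ import annotations

# Sets taken from the quoted passage. U+2009 is "occasionally" expandable,
# so it is omitted here (an assumption of this sketch).
EXPANDABLE = {"\u0020", "\u00a0"}
COMPRESSIBLE = {"\u0020", "\u00a0"}


def space_adjustments(line: str, slack: float) -> list[float]:
    """Return the width change applied to each character of `line`.

    `slack` is the amount to distribute over the line: positive to
    expand, negative to compress. Only the adjustable space characters
    receive a share; everything else, including fixed-width spaces
    such as U+2007 FIGURE SPACE, gets 0.0.
    """
    adjustable = EXPANDABLE if slack >= 0 else COMPRESSIBLE
    positions = [i for i, ch in enumerate(line) if ch in adjustable]
    deltas = [0.0] * len(line)
    if positions:
        share = slack / len(positions)
        for i in positions:
            deltas[i] = share
    return deltas


# SPACE and NBSP share the slack equally; FIGURE SPACE stays fixed.
print(space_adjustments("a b\u00a0c\u2007d", 6.0))
```

In this model SPACE and NO-BREAK SPACE are indistinguishable to the justifier, which is the behaviour being requested in this thread.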


But we have some people there on that bug saying that:


While Unicode is an important standard, it's only of secondary
importance to an office suite. Its primary goal is *not* creating a
reference conformant implementation of the standard; rather, it should
use the standard to the extent it needs to serve its users most.


which is a  approach in my eyes but well, that's how the real world
is on many things. Anyhow the above comment is continued as:


And if legacy requires that some statements of standard be violated to
keep existing documents intact, that should be that way, until a
better design is invented and implemented, which would make possible
to please both sides.


This means option #1 I mentioned earlier and which seems to already
have been discussed in the bug discussion: provide a per-document
option or at least a Word-compatibility option as to how to treat
NBSP.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: NBSP supposed to stretch, right?

2019-12-20 Thread James Kass via Unicode





On 2019-12-21 2:43 AM, Shriramana Sharma via Unicode wrote:

Ohkay and that's very nice meaningful feedback from actual
developer+user interaction. So the way I look at this going forward is
that we have four options:

1)

With the existing single NBSP character, provide a software option to
either make it flexible or inflexible, but this preference should be
stored as part of the document and not the application settings, else
shared documents would not preserve the layout intended by the
creator.




5)

Update the applications to treat NBSP correctly.  Process legacy data 
based on date/time stamp (or metadata) appropriately and offer users the 
option to update their legacy data algorithmically using proper 
non-stretching space characters such as FIGURE SPACE.


-

Options 1 and 5 have the advantage of not requiring the addition of yet 
more spacing characters to the Standard.




Re: NBSP supposed to stretch, right?

2019-12-20 Thread Shriramana Sharma via Unicode
On 12/21/19, Richard Wordingham via Unicode  wrote:
> On Fri, 20 Dec 2019 17:25:17 +0530
> Shriramana Sharma via Unicode  wrote:
>
>> I don't expect NBSP to ever disappear, because spaces disappear only
>> at linebreaks, and NBSP simply doesn't stand at linebreaks.
>
> I can certainly imagine someone writing "".

You don't need to go so far. Even the Unicode characters can be
entered: A0 0A (which makes for a nice smiley like pattern, two ears
besides two eyes ).

Obviously we are talking about *automatic* linebreaks. IIUC the point
about NBSP is that *it itself* doesn't break, whereas SP breaks up and
is *replaced* by a linebreak.

Nobody said anything about manual linebreak characters *following* a
space character, whether SP or NBSP or anything else.

I also just tested and noticed something related: in my wordprocessor
(LibreOffice Writer) when the cursor is near the end of a line and the
horizontal space remaining on that line is less than the nominal
advance width of the space, pressing space doesn't advance the cursor
(or maybe it does and I don't see it) irrespective of whether the
paragraph is left-aligned or justified, whereas inputting NBSP goes to
the next line, pulling the word before it along with it. This is
consistent with the current fixed-width NBSP behaviour of these
wordprocessors.
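
The line-breaking half of this can be pictured with a small Python sketch. It is a simplified greedy wrapper, not LibreOffice's algorithm: the name greedy_wrap is hypothetical, widths are counted in characters instead of real advance widths, and only U+0020 is treated as a break opportunity.

```python
from __future__ import annotations


def greedy_wrap(text: str, max_chars: int) -> list[str]:
    """Break `text` into lines of at most `max_chars` characters where possible.

    Only U+0020 SPACE is a break opportunity; U+00A0 NO-BREAK SPACE stays
    inside its "word", so the text on both sides of it moves together.
    """
    lines: list[str] = []
    current = ""
    for word in text.split(" "):      # explicit separator: splits on U+0020 only
        if not word:
            continue                  # skip empty pieces from repeated spaces
        candidate = word if not current else current + " " + word
        if current and len(candidate) > max_chars:
            lines.append(current)     # the breaking space is dropped at the line end
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


print(greedy_wrap("one two Sri Vyasa", 11))        # honorific stranded at line end
print(greedy_wrap("one two Sri\u00a0Vyasa", 11))   # honorific pulled to the next line
```

With an ordinary space the honorific is left at the end of the first line; with NBSP it moves to the next line together with the name.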

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: [EXTERNAL] Re: NBSP supposed to stretch, right?

2019-12-20 Thread Shriramana Sharma via Unicode
On 12/21/19, Shriramana Sharma  wrote:
> 1)
>
> With the existing single NBSP character, provide a software option to
> either make it flexible or inflexible, but this preference should be
> stored as part of the document and not the application settings, else
> shared documents would not preserve the layout intended by the
> creator.

One thing I forgot: are there any possibilities that *both* behaviours
would be required in the same document?

To my imagination, I who expect NBSP to be flexible won't use it
between text and punctuation like those Word users, and probably they
won't use it like me.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: [EXTERNAL] Re: NBSP supposed to stretch, right?

2019-12-20 Thread Shriramana Sharma via Unicode
On 12/21/19, Murray Sargent  wrote:
> I checked with the Word team and they actually tried out stretching NBSP
> back in 2015 in the "good client" mode. But customer feedback was negative.
> The problem is that NBSP is used sometimes when stretching isn't wanted such
> as between the end of a question and the question mark or in multi-word
> trademarks or in italic expressions such as ad infinitum. Another example is
> Text«quotation»moretext. One doesn't want the « and » to
> be spaced apart from "quotation" for justification purposes.
>
> Conceivably Word should offer a special justification option to stretch
> NBSP, but user feedback has revealed that it's not a good default option.

Ohkay and that's very nice meaningful feedback from actual
developer+user interaction. So the way I look at this going forward is
that we have four options:

1)

With the existing single NBSP character, provide a software option to
either make it flexible or inflexible, but this preference should be
stored as part of the document and not the application settings, else
shared documents would not preserve the layout intended by the
creator.

2)

Consider that the non-stretching behaviour of wordprocessors (probably
following MS Word) is correct, and encode a new NBFSP non-breaking
flexible space. [I'm looking at that convenient hole at 2065.]

DTP software like InDesign/TeX (and browsers like Firefox, though web
content is assumed to be more fluid typographically) should then
ideally conform to this and potentially break their users' documents
(esp in the case of DTP).

3)

Consider that the stretching behaviour of DTP software like InDesign
is correct, and encode a new FWNBSP fixed-width non-breaking space [at
2065].

Wordprocessors should then ideally conform to this and potentially
break their users' documents.

4)

Leave alone the existing ambiguous behaviour of NBSP, and encode two
new characters [Supplemental Punctuation has space at 2E50…] for NBFSP
and FW-NBSP. Like the existing 2028 and 2029 Line and Paragraph
Separators with the annotation: “may be used to represent this
semantic unambiguously”.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: NBSP supposed to stretch, right?

2019-12-20 Thread Richard Wordingham via Unicode
On Fri, 20 Dec 2019 17:25:17 +0530
Shriramana Sharma via Unicode  wrote:

> So I never asked for NBSP to disappear. I said I want it to *stretch*.
> And to my mind "stretch" means to become wider than one's normal
> width. It doesn't include decreasing or disappearing width.

Don't spaces sometimes shrink?  I thought they did in some 'show codes'
modes.

> I don't expect NBSP to ever disappear, because spaces disappear only
> at linebreaks, and NBSP simply doesn't stand at linebreaks.

I can certainly imagine someone writing "".

Richard.



Re: NBSP supposed to stretch, right?

2019-12-19 Thread James Kass via Unicode



From our colleague’s web site,
http://jkorpela.fi/chars/spaces.html

“On web browsers, no-break spaces tended to be non-adjustable, but 
modern browsers generally stretch them on justification.”


Jukka Korpela then offers pointers about avoiding unwanted stretching.

and

“The change in the treatment of no-break spaces, though inconvenient, is 
consistent with changes in CSS specifications. For example, clause 7 
Spacing of CSS Text Module Level 3 (Editor’s Draft 24 Jan. 2019) defines 
the no-break space, but not the fixed-width spaces, as a word-separator 
character, stretchable on justification.”


So it appears that there’s no interoperability problem with HTML.

It seems that the widespread breakage which Asmus Freytag mentions is 
limited to legacy applications which persist in treating U+00A0 as the 
old “hard space” such as Word.  It also appears that Microsoft tried and 
failed to correct the problem in Word.  Perhaps they should try again.  
Meanwhile, in the absence of anything from Unicode more explicit than 
already recommended by the Standard, Shriramana Sharma might be well 
advised to continue to lobby the respective software people.  As more 
applications migrate towards the correct treatment of U+00A0, they are 
probably already running into interoperability problems with Microsoft 
Word and may well have already implemented solutions.




Re: NBSP supposed to stretch, right?

2019-12-18 Thread James Kass via Unicode



On 2019-12-17 12:50 AM, Shriramana Sharma via Unicode wrote:

I would have gone and filed this as a LibreOffice bug since that's the
software I use most, but when I found this is a cross-software
problem, I thought it would be best to have this discussed and
documented here (and in a future version of the standard).

There's a bug report for the LibreOffice application here...
https://bugs.documentfoundation.org/show_bug.cgi?id=41652
...which shows an interesting history of the situation.

One issue is whether to be Unicode compliant or MS-Word compliant. 
MS-Word had apparently corrected the bug with Word 2013 but had reverted 
to the incorrect behavior by the time Word 2016 rolled out.  On that 
page it's noted that applications like InDesign, Firefox, TeX, and 
QuarkXPress handle U+00A0 correctly.




Re: NBSP supposed to stretch, right?

2019-12-18 Thread James Kass via Unicode



U+0020 SPACE
U+00A0 NO-BREAK SPACE

These two characters are equal in every way except that one of them 
offers an opportunity for a line break and the other does not.


If the above statement is true, then any conformant application must 
treat/process/display both characters identically.
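
The relationship between the two characters can be inspected from the Unicode Character Database. The following Python sketch uses the standard unicodedata module; the helper name describe is hypothetical. Python's unicodedata does not expose the Line_Break property, so the one difference under discussion shows up here only indirectly, through the <noBreak> compatibility decomposition.

```python
import unicodedata

SPACE = "\u0020"
NBSP = "\u00a0"


def describe(ch: str) -> dict:
    """Collect a few Unicode Character Database properties for one character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "decomposition": unicodedata.decomposition(ch),
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


for ch in (SPACE, NBSP):
    print(describe(ch))
```

Both characters are in general category Zs, and NO-BREAK SPACE decomposes to `<noBreak> 0020`, that is, to SPACE with a no-break tag, so compatibility normalization (NFKC) turns it into a plain SPACE.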


Responding to Asmus Freytag,
> Now, if someone can show us that there are widespread implementations that
> follow the above recommendation and have no interoperability issues with HTML
> then I may change my tune.

Can anyone show us that there are widespread implementations which would 
break if they started following the above recommendation?


Quoting from this HTML basics page,
http://www.htmlbasictutor.ca/non-breaking-space.htm

“Some browsers will ignore beyond the first instance of the non-breaking 
space.”

and
“Not all browsers acknowledge the additional instances of the 
non-breaking space.”


Fifteen or twenty years ago, we used NO-BREAK SPACE to indent paragraphs 
and to position text and graphics.  Both of those uses are presently 
considered no-nos because some browsers collapse NBSPs and because there 
are proper ways now to accomplish these kinds of effects.


The introduction of browsers which collapsed NBSP strings broke existing 
web pages.  Perhaps the developers of those browsers decided that SPACE 
and NO-BREAK SPACE are indeed identical except for line breaking.


Are there any modern mark-up language uses of SPACE vs NO-BREAK SPACE 
which would be broken if they follow the above recommendation?




Re: NBSP supposed to stretch, right?

2019-12-18 Thread Asmus Freytag via Unicode

  
  
On 12/17/2019 5:49 PM, James Kass via Unicode wrote:

> Asmus Freytag wrote,
>
> > And any recommendation that is not compatible with what the overwhelming
> > majority of software has been doing should be ignored (or only enabled on
> > explicit user input).
> >
> > Otherwise, you're just advocating for a massively breaking change.
>
> It seems like the recommendations are already in place and the
> “overwhelming majority of software” is already disregarding them.

so they are dead letter and should be deprecated...

> I don’t see the massively breaking change here.  Are there any
> illustrations?
>
> If legacy text containing NON-BREAK SPACE characters is popped into a
> justifier, the worst thing that can happen is that the text will be
> correctly justified under a revised application.  That’s not breaking
> anything, it’s fixing it.  Unlike changing the font-face, font size, or
> page width (which often results in reformatting the text), the line
> breaks are calculated before justification occurs.
>
> If a string of NON-BREAK SPACE characters appears in an HTML file, the
> browser should proportionally adjust all of those space characters
> identically with the “normal” space characters.  This should preserve
> the authorial intent.
>
> As for pre-Unicode usage of NON-BREAK SPACE, were there ever any explicit
> guidelines suggesting that the normal SPACE character should expand or
> contract for justification but that the NON-BREAK SPACE must not expand
> or contract?
  
  
  



  



Re: NBSP supposed to stretch, right?

2019-12-17 Thread James Kass via Unicode



Asmus Freytag wrote,

> And any recommendation that is not compatible with what the overwhelming
> majority of software has been doing should be ignored (or only enabled on
> explicit user input).
>
> Otherwise, you're just advocating for a massively breaking change.

It seems like the recommendations are already in place and the 
“overwhelming majority of software” is already disregarding them.


I don’t see the massively breaking change here.  Are there any 
illustrations?


If legacy text containing NON-BREAK SPACE characters is popped into a 
justifier, the worst thing that can happen is that the text will be 
correctly justified under a revised application.  That’s not breaking 
anything, it’s fixing it.  Unlike changing the font-face, font size, or 
page width (which often results in reformatting the text), the line 
breaks are calculated before justification occurs.


If a string of NON-BREAK SPACE characters appears in an HTML file, the 
browser should proportionally adjust all of those space characters 
identically with the “normal” space characters.  This should preserve 
the authorial intent.


As for pre-Unicode usage of NON-BREAK SPACE, were there ever any explicit 
guidelines suggesting that the normal SPACE character should expand or 
contract for justification but that the NON-BREAK SPACE must not expand 
or contract?




Re: NBSP supposed to stretch, right?

2019-12-17 Thread Asmus Freytag via Unicode

  
  
On 12/17/2019 11:31 AM, James Kass via Unicode wrote:

> So it follows that any justification operation should treat NO-BREAK
> SPACE and SPACE identically.

And any recommendation that is not compatible with what the overwhelming majority of software has been doing should be ignored (or only enabled on explicit user input).

Otherwise, you're just advocating for a massively breaking change.

NBSP has been supported since way before Unicode. It's way past the point where we can legislate behavior other than the de-facto consensus among implementations.

Now, if someone can show us that there are widespread implementations that follow the above recommendation and have no interoperability issues with HTML then I may change my tune.

A./

  



Re: NBSP supposed to stretch, right?

2019-12-17 Thread Richard Wordingham via Unicode
On Tue, 17 Dec 2019 06:20:39 +0530
Shriramana Sharma via Unicode  wrote:

> Hello. I've just tested LibreOffice, Google Docs and MS Office on
> Linux, Android and Windows, and it seems that NBSP doesn't get
> stretched like the normal space character when justified alignment
> requires it.
> 
> Let me explain. I'm creating a document with the following text
> typeset in 12 pt Lohit Tamil with justified alignment on an A5 page
> with 0.5" margin all around:
> 
> ஶ்ரீமத் மஹாபாரதம் என்பது நமது தேசத்தின் பெரும் இதிஹாஸமாகும். இதனை
> இயற்றியவர் ஶ்ரீ வேத வ்யாஸர். அவரால் அனுக்ரஹிக்கப்பட்டவையான நூல்கள் பல.
> 
> The screenshot
> https://sites.google.com/site/jamadagni/files/temp/nbsp-not-expanding.png
> may be useful to illustrate the situation. Readers may try such
> similar sentences in any software/platform of their choice and report
> as to what happens.
> 
> Here the problem arises with the phrase ஶ்ரீ வேத வ்யாஸர். The word
> ஶ்ரீ is a honorific applying to the following name of the sage வேத
> வ்யாஸர், so it would seem unsightly to the reader if it goes to the
> previous line, so I insert an NBSP between it and the name. (Isn't
> there such a stylistic convention in English where Mr doesn't stand at
> the end of a line? I don't know.)

It's not widely taught in so far as it exists.  I would avoid
placing the word at the end in wide columns, just as I suppress line
breaks in 'Figure 7' and '17 December', but I only apply it to short
adjuncts. However, I would find the use of narrower spacing somewhere
between acceptable and desirable.  Thai has a similar rule, where there
is generally no space between title and forename, but an obligatory
space between forename and surname.  To me, this is a continuation of
the principle that line-breaks within phrases make them more difficult
to understand.

> However, the phrase is shortly followed by a long word
> அனுக்ரஹிக்கப்பட்டவையான, which is too long to fit on the same line and
> hence goes to the next line, thereby increasing the inter-word spacing
> on its previous line significantly. But the NBSP after the honorific
> doesn't stretch, making the word layout unsightly.

The strategies to deal with this general problem in English are
hyphenation and abandoning justification.  In this particular case,
your text would benefit from using Knuth's algorithm for justification.

> IIUC, no-break space is just that: a space that doesn't permit a line
> break. This says nothing about it being fixed width.
> 
> Unicode 12.0 §2.3 on p 27 (55 of PDF) says:

You're assuming that TUS is a standard.  It's much more a collection of
influential recommendations.

Richard.



Re: NBSP supposed to stretch, right?

2019-12-17 Thread James Kass via Unicode



On 2019-12-17 10:37 AM, QSJN 4 UKR via Unicode wrote:

Agree.
By the way, it is common practice to use multiple nbsp in a row to
create a larger span. In my opinion, it is wrong to replace fixed
width spaces with non-breaking spaces.
Quote from Microsoft Typography Character design standards:
«The no-break space is not the same character as the figure space. The
figure space is not a character defined in most computer system's
current code pages. In some fonts this character's width has been
defined as equal to the figure width. This is an incorrect usage of
the character no-break space.»

The mention of code pages made me suspect that this quote was from an 
archived older web page, but it's current.  Here's the link:

https://docs.microsoft.com/en-us/typography/develop/character-design-standards/whitespace

Quoting from that same page,
"Advance width rule : The advance width of the no-break space should be 
equal to the width of the space."


So it follows that any justification operation should treat NO-BREAK 
SPACE and SPACE identically.




Re: NBSP supposed to stretch, right?

2019-12-17 Thread Asmus Freytag via Unicode

  
  
On 12/17/2019 2:41 AM, Shriramana Sharma via Unicode wrote:

> On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode wrote:
>
> > Agree.
> > By the way, it is common practice to use multiple nbsp in a row to
> > create a larger span. In my opinion, it is wrong to replace fixed
> > width spaces with non-breaking spaces.
> > Quote from Microsoft Typography Character design standards:
> > «The no-break space is not the same character as the figure space. The
> > figure space is not a character defined in most computer system's
> > current code pages. In some fonts this character's width has been
> > defined as equal to the figure width. This is an incorrect usage of
> > the character no-break space.»
>
> Sorry but I don't understand how this addresses the issue I raised.

You don't?

In principle it may be true that NBSP is not fixed width, but show me software that doesn't treat it that way.

In HTML, NBSP isn't subject to space collapse, therefore it's the go-to space character when you need some extra spacing that doesn't disappear.

I bet, in many other environments it was typically the only "other" space character, so it ended up overloaded.

My hunch is that it is too late at this point to try to promulgate a "clean" implementation of NBSP, because it would effectively change untold documents retroactively. So it would be a massively breaking change.

If you have a situation where you need really poor layout (wide inter-word spaces) to justify, the fact that an honorific in front of a name works more like it's part of the same word (because the NBSP doesn't stretch) would be the least of my worries. (Although, on lines where inter-word spaces are reduced a bit, I can see that becoming counter-intuitive).

If you only fix this in software for high-end typography, you'd still have the issue that things will behave differently if you export your (plain) text. And you would have the issue of what to do when you want fixed spaces to be non-breaking as well (is that ever needed?).

A./
  
  



Re: NBSP supposed to stretch, right?

2019-12-17 Thread Shriramana Sharma via Unicode
On Tue 17 Dec, 2019, 16:09 QSJN 4 UKR via Unicode wrote:

> Agree.
> By the way, it is common practice to use multiple nbsp in a row to
> create a larger span. In my opinion, it is wrong to replace fixed
> width spaces with non-breaking spaces.
> Quote from Microsoft Typography Character design standards:
> «The no-break space is not the same character as the figure space. The
> figure space is not a character defined in most computer system's
> current code pages. In some fonts this character's width has been
> defined as equal to the figure width. This is an incorrect usage of
> the character no-break space.»
>

Sorry but I don't understand how this addresses the issue I raised.