Unicode UAX #14 (Line Breaking Properties) also has a bit to say on the topic of line separators.

From http://www.unicode.org/reports/tr14/

BK - Mandatory Break (A) - (normative)

Explicit breaks act independently of the surrounding characters.

000C  FORM FEED

Form Feed separates pages. The text on the new page starts at
the beginning of the line. No paragraph formatting is applied.

2028  LINE SEPARATOR

The text after the Line Separator starts at the beginning of the line.
No paragraph formatting is applied.

This is similar to HTML <BR>.


2029  PARAGRAPH SEPARATOR

The text of the new paragraph starts at the beginning of the line.
Paragraph formatting is applied.

“NEW LINE FUNCTION (NLF)”


New line functions provide additional explicit breaks.
They are not individual characters, but are expressed as sequences
of control characters NEL, LF, and CR. What particular sequence(s)
form a NLF depends on the implementation and other circumstances
as described in [Unicode] Section 5.8, Newline Guidelines.

If a character sequence for a new line function contains more than
one character, it is kept together. The default behavior is to break
after LF or CR, but not between CR and LF. Two additional line
breaking classes have been added for convenience in this operation.


Mandatory breaks:

LB 3a Always break after hard line breaks (but never between CR and LF).

BK !

LB 3b Treat CR followed by LF, as well as CR, LF and NL, as hard line breaks.

CR × LF
CR !
LF !
NL !
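
As a rough illustration of rules LB 3a and LB 3b quoted above, here is a minimal Python sketch that cuts a string at the mandatory breaks only. It is not taken from UAX #14 or from any library: the function name split_at_mandatory_breaks and the pattern name _HARD_BREAK are made up for this example, and the set of BK characters is assumed to be just the three listed above (000C, 2028, 2029). The remaining line breaking rules are ignored.

```python
import re

# Hard line breaks assumed by this sketch (from the text quoted above):
#   BK: FORM FEED (U+000C), LINE SEPARATOR (U+2028), PARAGRAPH SEPARATOR (U+2029)
#   CR (U+000D), LF (U+000A), NL, i.e. NEL (U+0085)
# "\r\n" comes first in the alternation so that a CR LF pair is consumed
# as one break and never split in the middle (CR × LF).
_HARD_BREAK = re.compile("\r\n|[\r\n\x0c\x85\u2028\u2029]")


def split_at_mandatory_breaks(text):
    """Split text after each hard line break (hypothetical helper).

    Every returned piece keeps its trailing break sequence, so joining
    the pieces gives back the original string.
    """
    pieces = []
    start = 0
    for match in _HARD_BREAK.finditer(text):
        # Break *after* the hard line break (BK !, CR !, LF !, NL !).
        pieces.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        # Trailing text with no break after it.
        pieces.append(text[start:])
    return pieces
```

Calling split_at_mandatory_breaks("a\r\nb\rc") returns three pieces: "a\r\n", "b\r" and "c". The CR LF pair stays together, while the lone CR ends its own line.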



-- -- Andy Heninger [EMAIL PROTECTED]




