On Thu, Jun 11, 2015 at 1:27 PM, Steven D'Aprano wrote:
> On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
> [...]
>>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>>> not English subtitles?
>>
>> No, they're not :) The character comes up in the Cantonese and
>> Japane
On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
[...]
>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>> not English subtitles?
>
> No, they're not :) The character comes up in the Cantonese and
> Japanese subs for Once Upon A December.
>
> http://youtu.be/CEpcUeWP0b
On Thu, Jun 11, 2015 at 1:18 PM, wrote:
> On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote:
>> http://youtu.be/CEpcUeWP0bg
>> http://youtu.be/WFZAaHrHens
>
> An example of the actual subtitle text would be more useful than a
> youtube link to the video, since we're unlikely to be able to see
On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote:
> http://youtu.be/CEpcUeWP0bg
> http://youtu.be/WFZAaHrHens
An example of the actual subtitle text would be more useful than a
youtube link to the video, since we're unlikely to be able to see what
context the character appears in if our client
On Thu, Jun 11, 2015 at 12:26 PM, Steven D'Aprano wrote:
> No, despite the name, that is not a space character, it is a formatting
> character. Due to Unicode's stability policy, the name is stuck forever,
> but it should not be treated as a space character:
>
> py> unicodedata.category(' ')
> 'Zs
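
As a rough illustration of the space-versus-formatting distinction drawn above, the sketch below prints the Unicode general category and the str.isspace() result for the characters named in this thread. The dictionary and variable names are made up for the example, and the exact values depend on the Unicode database bundled with your Python build.

```python
import unicodedata

# Characters discussed in the thread, keyed by their Unicode names.
chars = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00a0",
    "NARROW NO-BREAK SPACE": "\u202f",
    "ZERO WIDTH NO-BREAK SPACE": "\ufeff",
}

# Map each name to (general category, result of str.isspace()).
report = {
    name: (unicodedata.category(ch), ch.isspace())
    for name, ch in chars.items()
}

for name, (category, is_space) in report.items():
    print(f"{name:26} {category}  isspace={is_space}")
```

The three real spaces should report category Zs (space separator), while U+FEFF should report Cf (format) and is not treated as whitespace by str.isspace().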
On Thu, 11 Jun 2015 10:09 am, Chris Angelico wrote:
> On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano wrote:
>> (Oh, and for the record, there are at least two non-breaking spaces in
>> Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)
>>
>> http://www.unicode.org/charts/PD
On Thu, Jun 11, 2015 at 11:02 AM, wrote:
>
> On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote:
> > And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as
> > the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
> > been fighting with VLC Media Player ov
On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote:
> And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as
> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
> been fighting with VLC Media Player over the font it uses for subtitles; for
> some bizarre
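
To make the byte-order-mark use of U+FEFF concrete, here is a minimal sketch of how it can end up inside decoded text. It assumes the data is UTF-8 with a leading BOM; nothing here is taken from the actual subtitle files, and the variable names are invented for the example.

```python
# Encoding with "utf-8-sig" prepends the UTF-8 form of U+FEFF (EF BB BF).
raw = "hello".encode("utf-8-sig")

# A plain UTF-8 decode keeps the mark as an ordinary character.
as_utf8 = raw.decode("utf-8")

# The "utf-8-sig" codec strips one leading mark while decoding.
as_sig = raw.decode("utf-8-sig")

# U+FEFF is not whitespace to Python, so strip() leaves it in place.
still_marked = as_utf8.strip()

print(repr(raw), repr(as_utf8), repr(as_sig), repr(still_marked))
```

A U+FEFF in the middle of a string is not touched by either codec, so text that carries it mid-line would need an explicit str.replace("\ufeff", "") if the character is unwanted.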
On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano wrote:
> (Oh, and for the record, there are at least two non-breaking spaces in
> Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)
>
> http://www.unicode.org/charts/PDF/U0080.pdf
> http://www.unicode.org/charts/PDF/U2000.pdf
An
On Thu, 11 Jun 2015 12:28 am, Skip Montanaro wrote:
> On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote:
>> Is this a bug?
>
> Looks like it's been reported a few times with slightly different context:
>
> https://bugs.python.org/issue6537
> https://bugs.python.org/issue16623
> https://bugs.py
On Wed, Jun 10, 2015, at 11:03, Laura Creighton wrote:
> In these unicode days, this thinking may need to be revisited. There
> are many languages where whitespace does not separate words -- either
> words aren't separated, or in Vietnamese, spaces separate syllables,
> so entire words have spaces
In a message of Wed, 10 Jun 2015 09:28:24 -0500, Skip Montanaro writes:
>On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote:
>> Is this a bug?
>
>Looks like it's been reported a few times with slightly different context:
>
>https://bugs.python.org/issue6537
>https://bugs.python.org/issue16623
>http
On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote:
> Is this a bug?
Looks like it's been reported a few times with slightly different context:
https://bugs.python.org/issue6537
https://bugs.python.org/issue16623
https://bugs.python.org/issue20491
https://bugs.python.org/issue1390608
The couple t
On 10/06/2015 14:28, Tim Chase wrote:
str.split() doesn't seem to respect non-breaking space:
Python 3.4.2 (default, Oct 8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(repr("hello\N{NO-BREAK SPACE}world".split(
str.split() doesn't seem to respect non-breaking space:
Python 3.4.2 (default, Oct 8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print(repr("hello\N{NO-BREAK SPACE}world".split()))
['hello', 'world']
What's the purpos
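
For anyone who wants NO-BREAK SPACE to stay inside a word, the sketch below contrasts the no-argument split shown above with two alternatives. This is only one possible workaround, not a recommendation from the thread, and the sample string and variable names are invented for the example.

```python
import re

text = "hello\N{NO-BREAK SPACE}world foo"

# With no argument, str.split() splits on any run of characters for
# which str.isspace() is true, and that includes U+00A0.
split_all = text.split()

# With an explicit separator, only that exact string is a delimiter.
split_on_space = text.split(" ")

# re.ASCII limits \s to ASCII whitespace, so U+00A0 is left alone.
split_ascii_ws = re.split(r"\s+", text, flags=re.ASCII)

print(split_all)
print(split_on_space)
print(split_ascii_ws)
```

Note that text.split(" ") and the regex version are not interchangeable in general: the former yields empty strings for consecutive spaces, and the latter yields empty strings for leading or trailing whitespace, so the choice depends on the input.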