Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-11 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 15:22:02 -0400 "Theodore Ts'o" wrote: > On Mon, May 10, 2021 at 02:49:44PM +0100, David Woodhouse wrote: > > On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote: > > > This patch series is doing conversion only when using ASCII makes > > > more sense than using

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-11 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 14:49:44 +0100 David Woodhouse wrote: > On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote: > > This patch series is doing conversion only when using ASCII makes > > more sense than using UTF-8. > > > > See, a number of converted documents ended with weird cha

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-11 Thread David Woodhouse
On Tue, 2021-05-11 at 11:00 +0200, Mauro Carvalho Chehab wrote: > Yet, this series has two positive side effects: > > - it helps people needing to touch the documents using non-utf8 locales[1]; > - it makes easier to grep for a text; > > [1] There are still some widely used distros nowadays (LT

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-11 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 15:33:47 +0100 Edward Cree wrote: > On 10/05/2021 14:59, Matthew Wilcox wrote: > > Most of these > > UTF-8 characters come from latex conversions and really aren't > > necessary (and are being used incorrectly). > I fully agree with fixing those. > The cover-letter, howev

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Adam Borowski
On Mon, May 10, 2021 at 12:26:12PM +0200, Mauro Carvalho Chehab wrote: > There are several UTF-8 characters at the Kernel's documentation. [...] > Other UTF-8 characters were added along the time, but they're easily > replaceable by ASCII chars. > > As Linux developers are all around the globe, an

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Theodore Ts'o
On Mon, May 10, 2021 at 02:49:44PM +0100, David Woodhouse wrote: > On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote: > > This patch series is doing conversion only when using ASCII makes > > more sense than using UTF-8. > > > > See, a number of converted documents ended with weird c

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 14:38, Mauro Carvalho Chehab wrote: > On Mon, 10 May 2021 14:16:16 +0100 > Edward Cree wrote: >> But what kinds of things with × or — in are going to be grept for? > > Actually, on almost all places, those aren't used inside math formulae, but > instead, they describe video some

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 14:59, Matthew Wilcox wrote: > Most of these > UTF-8 characters come from latex conversions and really aren't > necessary (and are being used incorrectly). I fully agree with fixing those. The cover-letter, however, gave the impression that that was not the main purpose of this serie

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Ben Boeckel
On Mon, May 10, 2021 at 13:55:18 +0200, Mauro Carvalho Chehab wrote: > $ git grep "CPU 0 has been" Documentation/RCU/ > Documentation/RCU/Design/Data-Structures/Data-Structures.rst:| #. CPU 0 > has been in dyntick-idle mode for quite some time. When it | > Documentation/RCU/Desig

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Edward Cree
On 10/05/2021 12:55, Mauro Carvalho Chehab wrote: > The main point on this series is to replace just the occurrences > where ASCII represents the symbol equally well > - U+2014 ('—'): EM DASH Em dash is not the same thing as hyphen-minus, and the latter does not serve 'equally well'. Peopl

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Thorsten Leemhuis
On 10.05.21 12:26, Mauro Carvalho Chehab wrote: > > As Linux developers are all around the globe, and not everybody has UTF-8 > as their default charset, better to use UTF-8 only on cases where it is really > needed. > […] > The remaining patches on series address such cases on *.rst files and >

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Matthew Wilcox
On Mon, May 10, 2021 at 02:16:16PM +0100, Edward Cree wrote: > On 10/05/2021 12:55, Mauro Carvalho Chehab wrote: > > The main point on this series is to replace just the occurrences > > where ASCII represents the symbol equally well > > > - U+2014 ('—'): EM DASH > Em dash is not the same thing

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread David Woodhouse
On Mon, 2021-05-10 at 13:55 +0200, Mauro Carvalho Chehab wrote: > This patch series is doing conversion only when using ASCII makes > more sense than using UTF-8. > > See, a number of converted documents ended with weird characters > like ZERO WIDTH NO-BREAK SPACE (U+FEFF) character. This specifi

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 14:16:16 +0100 Edward Cree wrote: > On 10/05/2021 12:55, Mauro Carvalho Chehab wrote: > > The main point on this series is to replace just the occurrences > > where ASCII represents the symbol equally well > > > - U+2014 ('—'): EM DASH > Em dash is not the same thi

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 13:19:50 +0200 Mauro Carvalho Chehab wrote: > On Mon, 10 May 2021 12:52:44 +0200 > Thorsten Leemhuis wrote: > > > On 10.05.21 12:26, Mauro Carvalho Chehab wrote: > > > > > > As Linux developers are all around the globe, and not everybody has UTF-8 > > > as their defa

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Mauro Carvalho Chehab
Hi David, On Mon, 10 May 2021 11:54:02 +0100 David Woodhouse wrote: > On Mon, 2021-05-10 at 12:26 +0200, Mauro Carvalho Chehab wrote: > > There are several UTF-8 characters at the Kernel's documentation. > > > > Several of them were due to the process of converting files from > > DocBook, La

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Mauro Carvalho Chehab
On Mon, 10 May 2021 12:52:44 +0200 Thorsten Leemhuis wrote: > On 10.05.21 12:26, Mauro Carvalho Chehab wrote: > > > > As Linux developers are all around the globe, and not everybody has UTF-8 > > as their default charset, better to use UTF-8 only on cases where it is > > really > > needed. >

Re: [PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread David Woodhouse
On Mon, 2021-05-10 at 12:26 +0200, Mauro Carvalho Chehab wrote: > There are several UTF-8 characters at the Kernel's documentation. > > Several of them were due to the process of converting files from > DocBook, LaTeX, HTML and Markdown. They were probably introduced > by the conversion tools used

[PATCH 00/53] Get rid of UTF-8 chars that can be mapped as ASCII

2021-05-10 Thread Mauro Carvalho Chehab
There are several UTF-8 characters at the Kernel's documentation. Several of them were due to the process of converting files from DocBook, LaTeX, HTML and Markdown. They were probably introduced by the conversion tools used on that time. Other UTF-8 characters were added along the time, but they