Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Eric Fischer
Or actually I *won't* necessarily have to change my version of tr, because the real point of this thread isn't to get my own changes accepted, it's to get *some* reasonable multibyte implementation of the utilities, regardless of whose it is, into the standard coreutils distribution. Eric

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Eric Fischer
OK, that seems reasonable, since as far as I know, no one implements the POSIX notation for constructing multibyte characters out of adjacent octal escapes anyway, and the standard has already backed off from supporting them in ranges. I'll have to change mine to leave characters decomposed until

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Assaf Gordon
Hello,

On Wed, Jan 17, 2018 at 02:53:21PM -0800, Eric Fischer wrote:
> * My tr will not remove bytes from the middle of characters
> [...]
> is arguably an error in the test, because POSIX specifies that octal
> escapes represent characters, not bytes.

Please see previous discussion here:

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Eric Fischer
I am now tracking which of Assaf's tests my implementation passes and fails in https://github.com/ericfischer/coreutils/issues/2. The ones that fail seem to be because:

* I have not implemented cut -n
* My tr will not remove bytes from the middle of characters
* Linux and MacOS disagree about

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Eric Fischer
Thanks for the feedback. To clear one thing up at the start: I am not Eric Blake, so the earlier cut -d patch is not mine. Thanks also for clarifying the license requirements. I will follow up with Mapbox legal to find out how we can work with this. Sebastian, I think you may have been testing

Re: Why cut treats one column input differently for out-of-range field spec?

2018-01-17 Thread Pádraig Brady
On 17/01/18 06:16, Peng Yu wrote:
> Hi,
>
> If there is only one column in the input, then an out-of-range field
> spec will result in the print of the whole line.
>
> $ cut -f 3 <<< $'a' | xxd
> 000: 610a a.
>
> Otherwise, an empty string is printed.

Why cut treats one column input differently for out-of-range field spec?

2018-01-17 Thread Peng Yu
Hi,

If there is only one column in the input, then an out-of-range field spec will result in the print of the whole line.

$ cut -f 3 <<< $'a' | xxd
000: 610a a.

Otherwise, an empty string is printed.

$ cut -f 3 <<< $'a\tb' | xxd
000: 0a
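
The difference reported here is consistent with how POSIX specifies cut -f: a line that contains no field delimiter is passed through unchanged unless -s is given, whereas a line that does contain the delimiter yields an empty field when the requested field is out of range. A minimal sketch of the three cases follows. The variable names are illustrative and not from the thread, and printf is used in place of the here-strings so the example also runs in a plain POSIX sh.

```shell
# Line without a tab: cut -f passes the whole line through.
no_delim=$(printf 'a\n' | cut -f 3)

# Same line with -s: lines lacking the delimiter are suppressed.
suppressed=$(printf 'a\n' | cut -s -f 3)

# Line with a tab but only two fields: field 3 is empty.
out_of_range=$(printf 'a\tb\n' | cut -f 3)

printf 'no delimiter:  [%s]\n' "$no_delim"
printf 'with -s:       [%s]\n' "$suppressed"
printf 'out of range:  [%s]\n' "$out_of_range"
```

If the pass-through of single-column lines is unwanted, adding -s is probably the simplest way to make both kinds of input produce empty output for an out-of-range field.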

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Assaf Gordon
Hello,

On 2018-01-17 12:45 AM, Sebastian Kisela wrote:
> I have checked the Eric's effort on the multibyte support for coreutils. The work done seems solid.

Thank you for pitching in to the multibyte effort! (and your previous patch for "cut -d" is on my TODO list, I haven't forgotten it).

Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr

2018-01-17 Thread Assaf Gordon
Hello,

On 2018-01-10 01:20 PM, Eric Fischer wrote:
> I have requested and received the copyright assignment paperwork,

Thank you for doing that.

> but my employer would like to dedicate my changes to the public domain or as CC0 rather than assign or disclaim copyright. Would this be