cut with multibyte support for delimiter

Sebastian Kisela Mon, 18 Sep 2017 07:26:49 -0700

Hi,

based on the discussion at:
https://lists.gnu.org/archive/html/coreutils/2017-08/msg00029.html


I implemented multibyte delimiter support for cut (cut -d'\unicode' -f ).
The delimiter is handled as a plain string (char*) in a separate, newly
added function, to avoid compatibility issues with "wchar_t".
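
To illustrate the string-based approach, here is a minimal sketch of
locating a field when the delimiter is matched as a byte string rather than
as a "wchar_t". It is not code from the attached patch: the helper name
find_field and its signature are hypothetical. Byte-wise matching with
strstr is safe for valid UTF-8 input, because the encoding of one character
never appears inside the encoding of another.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: locate the N-th field
   (1-based) of LINE, where fields are separated by the byte string DELIM.
   Store the field length in *LEN and return a pointer to its first byte,
   or return NULL if LINE has fewer than N fields.  */
const char *
find_field (const char *line, const char *delim, size_t n, size_t *len)
{
  assert (line != NULL && delim != NULL && len != NULL);

  size_t dlen = strlen (delim);
  const char *start = line;

  if (n == 0 || dlen == 0)
    return NULL;

  for (size_t i = 1; ; i++)
    {
      /* Next occurrence of the delimiter bytes, or NULL on the last field.  */
      const char *end = strstr (start, delim);

      if (i == n)
        {
          *len = end ? (size_t) (end - start) : strlen (start);
          return start;
        }
      if (end == NULL)
        return NULL;

      /* Skip the whole delimiter, however many bytes it occupies.  */
      start = end + dlen;
    }
}
```

With the euro sign (bytes E2 82 AC) as delimiter, asking for field 2 of
"a€bc€d" yields a pointer to "bc" with a length of 2.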

I have not added any sophisticated error checking of the delimiter value,
as giving a wrong value simply passes the input "as-is" to the output. The
only checks are that no more than one delimiter is given and that the
current locale is utf8.
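
The two checks could look roughly like the sketch below. It is not code from
the attached patch, and the function names are hypothetical. It assumes that
the program has already called setlocale (LC_ALL, "") and that the platform
reports the codeset as "UTF-8" (glibc does; other systems may spell it
differently). The delimiter check is structural only: it does not reject
overlong encodings or surrogates.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the current locale's codeset is UTF-8.
   Assumes setlocale (LC_ALL, "") has already been called.  */
bool
locale_is_utf8 (void)
{
  return strcmp (nl_langinfo (CODESET), "UTF-8") == 0;
}

/* Hypothetical helper: true if DELIM holds exactly one UTF-8 encoded
   character.  Only the lead byte and continuation bytes are inspected.  */
bool
delim_is_single_utf8_char (const char *delim)
{
  assert (delim != NULL);

  const unsigned char *p = (const unsigned char *) delim;
  size_t need;                  /* number of continuation bytes expected */

  if (p[0] == '\0')
    return false;               /* empty delimiter */
  else if (p[0] < 0x80)
    need = 0;                   /* ASCII */
  else if ((p[0] & 0xE0) == 0xC0)
    need = 1;
  else if ((p[0] & 0xF0) == 0xE0)
    need = 2;
  else if ((p[0] & 0xF8) == 0xF0)
    need = 3;
  else
    return false;               /* stray continuation or invalid lead byte */

  for (size_t i = 1; i <= need; i++)
    if ((p[i] & 0xC0) != 0x80)
      return false;             /* also catches a premature NUL */

  /* Anything after the first character means multiple delimiters.  */
  return p[need + 1] == '\0';
}
```

A caller would refuse a multibyte delimiter unless both functions return
true; a single-byte delimiter can keep going through the existing code path.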

For now I decided to accept only utf8 locales when dealing with a multibyte
delimiter, as I agree with Assaf and Pádraig that having utf8 support is a
better option than the current state.

I added some tests, adapted from the existing tests in 'cut.pl'.

I will be thankful for any feedback.

Enjoy the week!
Sebastián.

cut-mb-delim-support.tar.gz
Description: GNU Zip compressed data
