Hi,

Based on the discussion at:
https://lists.gnu.org/archive/html/coreutils/2017-08/msg00029.html

I implemented support for a multibyte delimiter in cut (cut -d'\unicode'
-f ). The delimiter is handled as a plain string (char *) in a separate,
newly added function, to avoid compatibility issues with "wchar_t".
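
As an illustration only (this is not the code from the attached patch), the
char * approach can be sketched as a byte-wise search for the delimiter.
The name find_field and its interface are hypothetical. The sketch assumes
both the line and the delimiter are valid UTF-8; UTF-8 is self-synchronizing,
so a byte-wise match cannot start in the middle of another character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: locate field N (1-based)
   of LINE, where fields are separated by the byte string DELIM.
   On success store the start of the field in *START and return its length
   in bytes.  If LINE has fewer than N fields, store NULL and return 0.  */
static size_t
find_field (const char *line, const char *delim, size_t n, const char **start)
{
  size_t delim_len = strlen (delim);
  assert (delim_len > 0 && n > 0);

  const char *p = line;
  for (size_t i = 1; ; i++)
    {
      /* Plain byte search; no wchar_t conversion is needed.  */
      const char *next = strstr (p, delim);
      if (i == n)
        {
          *start = p;
          return next ? (size_t) (next - p) : strlen (p);
        }
      if (!next)
        {
          *start = NULL;
          return 0;
        }
      p = next + delim_len;   /* Skip the whole multibyte delimiter.  */
    }
}
```

With the line "a→b→c" and the delimiter "→" (bytes E2 86 92), asking for
field 2 yields a pointer to "b" and a length of 1.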

I have not added any sophisticated error checking of the delimiter
value, because a wrong value simply passes the input "as-is" to the
output. The only checks are that there are not multiple delimiters and
that the current locale is UTF-8.
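
A minimal sketch of what these two checks could look like follows; it is an
illustration, not the code from the attached patch. The function names are
hypothetical. It assumes that nl_langinfo (CODESET) reports "UTF-8" for such
locales (as glibc does; other platforms may spell the name differently) and
that setlocale (LC_ALL, "") has already been called. The sequence check is
deliberately simplified and does not reject overlong forms or surrogates.

```c
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: true if the current locale's codeset is named "UTF-8".  */
static bool
locale_is_utf8 (void)
{
  return strcmp (nl_langinfo (CODESET), "UTF-8") == 0;
}

/* Return the length in bytes of the UTF-8 sequence at the start of S,
   or 0 if S is empty or does not start with a well-formed sequence.  */
static size_t
utf8_seq_len (const char *s)
{
  unsigned char c = (unsigned char) s[0];
  if (c == 0)
    return 0;
  size_t len = c < 0x80 ? 1
               : (c & 0xE0) == 0xC0 ? 2
               : (c & 0xF0) == 0xE0 ? 3
               : (c & 0xF8) == 0xF0 ? 4 : 0;
  /* Every following byte must be a continuation byte (10xxxxxx).
     A terminating NUL fails this test, so we never read past the end.  */
  for (size_t i = 1; i < len; i++)
    if (((unsigned char) s[i] & 0xC0) != 0x80)
      return 0;
  return len;
}

/* Hypothetical: true if DELIM consists of exactly one UTF-8 character,
   that is, there are not multiple delimiters.  */
static bool
is_single_utf8_char (const char *delim)
{
  size_t n = utf8_seq_len (delim);
  return n > 0 && delim[n] == '\0';
}
```

Under these assumptions, ":" and "→" are accepted as delimiters, while "::",
an empty string, or a truncated sequence are rejected.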

For now I decided to accept only UTF-8 locales when dealing with a multibyte
delimiter, as I agree with Assaf and Pádraig that having UTF-8 support is a
better option than the current state.

I added some tests, adapted from the existing ones in 'cut.pl'.

I will be thankful for any feedback.

Enjoy the week!
Sebastián.

Attachment: cut-mb-delim-support.tar.gz
Description: GNU Zip compressed data
