*** This bug is a duplicate of bug 875713 ***
    https://bugs.launchpad.net/bugs/875713

** This bug has been marked a duplicate of bug 875713
   cut fails to handle correctly utf-8

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to coreutils in Ubuntu.
https://bugs.launchpad.net/bugs/91175

Title:
  cut gets confused with UTF-8 characters

Status in coreutils package in Ubuntu:
  Triaged

Bug description:
  Binary package hint: coreutils

  GNU cut treats the positions given to --characters as byte offsets, so
  it splits multibyte characters in UTF-8 encoded files.

  An example, as they (almost) say, is worth a thousand words:

  nslater@hinata: ~ $ locale
  LANG=en_US.UTF-8
  LC_CTYPE="en_US.UTF-8"
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE="en_US.UTF-8"
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=
  nslater@hinata: ~ $ cat foo.txt
  She said “I think I found a bug.”
  nslater@hinata: ~ $ cat foo.txt | cut --characters 10-
  “I think I found a bug.”
  nslater@hinata: ~ $ cat foo.txt | cut --characters 11-
  ��I think I found a bug.”
  nslater@hinata: ~ $ cat foo.txt | cut --characters 12-
  �I think I found a bug.”
  nslater@hinata: ~ $ cat foo.txt | cut --characters 13-
  I think I found a bug.”
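
  The transcript above is consistent with cut counting bytes rather than
  characters: the opening quote “ occupies three bytes in UTF-8, so
  positions 11 and 12 land inside it. The following is a minimal Python
  sketch of that difference, not GNU cut's actual implementation; the
  helper names cut_bytes and cut_chars are hypothetical and exist only
  for illustration.

```python
# Sketch: byte-based vs. character-based cutting of the line from foo.txt.
# cut_bytes and cut_chars are hypothetical helpers, not part of coreutils.
line = "She said “I think I found a bug.”"


def cut_bytes(text: str, start: int) -> str:
    """Keep everything from the 1-based byte position `start` onward.

    Bytes that no longer form a valid UTF-8 sequence are shown as U+FFFD,
    similar to what a terminal displays for a broken sequence.
    """
    raw = text.encode("utf-8")[start - 1:]
    return raw.decode("utf-8", errors="replace")


def cut_chars(text: str, start: int) -> str:
    """Keep everything from the 1-based character position `start` onward."""
    return text[start - 1:]


# Compare both interpretations for the positions used in the report.
for start in (10, 11, 12, 13):
    print(start, repr(cut_bytes(line, start)), repr(cut_chars(line, start)))
```

  With character-based counting, position 11 would already start at
  "I think", which is presumably what the reporter expected from
  --characters 11-.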
