Okay, FWIW I found the solution: https://issues.apache.org/jira/browse/MAPREDUCE-6085
Thanks to all who replied.

On Sep 11, 2014, at 11:16, Dmitry Sivachenko <trtrmi...@gmail.com> wrote:

> After a streaming job outputs some data to stdout, some Hadoop code receives it
> and splits it into a key/value pair before it reaches TextOutputFormat.
> Can anyone point me to that piece of code, please?
>
> Thanks!
>
> On Sep 11, 2014, at 0:37, Dmitry Sivachenko <trtrmi...@gmail.com> wrote:
>
>>
>> On Sep 10, 2014, at 22:33, Felix Chern <idry...@gmail.com> wrote:
>>
>>> Use ‘tr -s’ to strip out tabs?
>>>
>>> $ echo -e "a\t\t\tb"
>>> a                       b
>>>
>>> $ echo -e "a\t\t\tb" | tr -s "\t"
>>> a       b
>>>
>>
>> There can be tabs in the input, and I want to keep input lines without any
>> modification.
>>
>> Actually, it is a rather standard task: process lines one by one without
>> inserting extra characters. There should be a standard solution for it, IMO.
>>
>
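
For anyone finding this thread later, here is a rough model of the behavior discussed above, as far as I understand it. It is a simplified Python sketch, not the actual Hadoop code: the function names are made up for illustration, and it assumes the default tab separator, a split at the first tab, and an output step that always writes key, separator, value.

```python
# Simplified model of the key/value round trip (hypothetical names,
# not Hadoop's real implementation).

SEP = "\t"  # assumed default separator


def split_key_value(line, sep=SEP):
    """Split a line at the first separator; without one, the value is empty."""
    key, _, value = line.partition(sep)
    return key, value


def write_record(key, value, sep=SEP):
    """Join key and value with the separator, even when the value is empty."""
    return key + sep + value


def round_trip(line):
    """What a line looks like after being split and written back out."""
    key, value = split_key_value(line)
    return write_record(key, value)


if __name__ == "__main__":
    for line in ["a\t\t\tb", "no tabs here"]:
        print(repr(line), "->", repr(round_trip(line)))
```

Under these assumptions, a line that already contains a tab comes back unchanged, while a line without any tab gains a trailing tab. That is the kind of "extra character" the question is about, and squeezing tabs with tr does not address it.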