Hey Mark,

The most commonly used delimiter for cases like this is ^A (character 1).

-Todd

On Mon, Oct 12, 2009 at 7:56 PM, Mark Kerzner <[email protected]> wrote:

> Thanks, that is a great answer.
> My problem is that the application that reads my output accepts a
> comma-separated file with extended ASCII delimiters. Following your answer,
> however, I will try to use low-value ASCII, like 9 or 11, unless someone has
> a better suggestion.
>
> Thank you,
> Mark
>
> On Fri, Oct 9, 2009 at 6:49 PM, Todd Lipcon <[email protected]> wrote:
>
> > Hi Mark,
> >
> > If you're using TextOutputFormat, it assumes you're dealing in UTF8. Decimal
> > 254 wouldn't be valid as a standalone character in UTF8 encoding.
> >
> > If you're dealing with binary (ie non-textual) data, you shouldn't use
> > TextOutputFormat.
> >
> > -Todd
> >
> > On Fri, Oct 9, 2009 at 3:09 PM, Mark Kerzner <[email protected]> wrote:
> >
> > > Hi,
> > > the strings I am writing in my reducer have characters that may present a
> > > problem, such as char represented by decimal 254, which is hex FE. It seems
> > > that instead I see hex C3, or something else is messed up. Or my
> > > understanding is messed up :)
> > >
> > > Any advice?
> > >
> > > Thank you,
> > > Mark
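
For anyone who wants to see the byte-level effect discussed in this thread, here is a minimal sketch in plain Python. It is not Hadoop code. It assumes only what Todd states above (that the text output is written as UTF-8) and uses Python's own encoder as a stand-in. The variable names and the sample fields are made up for illustration.

```python
# Illustrative sketch only: Python's UTF-8 encoder stands in for whatever
# writes the text output. Variable names and sample fields are hypothetical.

ch = chr(254)  # the character with code point 254 (U+00FE)

# Encoded as UTF-8, code point 254 becomes two bytes, the first being 0xC3.
utf8_bytes = ch.encode("utf-8")
print(utf8_bytes.hex())  # c3be

# In a single-byte encoding such as Latin-1 it is the one byte 0xFE.
latin1_bytes = ch.encode("latin-1")
print(latin1_bytes.hex())  # fe

# A lone 0xFE byte is not valid UTF-8, so it cannot be decoded as text.
try:
    b"\xfe".decode("utf-8")
    lone_fe_is_valid = True
except UnicodeDecodeError:
    lone_fe_is_valid = False
print(lone_fe_is_valid)  # False

# Low-value delimiters such as ^A (1), tab (9), or vertical tab (11) are
# below 128, so UTF-8 encodes each of them as the same single byte.
delim = "\x01"  # ^A
fields = ["field1", "field2"]  # hypothetical record
encoded = delim.join(fields).encode("utf-8")
print(encoded)  # b'field1\x01field2'

# The record splits back cleanly on the same delimiter after decoding.
decoded_fields = encoded.decode("utf-8").split(delim)
```

This is consistent with the hex C3 Mark reported: the first byte of the two-byte UTF-8 form of character 254 is C3. A delimiter below 128 avoids the issue because its UTF-8 form is a single, unchanged byte.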
