Hey Mark,

The most commonly used delimiter for cases like this is ^A (Ctrl-A, ASCII character 1).
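
To illustrate why ^A is a safer choice than decimal 254, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes the output is encoded as UTF-8, as discussed earlier in the thread. The class name `DelimiterDemo` and the helpers `utf8Hex` and `joinFields` are made up for this example and are not part of any Hadoop API.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: DelimiterDemo, utf8Hex and joinFields are hypothetical names.
class DelimiterDemo {

    // Returns the UTF-8 bytes of s as space-separated uppercase hex pairs.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Joins fields with the Ctrl-A character (code point 1) as the delimiter.
    static String joinFields(String... fields) {
        return String.join("\u0001", fields);
    }

    public static void main(String[] args) {
        // Code point 254 needs two bytes in UTF-8, which is where the C3 comes from.
        System.out.println(utf8Hex("\u00FE")); // prints "C3 BE"

        // Code point 1 is plain ASCII, so it stays a single byte.
        System.out.println(utf8Hex("\u0001")); // prints "01"

        // Commas inside a field no longer clash with the delimiter.
        String record = joinFields("a", "b,c", "d");
        System.out.println(record.split("\u0001").length); // prints "3"
    }
}
```

Any code point below 128 (including 9 and 11) is written as a single unchanged byte in UTF-8, so the reading application would see exactly the delimiter byte it expects.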

-Todd

On Mon, Oct 12, 2009 at 7:56 PM, Mark Kerzner <[email protected]> wrote:

> Thanks, that is a great answer.
> My problem is that the application that reads my output accepts a
> comma-separated file with extended ASCII delimiters. Following your answer,
> however, I will try to use low-value ASCII, like 9 or 11, unless someone has
> a better suggestion.
>
> Thank you,
> Mark
>
> On Fri, Oct 9, 2009 at 6:49 PM, Todd Lipcon <[email protected]> wrote:
>
> > Hi Mark,
> >
> > If you're using TextOutputFormat, it assumes you're dealing in UTF-8.
> > Decimal 254 wouldn't be valid as a standalone byte in UTF-8 encoding.
> >
> > If you're dealing with binary (i.e., non-textual) data, you shouldn't use
> > TextOutputFormat.
> >
> > -Todd
> >
> > On Fri, Oct 9, 2009 at 3:09 PM, Mark Kerzner <[email protected]> wrote:
> >
> > > Hi,
> > > the strings I am writing in my reducer have characters that may present a
> > > problem, such as the char represented by decimal 254, which is hex FE. It
> > > seems that instead I see hex C3, or something else is messed up. Or my
> > > understanding is messed up :)
> > >
> > > Any advice?
> > >
> > > Thank you,
> > > Mark
> > >
> >
>
