Use ^A for the quote, ^B for the comma, and so on.

-- amr

Mark Kerzner wrote:
Thanks again, Todd. I need two delimiters, one for comma and one for quote.
But I guess I can use ^A for quote, and keep the comma as is, and I will be
good.
Sincerely,
Mark
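
[Editor's sketch, not from the thread] The scheme Mark describes (^A as
the quote character, a plain comma as the field separator) could look
roughly like this with Python's standard csv module. The helper names
format_row and parse_row are hypothetical, and the sketch assumes field
values never contain ^A themselves.

```python
import csv
import io

CTRL_A = "\x01"  # ^A, used here as the quote character (assumption from the thread)


def format_row(fields):
    """Write one record with ',' as separator and ^A around every field."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar=CTRL_A,
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buf.getvalue()


def parse_row(line):
    """Read one record back using the same separator and quote character."""
    reader = csv.reader(io.StringIO(line), delimiter=",", quotechar=CTRL_A)
    return next(reader)


if __name__ == "__main__":
    row = ["a,b", 'say "hi"']
    line = format_row(row)
    print(repr(line))
    print(parse_row(line))
```

Because the quote character is ^A, commas and ordinary double quotes
inside a field need no escaping.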

On Mon, Oct 12, 2009 at 10:15 PM, Todd Lipcon <[email protected]> wrote:

Hey Mark,

The most commonly used delimiter for cases like this is ^A (character
code 1).

-Todd
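
[Editor's sketch, not from the thread] A minimal illustration of why ^A
is a convenient delimiter: it is plain ASCII, so it stays a single byte
(0x01) when encoded as UTF-8, and it rarely occurs in real text. The
names FIELD_SEP, join_fields and split_fields are hypothetical.

```python
FIELD_SEP = "\x01"  # ^A, character code 1


def join_fields(fields):
    """Join fields with ^A, refusing values that already contain it."""
    for field in fields:
        if FIELD_SEP in field:
            raise ValueError("field contains the delimiter: %r" % field)
    return FIELD_SEP.join(fields)


def split_fields(line):
    """Split a ^A-delimited line back into its fields."""
    return line.split(FIELD_SEP)


if __name__ == "__main__":
    line = join_fields(["id42", "Smith, John", "note"])
    # ASCII control characters keep their one-byte form in UTF-8.
    print(line.encode("utf-8"))
    print(split_fields(line))
```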

On Mon, Oct 12, 2009 at 7:56 PM, Mark Kerzner <[email protected]> wrote:

Thanks, that is a great answer.
My problem is that the application that reads my output accepts a
comma-separated file with extended ASCII delimiters. Following your
answer, however, I will try to use low-value ASCII, like 9 or 11, unless
someone has a better suggestion.

Thank you,
Mark

On Fri, Oct 9, 2009 at 6:49 PM, Todd Lipcon <[email protected]> wrote:

Hi Mark,

If you're using TextOutputFormat, it assumes you're dealing in UTF-8.
Decimal 254 (hex FE) wouldn't be valid as a standalone byte in UTF-8
encoding.

If you're dealing with binary (i.e., non-textual) data, you shouldn't use
TextOutputFormat.

-Todd
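
[Editor's sketch, not from the thread] The hex C3 that Mark reports is
consistent with the character at code point 254 (U+00FE) being encoded
as UTF-8, where it becomes the two bytes C3 BE. The Python below only
illustrates the encoding itself; it does not touch Hadoop, and the
helper names are hypothetical.

```python
def utf8_hex(code_point):
    """Return the UTF-8 bytes of one code point as a hex string."""
    data = chr(code_point).encode("utf-8")
    return " ".join("%02x" % b for b in data)


def is_valid_utf8(data):
    """Report whether a byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(utf8_hex(254))                # two bytes, starting with c3
    print(is_valid_utf8(b"\xfe"))       # a lone FE byte is not UTF-8
    print(is_valid_utf8(b"\xc3\xbe"))   # the encoded form of U+00FE
```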

On Fri, Oct 9, 2009 at 3:09 PM, Mark Kerzner <[email protected]> wrote:

Hi,
the strings I am writing in my reducer have characters that may present a
problem, such as the char represented by decimal 254, which is hex FE. It
seems that instead I see hex C3, or something else is messed up. Or my
understanding is messed up :)

Any advice?

Thank you,
Mark

