Thanks again, Todd.

I need two delimiters, one for comma and one for quote. But I guess I can use ^A for quote, and keep the comma as is, and I will be good.

Sincerely,
Mark
On Mon, Oct 12, 2009 at 10:15 PM, Todd Lipcon <[email protected]> wrote:

> Hey Mark,
>
> The most commonly used delimiter for cases like this is ^A (character 1)
>
> -Todd
>
> On Mon, Oct 12, 2009 at 7:56 PM, Mark Kerzner <[email protected]> wrote:
>
> > Thanks, that is a great answer.
> > My problem is that the application that reads my output accepts a
> > comma-separated file with extended ASCII delimiters. Following your answer,
> > however, I will try to use low-value ASCII, like 9 or 11, unless someone has
> > a better suggestion.
> >
> > Thank you,
> > Mark
> >
> > On Fri, Oct 9, 2009 at 6:49 PM, Todd Lipcon <[email protected]> wrote:
> >
> > > Hi Mark,
> > >
> > > If you're using TextOutputFormat, it assumes you're dealing in UTF8. Decimal
> > > 254 wouldn't be valid as a standalone character in UTF8 encoding.
> > >
> > > If you're dealing with binary (ie non-textual) data, you shouldn't use
> > > TextOutputFormat.
> > >
> > > -Todd
> > >
> > > On Fri, Oct 9, 2009 at 3:09 PM, Mark Kerzner <[email protected]> wrote:
> > >
> > > > Hi,
> > > > the strings I am writing in my reducer have characters that may present a
> > > > problem, such as char represented by decimal 254, which is hex FE. It seems
> > > > that instead I see hex C3, or something else is messed up. Or my
> > > > understanding is messed up :)
> > > >
> > > > Any advice?
> > > >
> > > > Thank you,
> > > > Mark
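
For anyone who finds this thread later, here is a small sketch of why hex C3 shows up and why ^A avoids the problem. It uses only the standard Java library, not Hadoop itself. The class name DelimiterBytes and the helpers utf8Hex and joinQuoted are made up for illustration, and the record layout (every field wrapped in the quote character, fields joined by commas) is an assumption about what the consuming application expects, not something confirmed in the thread.

```java
import java.nio.charset.StandardCharsets;

public class DelimiterBytes {

    // Encode a string as UTF-8 and render each byte as two hex digits.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical helper: wrap each field in the quote character
    // and join the quoted fields with commas.
    static String joinQuoted(char quote, String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote).append(fields[i]).append(quote);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String extended = String.valueOf((char) 254); // decimal 254, hex FE
        char ctrlA = (char) 1;                        // ^A, character 1

        // Code point 254 becomes two bytes in UTF-8, the first of which is C3.
        System.out.println(utf8Hex(extended));                      // C3 BE

        // Code points below 128 stay a single byte in UTF-8.
        System.out.println(utf8Hex(String.valueOf(ctrlA)));         // 01

        // A record quoted with ^A and separated by plain commas.
        System.out.println(utf8Hex(joinQuoted(ctrlA, "a", "b,c"))); // 01 61 01 2C 01 62 2C 63 01
    }
}
```

Since ^A and the comma are both below 128, each is written as one unchanged byte under UTF-8, so a reader that works byte by byte should see them exactly as written. Whether the downstream application accepts ^A as a quote character still has to be checked against that application.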
