Seeing as I'm now depending on this behavior, I nominate that that bug be upgraded to a feature :-)
-Mark

On Fri, Oct 21, 2011 at 1:38 PM, arv...@cloudera.com <arv...@cloudera.com> wrote:
> Glad it worked Mark!
>> And it looks like you don't have to do a hive import to use it.
> That sounds like a bug to me :)
> Arvind
>
> On Fri, Oct 21, 2011 at 9:41 AM, Mark Roddy <markro...@gmail.com> wrote:
>>
>> Thanks for the help Arvind. The hive-drop-import-delims worked. And
>> it looks like you don't have to do a hive import to use it.
>>
>> -Mark
>>
>>
>> On Fri, Oct 21, 2011 at 11:43 AM, Arvind Prabhakar <arv...@apache.org>
>> wrote:
>> > One workaround worth trying is to use the "--hive-drop-import-delims"
>> > option and do a hive import. With this option set, Sqoop will remove
>> > any new lines or ^A characters which are the default delimiters used
>> > for Hive. After the import is done, you could copy the file out of
>> > Hive directly and use it in your application.
>> >
>> > Arvind
>> >
>> > On Fri, Oct 21, 2011 at 7:05 AM, Mark Roddy <markro...@gmail.com> wrote:
>> >> I used "--escaped-by \\" due to bash, so that "\" would be the escape
>> >> character used. That works fine, I end up with \n and \t characters
>> >> escaped by '\'.
>> >>
>> >>
>> >> To put the problem more concretely, I have a single record from the db
>> >> with a field containing the following value:
>> >> "foo
>> >> bar baz
>> >> biz"
>> >>
>> >> Sqoop will spit out:
>> >> "foo\
>> >> bar baz\
>> >> biz"
>> >>
>> >>
>> >> Now if I run a map reduce job on this with the TextInputFormat, the
>> >> record will be terminated after "foo" not after "biz". I did a little
>> >> digging and TextInputFormat uses LineRecordReader, which uses
>> >> LineReader which, looking at the source, clearly does not honor the
>> >> escape char. Is there a tool/input format/etc that will read from
>> >> HDFS and honor this? It does not seem that M/R can do it out of the
>> >> box. I can't find a way to get Pig to do it either. I assume there
>> >> must be something that will honor the escape, but cannot find anything.
>> >>
>> >>
>> >>
>> >> On Fri, Oct 21, 2011 at 5:26 AM, Alexander C.H. Lorenz
>> >> <wget.n...@googlemail.com> wrote:
>> >>> Hi Mark,
>> >>> --escaped-by \/ (backslash - slash) tells bash to escape the next
>> >>> character.
>> >>> (if I understood you right)
>> >>> - Alex
>> >>> On Fri, Oct 21, 2011 at 12:12 AM, Mark Roddy <markro...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> I'm moving free form data out of an RDBMS that has a lot of \n, \r\n,
>> >>>> and \t characters.
>> >>>>
>> >>>> I used "--escaped-by \\" (extra \ cause of bash), but I'm a little
>> >>>> confused about what to do with this data now. I can't seem to find
>> >>>> any tools that will honor the '\' escape char. TextInputFormat does
>> >>>> not seem to.
>> >>>>
>> >>>> I'm working on replacing an existing in-house tool w/sqoop that
>> >>>> replaces newlines with the literal string '\n'. I'd be happy to do as
>> >>>> such but I don't see any way of doing so.
>> >>>>
>> >>>> I'm sure I'm not the first person to run into this so I appreciate
>> >>>> any suggestions.
>> >>>>
>> >>>> -Mark
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Alexander Lorenz
>> >>> http://mapredit.blogspot.com
>> >>>
>> >>>
>> >>
>> >
>
>
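
For anyone who finds this thread and needs to keep the embedded newlines instead of dropping them: below is a minimal sketch, in Python, of a reader that honors the '\' escape the way the "foo\ / bar baz\ / biz" example in the thread implies (a backslash followed by a real newline or tab means "this character is data, not a delimiter"). The function name read_escaped_records is made up for illustration and is not part of Sqoop, Hadoop, or Pig. It assumes records are terminated by an unescaped "\n" and reads from an ordinary text stream, so it shows the parsing rule only; it is not a drop-in replacement for TextInputFormat.

```python
import io
from typing import Iterator, TextIO


def read_escaped_records(stream: TextIO, escape: str = "\\") -> Iterator[str]:
    """Yield logical records from `stream`.

    Hypothetical helper (not a Sqoop/Hadoop API). Assumes:
      * an unescaped "\n" ends a record;
      * `escape` followed by any character means "keep that character
        as data", so an escaped newline stays inside the record.
    """
    record = []        # characters collected for the current record
    escaped = False    # True if the previous character was the escape char

    while True:
        ch = stream.read(1)   # character-at-a-time: simple, not fast
        if ch == "":          # end of stream
            break
        if escaped:
            record.append(ch)  # keep the escaped character, drop the escape
            escaped = False
        elif ch == escape:
            escaped = True
        elif ch == "\n":
            yield "".join(record)  # unescaped newline: record boundary
            record = []
        else:
            record.append(ch)

    if escaped:
        record.append(escape)  # dangling escape at end of input: keep it
    if record:
        yield "".join(record)  # final record without a trailing newline


if __name__ == "__main__":
    # The field from the thread, as Sqoop wrote it with --escaped-by '\'.
    sample = io.StringIO('foo\\\nbar baz\\\nbiz\n')
    for rec in read_escaped_records(sample):
        print(repr(rec))
```

A plain line reader would split the sample into three records ending at "foo\", "bar baz\" and "biz"; the sketch returns one record with the two newlines restored. Doing the same inside a MapReduce job would mean putting this rule into a custom record reader, which is more work than the "--hive-drop-import-delims" route Arvind suggested if losing the newlines is acceptable.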