Thanks for the help, Arvind.  The --hive-drop-import-delims option worked.  And
it looks like you don't have to do a Hive import to use it.
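
For anyone reading the archive later, here is a rough Python sketch of what the option does to a single field value, based on Arvind's description below (newlines and ^A are removed; the Sqoop docs also list \r). The helper name drop_hive_delims is made up for illustration and is not part of Sqoop.

```python
# Illustrative only: mimics the described effect of Sqoop's
# --hive-drop-import-delims option on one string field.
# drop_hive_delims is a hypothetical helper, not a Sqoop API.
HIVE_DELIMS = {ord("\n"): None, ord("\r"): None, ord("\x01"): None}


def drop_hive_delims(field: str) -> str:
    """Remove newline, carriage return, and ^A (0x01) characters."""
    return field.translate(HIVE_DELIMS)


if __name__ == "__main__":
    value = "foo\nbar baz\nbiz"
    # The embedded newlines are dropped, so the field stays on one line.
    print(repr(drop_hive_delims(value)))
```

Note that the characters are dropped, not replaced, so "foo" and "bar" end up adjacent in the output.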

-Mark


On Fri, Oct 21, 2011 at 11:43 AM, Arvind Prabhakar <arv...@apache.org> wrote:
> One workaround worth trying is to use the "--hive-drop-import-delims"
> option and do a Hive import. With this option set, Sqoop will remove
> any newlines or ^A characters, which are the default delimiters used
> by Hive. After the import is done, you could copy the file out of
> Hive directly and use it in your application.
>
> Arvind
>
> On Fri, Oct 21, 2011 at 7:05 AM, Mark Roddy <markro...@gmail.com> wrote:
>> I used "--escaped-by \\" due to bash, so that "\" would be the escape
>> character used.  That works fine, I end up with \n and \t characters
>> escaped by '\'.
>>
>>
>> To put the problem more concretely, I have a single record from the db
>> with a field containing the following value:
>> "foo
>> bar baz
>> biz"
>>
>> Sqoop will spit out:
>> "foo\
>> bar baz\
>> biz"
>>
>>
>> Now if I run a MapReduce job on this with TextInputFormat, the
>> record will be terminated after "foo", not after "biz".  I did a little
>> digging: TextInputFormat uses LineRecordReader, which uses
>> LineReader, which, looking at the source, clearly does not honor the
>> escape char.  Is there a tool/input format/etc. that will read from
>> HDFS and honor this?  It does not seem that M/R can do it out of the
>> box.  I can't find a way to get Pig to do it either.  I assume there must
>> be something that will honor the escape, but I cannot find anything.
>>
>>
>>
>> On Fri, Oct 21, 2011 at 5:26 AM, Alexander C.H. Lorenz
>> <wget.n...@googlemail.com> wrote:
>>> Hi Mark,
>>> --escaped-by \/ (backslash - slash) tells bash to escape the next character.
>>> (if I understood you right)
>>> - Alex
>>> On Fri, Oct 21, 2011 at 12:12 AM, Mark Roddy <markro...@gmail.com> wrote:
>>>>
>>>> I'm moving free-form data out of an RDBMS that has a lot of \n, \r\n,
>>>> and \t characters.
>>>>
>>>> I used "--escaped-by \\" (extra \ because of bash), but I'm a little
>>>> confused about what to do with this data now.  I can't seem to find
>>>> any tools that will honor the '\' escape char.  TextInputFormat does
>>>> not seem to.
>>>>
>>>> I'm working on replacing an existing in-house tool with Sqoop.  That tool
>>>> replaces newlines with the literal string '\n'.  I'd be happy to do the
>>>> same, but I don't see any way of doing so.
>>>>
>>>> I'm sure I'm not the first person to run into this so I appreciate any
>>>> suggestions.
>>>>
>>>> -Mark
>>>
>>>
>>>
>>> --
>>> Alexander Lorenz
>>> http://mapredit.blogspot.com
>>>
>>>
>>
>
