[jira] [Commented] (SQOOP-2312) Problem when exporting files that has \n as part as the content columns

Younos Aboulnaga (JIRA) Thu, 21 May 2015 20:44:53 -0700

    [ 
https://issues.apache.org/jira/browse/SQOOP-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555526#comment-14555526
 ]


Younos Aboulnaga commented on SQOOP-2312:
-----------------------------------------

This problem also occurs with all other vertical whitespace characters, such as 
form feed and vertical tab.

I am not sure whether this is addressed in Sqoop2, especially since the 
CSVIntermediateFormat wiki page 
(https://cwiki.apache.org/confluence/display/SQOOP/Intermediate+Data+Format+API)
 mentions only \n and \r as vertical whitespace characters.
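
To make the set of characters concrete, here is a minimal sketch of escaping 
every vertical whitespace character in a field before it is written to a 
delimited text record. This is not Sqoop code: the class 
VerticalWhitespaceEscaper and its escape method are hypothetical names, and 
the sketch assumes Java 8 or later, where the regex class \v covers line feed, 
vertical tab, form feed, carriage return, NEL (U+0085), U+2028 and U+2029.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Sqoop.
public class VerticalWhitespaceEscaper {

    // In java.util.regex (Java 8+), the class "\v" matches line feed,
    // vertical tab, form feed, carriage return, NEL (U+0085),
    // LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029).
    private static final Pattern VERTICAL = Pattern.compile("\\v");

    // Replaces each vertical whitespace character with a backslash,
    // the letter u and its four-digit hex code, so that a line-based
    // reader no longer sees a record boundary inside the field.
    public static String escape(String field) {
        Matcher m = VERTICAL.matcher(field);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String esc = String.format("\\u%04X", (int) m.group().charAt(0));
            // quoteReplacement keeps the backslash literal in the output.
            m.appendReplacement(out, Matcher.quoteReplacement(esc));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample value loosely modelled on the comment column in the report.
        String comment = "Pecém\n   \f(São Gonçalo do Amarante)";
        System.out.println(escape(comment));
    }
}
```

The sketch does not escape existing backslashes, so it is not reversible for 
fields that already contain such escape sequences; a real fix would need to 
agree on an escaping scheme with whatever reads the records back.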

> Problem when exporting files that has \n as part as the content columns
> -----------------------------------------------------------------------
>
>                 Key: SQOOP-2312
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2312
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/generic
>         Environment: Sqoop 1.4.6-rc1
>            Reporter: Henrique Andrade
>            Priority: Critical
>
> I have exported from my SQL Server some data related to our customers.
> One of the columns has some comments from customers and this is the data that 
> is there:
> "Pecém\n" +
> "                                \n" +
> "                                                             (São Gonçalo do Amarante)
> The problem is that Sqoop breaks the record at this point, and the rest 
> of the process fails.
> I tried different options, such as --lines-terminated-by with a 
> different character (ˆ), but it looks like the Hadoop library does not accept 
> that and treats all 29,000 records as a single record.
>     "--fields-terminated-by", "|",
>     "--lines-terminated-by", "ˆ",
>     "--enclosed-by", "'",
>     "--escaped-by", "\\"};
> I have read in some threads that the only --lines-terminated-by character 
> accepted was \n. Has this changed in version 1.4.6?
> Is there a way to keep the content of the columns from breaking the import?
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
