[ 
https://issues.apache.org/jira/browse/SQOOP-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Xu updated SQOOP-1988:
---------------------------
    Description: 
The {{Matcher.getMatchingData}} method is expected to rearrange record fields 
according to the FROM and TO schema. Currently there is an extra step in the 
implementation, which resets any {{null}}, {{"NULL"}}, {{"null"}}, 
{{"'null'"}} or {{""}} field to null. 

As there is no comment or documentation about this, I'm thinking it is some 
undocumented special handling. (
https://issues.apache.org/jira/browse/SQOOP-1811?focusedCommentId=14270755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14270755
 Read this)

There are two questions here:
1. These values are CSV-specific. As we do not know which intermediate 
data format is in use, why do we do the check and conversion here? 
2. {{"'null'"}} (single quotes inside double quotes) is a valid string 
that represents single quote, null, single quote. It will be converted to 
null. Is this expected?

I'd propose removing it, which should also bring a performance advantage. 
Thanks [~jerrychenhf]
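
To make the concern in question 2 concrete, here is a minimal sketch of the kind of check described above. It is not the actual Sqoop code: the class {{NullHandlingSketch}} and the methods {{isNullLike}} and {{resetNullLike}} are hypothetical names, and exact (case-sensitive) matching against the listed values is an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the Sqoop code base.
public class NullHandlingSketch {

    // Assumed check: treats the values listed in the description as null.
    public static boolean isNullLike(Object value) {
        if (value == null) {
            return true;
        }
        if (!(value instanceof String)) {
            return false;
        }
        String s = (String) value;
        return s.equals("NULL") || s.equals("null")
                || s.equals("'null'") || s.isEmpty();
    }

    // Returns a copy of the record with every null-like field set to null.
    public static Object[] resetNullLike(Object[] fields) {
        Object[] out = new Object[fields.length];
        for (int i = 0; i < fields.length; i++) {
            out[i] = isNullLike(fields[i]) ? null : fields[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // "'null'" and "" are legitimate string values, yet both are lost.
        Object[] record = {"alice", "'null'", "", 42};
        System.out.println(Arrays.toString(resetNullLike(record)));
        // prints: [alice, null, null, 42]
    }
}
```

Under these assumptions, a record field holding the literal string {{"'null'"}} or an empty string cannot be told apart from a real null after matching, and the check runs on every field of every record regardless of the intermediate data format.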

  was:
The {{Matcher.getMatchingData}} method is expected to rearrange record fields 
according to the FROM and TO schema. Currently there is an extra step in the 
implementation, which resets any {{null}}, {{"NULL"}}, {{"null"}}, 
{{"'null'"}} or {{""}} field to null. 

As there is no comment or documentation about this, I'm thinking it is some 
undocumented special handling. [
https://issues.apache.org/jira/browse/SQOOP-1811?focusedCommentId=14270755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14270755
 Read this]

There are two questions here:
1. These values are CSV-specific. As we do not know which intermediate 
data format is in use, why do we do the check and conversion here? 
2. {{"'null'"}} (single quotes inside double quotes) is a valid string 
that represents single quote, null, single quote. It will be converted to 
null. Is this expected?

I'd propose removing it, which should also bring a performance advantage. 
Thanks [~jerrychenhf]


> Sqoop2: isNull handling should be moved to CSVIntermediateDataFormat
> --------------------------------------------------------------------
>
>                 Key: SQOOP-1988
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1988
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>             Fix For: 2.0.0
>
>
> The {{Matcher.getMatchingData}} method is expected to rearrange record fields 
> according to the FROM and TO schema. Currently there is an extra step in the 
> implementation, which resets any {{null}}, {{"NULL"}}, {{"null"}}, 
> {{"'null'"}} or {{""}} field to null. 
> As there is no comment or documentation about this, I'm thinking it is some 
> undocumented special handling. (
> https://issues.apache.org/jira/browse/SQOOP-1811?focusedCommentId=14270755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14270755
>  Read this)
> There are two questions here:
> 1. These values are CSV-specific. As we do not know which intermediate 
> data format is in use, why do we do the check and conversion here? 
> 2. {{"'null'"}} (single quotes inside double quotes) is a valid string 
> that represents single quote, null, single quote. It will be converted to 
> null. Is this expected?
> I'd propose removing it, which should also bring a performance advantage. 
> Thanks [~jerrychenhf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
