[
https://issues.apache.org/jira/browse/SQOOP-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273601#comment-14273601
]
Jarek Jarcec Cecho commented on SQOOP-1988:
-------------------------------------------
Thank you for bringing this one up [~stanleyxu2005]. I was asking myself the
same question when I was reviewing some of the recent patches, but I didn't
have time to dig into it a bit more.
I believe that the schema matcher should do only schema matching and should
not do any data conversion. Hence I also believe that this code should be
removed from the schema matcher. I think that the original purpose was to allow
arbitrary CSV support. I think that we had a JIRA to cover custom {{NULL}}
representations, but I'm having difficulty finding it. As we are insisting
on using a constant {{NULL}} in our CSV IDF, I would assume that we should
simply drop the code in question. It should be the job of each IDF to convert
any incoming values to proper {{NULL}} objects if it wants to support multiple
{{NULL}} representations. What do you think?
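
To make the proposed split concrete, here is a rough Java sketch. It is not
taken from the Sqoop code base: the class {{CsvNullSketch}}, its methods, and
the by-name matching rule are all hypothetical, and it assumes the CSV IDF
recognizes exactly one token, {{NULL}}. The IDF side maps that token to a real
null; the matcher side only rearranges fields and never inspects values.

```java
// Hypothetical sketch; none of these names come from the Sqoop code base.
final class CsvNullSketch {

    // Assumption: the CSV IDF uses exactly one constant token for NULL.
    static final String CSV_NULL = "NULL";

    // IDF side: convert raw CSV text fields to objects.
    // Only the constant token becomes null; "", "null", etc. stay strings.
    static Object[] parseCsvFields(String[] rawFields) {
        Object[] out = new Object[rawFields.length];
        for (int i = 0; i < rawFields.length; i++) {
            out[i] = CSV_NULL.equals(rawFields[i]) ? null : rawFields[i];
        }
        return out;
    }

    // Matcher side: rearrange FROM fields into TO column order by name.
    // Values are copied as-is; no null detection or data conversion here.
    // Assumption: a TO column with no FROM counterpart is left as null.
    static Object[] matchByName(String[] fromColumns, String[] toColumns,
                                Object[] fromRecord) {
        Object[] out = new Object[toColumns.length];
        for (int t = 0; t < toColumns.length; t++) {
            for (int f = 0; f < fromColumns.length; f++) {
                if (toColumns[t].equals(fromColumns[f])) {
                    out[t] = fromRecord[f];
                    break;
                }
            }
        }
        return out;
    }
}
```

With this split, an empty string or a lowercase {{null}} in the input stays a
plain string, because only the IDF's own constant is treated as NULL.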
> Sqoop2: isNull handling should be moved to CSVIntermediateDataFormat
> --------------------------------------------------------------------
>
> Key: SQOOP-1988
> URL: https://issues.apache.org/jira/browse/SQOOP-1988
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Qian Xu
> Assignee: Qian Xu
> Fix For: 2.0.0
>
>
> The {{Matcher.getMatchingData}} method is expected to rearrange record fields
> according to the FROM and TO schemas. Currently there is an extra step in the
> implementation, which resets any {{null}}, {{"NULL"}}, {{"null"}},
> {{"'null'"}}, or {{""}} field to null.
> As there is no comment or documentation about this, I guess it is some
> undocumented special handling. [Here is some
> discussion|https://issues.apache.org/jira/browse/SQOOP-1811?focusedCommentId=14270755&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14270755].
> I think this check does not belong here, and I propose to remove it. As the
> method is called very frequently, removing the code should also improve
> performance. Thanks [~jerrychenhf]