[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188652#comment-15188652
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--------------------------------------

Hi Jarek Jarcec Cecho,
Regarding SQOOP-2839: I don't think there will be a problem for tables that 
contain both the columns PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we 
add an extra underscore to columns that start with "_". The columns in the 
generated class will therefore be "__PROTOCOL_VERSION" and 
"___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?
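
To make the rule above concrete, here is a minimal sketch of the mapping as I 
read it from this comment. It is not the actual ClassWriter code: the class 
name IdentifierSketch, the method toFieldName, the RESERVED set, and the 
two-underscore prefix used for the reserved name are assumptions taken from 
the example names above.

```java
import java.util.Set;

// Hypothetical sketch only; the real logic lives in ClassWriter.toJavaIdentifier().
final class IdentifierSketch {
    // Assumption: column names that clash with a member of the generated class.
    private static final Set<String> RESERVED = Set.of("PROTOCOL_VERSION");

    static String toFieldName(String column) {
        if (RESERVED.contains(column)) {
            // Assumed prefix, chosen to match the names quoted in the comment.
            return "__" + column;
        }
        if (column.startsWith("_")) {
            // One extra underscore for columns that already start with "_".
            return "_" + column;
        }
        return column;
    }

    public static void main(String[] args) {
        System.out.println(toFieldName("PROTOCOL_VERSION"));
        System.out.println(toFieldName("__PROTOCOL_VERSION"));
    }
}
```

Under these assumptions the two columns map to different field names, so they 
do not collide in the generated class.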

> Special Character removal from Column name as avro data results in duplicate 
> column and fails the import
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-2561
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2561
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>         Environment: cdh5.3.2
>            Reporter: Suresh
>            Assignee: VISHNU S NAIR
>              Labels: AVRO, SQOOP
>             Fix For: 1.4.7
>
>         Attachments: 0001-SQOOP-2561.patch
>
>
> When a special character such as '$' or '#' is present in a column name, 
> Sqoop/Avro removes it. In some cases this leads to duplicate columns.
> E.g. if the schema contains COL$1 and COL1$, the special character is 
> removed from both, producing the duplicate column COL1, and the Sqoop 
> import job fails when importing as Avro data. The same table can be loaded 
> without the --as-avrodatafile flag.
> A similar issue was raised in 
> https://issues.apache.org/jira/browse/SQOOP-1361, which I suppose has been 
> fixed, and that fix is causing this new issue.
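
The collision described in the quoted report can be reproduced with a small 
sketch. This is not Sqoop's code: "removal" is modelled here as dropping every 
character other than letters, digits, and underscores, and the names 
StripCollisionSketch, strip, and duplicates are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of how stripping special characters creates duplicates.
final class StripCollisionSketch {
    // Assumption: any character outside [A-Za-z0-9_] is simply dropped.
    static String strip(String column) {
        return column.replaceAll("[^A-Za-z0-9_]", "");
    }

    // Returns the cleaned names that more than one source column maps to.
    static Set<String> duplicates(List<String> columns) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new TreeSet<>();
        for (String column : columns) {
            String cleaned = strip(column);
            if (!seen.add(cleaned)) {
                dups.add(cleaned);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // COL$1 and COL1$ both become COL1 once '$' is removed.
        System.out.println(duplicates(List.of("COL$1", "COL1$", "ID")));
    }
}
```

A check like duplicates() before schema generation would at least surface the 
clash early instead of failing the import job.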



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
