[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430986#comment-15430986 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

GitHub user kevin00chen commented on a diff in the pull request:

    https://github.com/apache/sqoop/pull/26#discussion_r75698508
  
    --- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
    @@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
         }
         Object keyObj = null;
         if (keyColName.contains(",")) {
    +        String connectStr = new String(new byte[]{1});
             StringBuilder keyFieldsSb = new StringBuilder();
             for (String str : keyColName.split(",")) {
    -            keyFieldsSb.append("+").append(fieldMap.get(str).toString());
    +            keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
    --- End diff --
    
    For example, suppose a table has two columns, a and b:
    
    Field a | Field b
    ------------ | -------------
    a+ | b
    a | +b
    
    When "+" is used to join the two fields, both records produce the same keyObj ("+a++b").
    To avoid this, I use a String containing a single byte (value 1) as the separator.
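
The collision described above can be reproduced outside Sqoop. The following is a minimal sketch, not the actual MergeMapperBase code: the class name `CompositeKeyDemo` and the helper `joinKey` are hypothetical, and the helper only mirrors the loop in the diff (separator prepended before each field value).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Sqoop.
class CompositeKeyDemo {
    // Mirrors the loop in the diff: the separator is prepended before every field value.
    static String joinKey(String sep, List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            sb.append(sep).append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two rows from the table above.
        List<String> row1 = Arrays.asList("a+", "b");
        List<String> row2 = Arrays.asList("a", "+b");

        String plus = "+";
        // One-character separator U+0001, comparable to new String(new byte[]{1})
        // under an ASCII-compatible default charset.
        String ctrlA = "\u0001";

        // With "+", both rows collapse to the same key.
        System.out.println(joinKey(plus, row1).equals(joinKey(plus, row2)));
        // With U+0001, the keys differ for these two rows.
        System.out.println(joinKey(ctrlA, row1).equals(joinKey(ctrlA, row2)));
    }
}
```

Note that a control-character separator only makes collisions unlikely: if a key column can itself contain U+0001, the same ambiguity can reappear.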


> Sqoop Merge Tool support composite merge-key
> --------------------------------------------
>
>                 Key: SQOOP-3002
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3002
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: hive-integration
>    Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>            Reporter: KaimingChen
>
> When I use the Sqoop merge tool, I can specify only one column with the
> --merge-key argument.
> But when my table has a composite key and I pass --merge-key column1,column2,
> I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : attempt_1470135750174_2508_m_000004_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?
>       at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>       at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>       at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
