[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key
[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521446#comment-15521446 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

Github user liz-z17 commented on the issue:

    https://github.com/apache/sqoop/pull/26

    Hi @kevin00chen,

    The Sqoop project currently does not accept pull requests. To contribute, you will need to generate a patch and upload it to the JIRA issue (if there is no corresponding JIRA issue, you will need to create one first). See the instructions here: https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute

    If you need more directions, I'd be happy to help!

> Sqoop Merge Tool support composite merge-key
> --------------------------------------------
>
>                 Key: SQOOP-3002
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3002
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: hive-integration
>    Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
>            Reporter: KaimingChen
>
> When I use the Sqoop merge tool, I can specify only one column with the --merge-key argument.
> But when my table has a composite key and I use --merge-key column1,column2, I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?
>       at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>       at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>       at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
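One plausible reading of the exception in the issue description is that the mapper looks up the whole --merge-key value as a single column name, so "column1,column2" matches no field and the lookup returns null. The sketch below illustrates that reading only. NullKeyDemo, lookupKey and the sample field map are hypothetical names for illustration, not Sqoop's actual code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    // Hypothetical stand-in for the key lookup: keyColName is treated as
    // ONE column name, so a comma-separated list never matches a field.
    static String lookupKey(Map<String, Object> fieldMap, String keyColName)
            throws IOException {
        Object keyObj = fieldMap.get(keyColName);
        if (keyObj == null) {
            throw new IOException("Cannot join values on null key. "
                + "Did you specify a key column that exists?");
        }
        return keyObj.toString();
    }

    public static void main(String[] args) {
        // Assumed sample record with two columns.
        Map<String, Object> fieldMap = new HashMap<>();
        fieldMap.put("column1", 1);
        fieldMap.put("column2", "x");

        try {
            // A single existing column resolves normally.
            System.out.println(lookupKey(fieldMap, "column1"));
            // The composite value is not a column name, so this throws.
            System.out.println(lookupKey(fieldMap, "column1,column2"));
        } catch (IOException e) {
            System.out.println("Error: " + e.getMessage());
        }
    }
}
```

Under this assumption, supporting a composite key means splitting the --merge-key value on commas and looking up each column separately.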
[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key
[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430986#comment-15430986 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

Github user kevin00chen commented on a diff in the pull request:

    https://github.com/apache/sqoop/pull/26#discussion_r75698508

    --- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
    @@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
         }
         Object keyObj = null;
         if (keyColName.contains(",")) {
    +      String connectStr = new String(new byte[]{1});
           StringBuilder keyFieldsSb = new StringBuilder();
           for (String str : keyColName.split(",")) {
    -        keyFieldsSb.append("+").append(fieldMap.get(str).toString());
    +        keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
    --- End diff --

    For example, suppose a table has two columns, a and b:

    Field a | Field b
    --------|--------
    a+      | b
    a       | +b

    When "+" is used to join the two fields, both records get the same keyObj ("+a++b"). To avoid this, I use a String containing a single byte (value 1) as the separator.
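The collision described in the comment above can be reproduced in isolation. The following is a minimal sketch, not the patch itself. CompositeKeyDemo and joinKey are hypothetical names, and the literal "\u0001" is assumed to be equivalent to new String(new byte[]{1}) under an ASCII-compatible default charset.

```java
import java.util.Arrays;
import java.util.List;

class CompositeKeyDemo {
    // Prepends the separator to every value, mirroring the
    // append(separator).append(value) pattern in the diff.
    static String joinKey(String separator, List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String value : values) {
            sb.append(separator).append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two rows from the example table.
        List<String> row1 = Arrays.asList("a+", "b");
        List<String> row2 = Arrays.asList("a", "+b");

        // With "+" as the separator both rows produce "+a++b".
        System.out.println(joinKey("+", row1).equals(joinKey("+", row2)));

        // With the 0x01 control character the two keys differ.
        String sep = "\u0001";
        System.out.println(joinKey(sep, row1).equals(joinKey(sep, row2)));
    }
}
```

A control character is much less likely than "+" to appear in column values, but the same collision would still occur if a value happened to contain that character.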
[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key
[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430980#comment-15430980 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

Github user kevin00chen commented on a diff in the pull request:

    https://github.com/apache/sqoop/pull/26#discussion_r75698166

    --- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
    @@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
         }
         Object keyObj = null;
         if (keyColName.contains(",")) {
    +      String connectStr = new String(new byte[]{1});
           StringBuilder keyFieldsSb = new StringBuilder();
           for (String str : keyColName.split(",")) {
    -        keyFieldsSb.append("+").append(fieldMap.get(str).toString());
    +        keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
    --- End diff --

    For example, suppose a table has two columns, a and b:

    Field a | Field b
    --------|--------
    a+      | b
    a       | +b

    When "+" is used to join the two fields, both records get the same keyObj ("+a++b"). To avoid this, I use a String containing a single byte (value 1) as the separator.
[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key
[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430912#comment-15430912 ]

ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------

GitHub user kevin00chen opened a pull request:

    https://github.com/apache/sqoop/pull/26

    [SQOOP-3002] sqoop merge tool composite merge-key

    JIRA Issue: https://issues.apache.org/jira/browse/SQOOP-3002

    The Sqoop merge tool accepts only one column in the --merge-key argument. To specify two or more columns, I needed to modify the source code.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kevin00chen/sqoop my_change

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/sqoop/pull/26.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #26

----
commit cd1e840c8dfb6261aa3be81b9c4881e80bc038bd
Author: KaimingChen
Date:   2016-08-22T14:34:45Z

    sqoop merge tool composite merge-key

----