[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-09-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521446#comment-15521446
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user liz-z17 commented on the issue:

https://github.com/apache/sqoop/pull/26
  
Hi @kevin00chen ,
The Sqoop project currently does not accept pull requests. To contribute, you 
will need to generate a patch and upload it to the JIRA issue (if there is no 
corresponding JIRA issue, you will need to create one first).
See the instructions here: 
https://cwiki.apache.org/confluence/display/SQOOP/How+to+Contribute
If you need more guidance, I'd be happy to help!


> Sqoop Merge Tool support composite merge-key
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
> Issue Type: Improvement
> Components: hive-integration
> Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
> Reporter: KaimingChen
>
> When I use the Sqoop merge tool, I can specify only one column with the 
> --merge-key argument. 
> But when my table has a composite key and I use --merge-key column1,column2, 
> I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : attempt_1470135750174_2508_m_04_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?
>   at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
>   at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
>   at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
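
One plausible reading of the stack trace in the quoted report, sketched under assumptions: if the mapper looks up the entire --merge-key value as a single column name, then "column1,column2" matches no entry in the record's field map, the lookup returns null, and the IOException is raised. The class MergeKeyLookupDemo and the method lookupKey are hypothetical stand-ins for illustration, not Sqoop's actual code; only the exception message is taken from the log.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MergeKeyLookupDemo {

  // Hypothetical stand-in for a single-column key lookup; not Sqoop's real code.
  static Object lookupKey(Map<String, Object> fieldMap, String keyColName)
      throws IOException {
    Object keyObj = fieldMap.get(keyColName);
    if (keyObj == null) {
      // Message copied from the log in the report.
      throw new IOException("Cannot join values on null key. "
          + "Did you specify a key column that exists?");
    }
    return keyObj;
  }

  public static void main(String[] args) throws IOException {
    Map<String, Object> fieldMap = new HashMap<>();
    fieldMap.put("column1", 1);
    fieldMap.put("column2", "x");

    // A single existing column resolves normally.
    System.out.println(lookupKey(fieldMap, "column1"));

    // The comma-separated string is not itself a column name, so this throws.
    try {
      lookupKey(fieldMap, "column1,column2");
    } catch (IOException e) {
      System.out.println("Failed: " + e.getMessage());
    }
  }
}
```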



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430986#comment-15430986
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user kevin00chen commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/26#discussion_r75698508
  
--- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
@@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
 }
 Object keyObj = null;
 if (keyColName.contains(",")) {
+String connectStr = new String(new byte[]{1});
 StringBuilder keyFieldsSb = new StringBuilder();
 for (String str : keyColName.split(",")) {
-keyFieldsSb.append("+").append(fieldMap.get(str).toString());
+keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
--- End diff --

For example, suppose a table has two columns, a and b:

Field a | Field b
------- | -------
a+ | b
a | +b

When "+" is used to join the two fields, both records get the same keyObj.
To avoid this, I use a String built from a single byte (value 1) as the separator.
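
The collision described in this comment can be reproduced with a small standalone sketch. CompositeKeyDemo, buildKey, and row are hypothetical names; buildKey mirrors the loop in the diff (separator prepended to each field), and "\u0001" is assumed to be equivalent to new String(new byte[]{1}) under an ASCII-compatible default charset.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompositeKeyDemo {

  // Mirrors the loop in the diff: prepend the separator to each key field.
  static String buildKey(Map<String, Object> fieldMap, String keyColName, String sep) {
    StringBuilder sb = new StringBuilder();
    for (String col : keyColName.split(",")) {
      sb.append(sep).append(fieldMap.get(col).toString());
    }
    return sb.toString();
  }

  // Builds a two-column record with columns named "a" and "b".
  static Map<String, Object> row(String a, String b) {
    Map<String, Object> m = new LinkedHashMap<>();
    m.put("a", a);
    m.put("b", b);
    return m;
  }

  public static void main(String[] args) {
    Map<String, Object> r1 = row("a+", "b");
    Map<String, Object> r2 = row("a", "+b");
    String ctrlA = "\u0001";

    // With "+", both records yield "+a++b", so the keys collide.
    System.out.println(buildKey(r1, "a,b", "+").equals(buildKey(r2, "a,b", "+")));     // true

    // With the control character, the keys differ.
    System.out.println(buildKey(r1, "a,b", ctrlA).equals(buildKey(r2, "a,b", ctrlA))); // false
  }
}
```

A control-character separator only makes collisions unlikely: two records could still map to the same key if a field value itself contained that character.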




[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430980#comment-15430980
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

Github user kevin00chen commented on a diff in the pull request:

https://github.com/apache/sqoop/pull/26#discussion_r75698166
  
--- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
@@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
 }
 Object keyObj = null;
 if (keyColName.contains(",")) {
+String connectStr = new String(new byte[]{1});
 StringBuilder keyFieldsSb = new StringBuilder();
 for (String str : keyColName.split(",")) {
-keyFieldsSb.append("+").append(fieldMap.get(str).toString());
+keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
--- End diff --

For example, suppose a table has two columns, a and b:

Field a | Field b
------- | -------
a+ | b
a | +b

When "+" is used to join the two fields, both records get the same keyObj.
To avoid this, I use a String built from a single byte (value 1) as the separator.
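
As an aside, the same ambiguity can be avoided without reserving any separator character by length-prefixing each field. This is a swapped-in alternative for illustration, not what the pull request does; LengthPrefixedKeyDemo and encode are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class LengthPrefixedKeyDemo {

  // Encode each field as "<length>:<value>" so field boundaries are unambiguous.
  static String encode(List<String> fields) {
    StringBuilder sb = new StringBuilder();
    for (String f : fields) {
      sb.append(f.length()).append(':').append(f);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String k1 = encode(Arrays.asList("a+", "b")); // "2:a+1:b"
    String k2 = encode(Arrays.asList("a", "+b")); // "1:a2:+b"
    System.out.println(k1);
    System.out.println(k2);
    System.out.println(k1.equals(k2));            // false
  }
}
```

The trade-off is that the resulting key is less readable than a separator-joined string, but it stays distinct regardless of which characters the field values contain.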




[jira] [Commented] (SQOOP-3002) Sqoop Merge Tool support composite merge-key

2016-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430912#comment-15430912
 ] 

ASF GitHub Bot commented on SQOOP-3002:
---

GitHub user kevin00chen opened a pull request:

https://github.com/apache/sqoop/pull/26

[SQOOP-3002] sqoop merge tool composite merge-key

JIRA Issue: https://issues.apache.org/jira/browse/SQOOP-3002
The Sqoop merge tool accepts only one column for the --merge-key argument.
To specify two or more columns, I had to modify the source code.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevin00chen/sqoop my_change

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/sqoop/pull/26.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #26


commit cd1e840c8dfb6261aa3be81b9c4881e80bc038bd
Author: KaimingChen 
Date:   2016-08-22T14:34:45Z

sqoop merge tool composite merge-key



