[ 
https://issues.apache.org/jira/browse/NUTCH-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079394#comment-16079394
 ] 

ASF GitHub Bot commented on NUTCH-2375:
---------------------------------------

lewismc commented on a change in pull request #188: NUTCH-2375 Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce
URL: https://github.com/apache/nutch/pull/188#discussion_r126293242
 
 

 ##########
 File path: src/java/org/apache/nutch/segment/SegmentMerger.java
 ##########
 @@ -384,40 +385,43 @@ public void setConf(Configuration conf) {
   public void close() throws IOException {
   }
 
-  public void configure(JobConf conf) {
+  public void configure(Job job) {
+    Configuration conf = job.getConfiguration();
     setConf(conf);
     if (sliceSize > 0) {
-      sliceSize = sliceSize / conf.getNumReduceTasks();
+      sliceSize = sliceSize / Integer.parseInt(conf.get("mapreduce.map.tasks"));
     }
   }
 
-  private Text newKey = new Text();
 
-  public void map(Text key, MetaWrapper value,
-      OutputCollector<Text, MetaWrapper> output, Reporter reporter)
-      throws IOException {
-    String url = key.toString();
-    if (normalizers != null) {
-      try {
-        url = normalizers.normalize(url, URLNormalizers.SCOPE_DEFAULT); // normalize
-                                                                        // the
-                                                                        // url
-      } catch (Exception e) {
-        LOG.warn("Skipping " + url + ":" + e.getMessage());
-        url = null;
+  public static class SegmentMergerMapper extends
+      Mapper<Text, MetaWrapper, Text, MetaWrapper> {
+    public void map(Text key, MetaWrapper value,
+        Context context) throws IOException, InterruptedException {
+      Text newKey = new Text();
+      String url = key.toString();
+      if (normalizers != null) {
+        try {
+          url = normalizers.normalize(url, URLNormalizers.SCOPE_DEFAULT); // normalize
 
 Review comment:
   Sort this comment out; make sure the whole comment is on a single line.
 


> Upgrade the code base from org.apache.hadoop.mapred to org.apache.hadoop.mapreduce
> ----------------------------------------------------------------------------------
>
>                 Key: NUTCH-2375
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2375
>             Project: Nutch
>          Issue Type: Improvement
>          Components: deployment
>            Reporter: Omkar Reddy
>
> Nutch still uses the deprecated org.apache.hadoop.mapred dependency. It needs 
> to be updated to the org.apache.hadoop.mapreduce dependency.



