[
https://issues.apache.org/jira/browse/NUTCH-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450043#comment-17450043
]
Lewis John McGibbney commented on NUTCH-2838:
---------------------------------------------
Hi [~abstractdog], thanks for commenting. For me at least, this is definitely on
the Nutch roadmap. I did some initial experimentation, which I documented at
https://cwiki.apache.org/confluence/display/NUTCH/Running+Nutch+on+Tez
At that time I ran into issues with the code implementation because I was
trying to have as little impact on the Nutch codebase as possible. That is to
say, I was trying to avoid an entire re-write of all existing (~18) Nutch
MapReduce jobs.
That being said, once I finish up my current work on documenting [Nutch
metrics|https://cwiki.apache.org/confluence/display/NUTCH/Metrics], I will come
back to this issue.
One of the things I would _*like to do*_ is actually provide documentation for
the Tez community to see how we went about migrating from MR to Tez... so I
will continue to document things as I go along.
At the time of writing, I don't want to take on too much work. I will come back
to this task once my current work is finished.
As a wild request, I wonder if you would be interested in looking at the [Nutch
Injector|https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/crawl/Injector.java]
(basically NUTCH-2839), which is the first MR job in a Nutch crawl cycle. If
you were able to provide some wisdom as to how you would evolve that MR job
into a Tez job, it would be great to observe your engineering methodology. No
problem if this is not possible; I just thought I would ask as a stretch :)
> Apache Tez integration
> ----------------------
>
> Key: NUTCH-2838
> URL: https://issues.apache.org/jira/browse/NUTCH-2838
> Project: Nutch
> Issue Type: New Feature
> Components: deployment, runtime, tez
> Affects Versions: 1.18
> Reporter: Lewis John McGibbney
> Priority: Major
> Fix For: 1.19
>
>
> This is a parent epic under which all Tez integration tasks can be nested.