[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
----------------------------------------
Description:
Recent developments within the tracing community have brought projects like
Apache HTrace (Incubating) into the Apache Incubator, opening up the
possibility of using tracing to better understand distributed applications,
systems and systems-of-systems. As many will know, tracing is a specialized
use of logging that records information about a program’s execution.
Although tracing is widely used within distributed systems such as Hadoop
and databases, few tracing experiments have been carried out in the field of
large-scale, distributed Web search.
This issue will combine the comprehensive tracing mechanisms of Apache HTrace
(Incubating) with the scalable, flexible crawling architecture of Apache
Nutch 2.X. Key takeaways from this presentation are development and
implementation, tracing guidance for your web search stack, and future work
in this area.
was:
I've recently been mentoring the [Apache
HTrace|http://htrace.incubator.apache.org/] effort, a tracing framework for use
with distributed systems written in Java.
I think that having fine-grained tracing available within Nutch would be a
major strength and another string to our bow.
> Implement HTrace'ing in Nutch
> -----------------------------
>
> Key: NUTCH-2005
> URL: https://issues.apache.org/jira/browse/NUTCH-2005
> Project: Nutch
> Issue Type: New Feature
> Components: build
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Labels: gsoc2016
> Fix For: 2.4