Yep, Seb, that’s right.
I have a student (Sujeh Shah) at USC working on
Nutch REST 1.x API, with the goal of eventually
making D3 visualizations of crawl graphs and
seeing what’s going on in a crawl while it’s
happening! :)
We are working on Wiki pages and have some patches
coming on that
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The ContributorsGroup page has been changed by ChrisMattmann:
https://wiki.apache.org/nutch/ContributorsGroup?action=diff&rev1=20&rev2=21
Comment:
- wiki updates += JayavanthShenoy
*
[
https://issues.apache.org/jira/browse/NUTCH-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335058#comment-14335058
]
Lewis John McGibbney commented on NUTCH-1949:
-
Thanks for logging this
[
https://issues.apache.org/jira/browse/NUTCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1086:
Assignee: Fabio Santagostino
Rewrite protocol-httpclient
[
https://issues.apache.org/jira/browse/NUTCH-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1949:
Assignee: Giuseppe Totaro
Dump out the Nutch data into the Common Crawl format
Hi!
How about the forgotten gora-sql?
Alfonso Nishikawa
2015-02-07 5:20 GMT-01:00 Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov:
I am very much for figuring out how to do a Nutch + Spark - +1!
++
Chris Mattmann,
Chong Li created NUTCH-1950:
---
Summary: File name too long when bin/nutch dump
Key: NUTCH-1950
URL: https://issues.apache.org/jira/browse/NUTCH-1950
Project: Nutch
Issue Type: Improvement
Hi,
yes, there is a Nutch server providing a REST API
and a web app client to run Nutch (as a result of our
participation in GSoC 2014 by Fjodor Vershinin).
There are some limitations:
- only 2.x for now (please, follow NUTCH-1040 for a 1.x port)
- not complete (e.g., cannot configure a crawl)
For
[
https://issues.apache.org/jira/browse/NUTCH-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1946:
Attachment: NUTCH-1946.patch
Patch for 2.x, not currently 100%. Issues are described
[
https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335641#comment-14335641
]
Mohammad Al-Mohsin commented on NUTCH-1933:
---
Thanks for your comments
[
https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335538#comment-14335538
]
Lewis John McGibbney commented on NUTCH-1933:
-
Hi [~almohsin], GREAT :)
Lewis John McGibbney created NUTCH-1952:
---
Summary: Add a timezone to the Nutch log4j.properties configuration
Key: NUTCH-1952
URL: https://issues.apache.org/jira/browse/NUTCH-1952
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335644#comment-14335644
]
Lewis John McGibbney commented on NUTCH-1933:
-
Yes, please read
[
https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mohammad Al-Mohsin updated NUTCH-1933:
--
Attachment: NUTCH-selenium-trunk.v2.1.patch
Hi Lewis,
Patch updated with comment files
Dear Wiki user,
You have subscribed to a wiki page or wiki category on Nutch Wiki for change
notification.
The bin/nutch parse page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/bin/nutch%20parse?action=diff&rev1=2&rev2=3
Check Fetcher.java and FetcherOutput.java for