Re: questions about the webui packages

2015-02-24 Thread Mattmann, Chris A (3980)
Yep, Seb, that’s right. I have a student (Sujeh Shah) at USC working on Nutch REST 1.x API, with the goal of eventually making D3 visualizations of crawl graphs and seeing what’s going on in a crawl while it’s happening! :) We are working on Wiki pages and have some patches coming on that that

[Nutch Wiki] Update of ContributorsGroup by ChrisMattmann

2015-02-24 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The ContributorsGroup page has been changed by ChrisMattmann: https://wiki.apache.org/nutch/ContributorsGroup?action=diff&rev1=20&rev2=21 Comment: - wiki updates += JayavanthShenoy *

[jira] [Commented] (NUTCH-1949) Dump out the Nutch data into the Common Crawl format

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335058#comment-14335058 ] Lewis John McGibbney commented on NUTCH-1949: - Thanks for logging this

[jira] [Updated] (NUTCH-1086) Rewrite protocol-httpclient

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1086: Assignee: Fabio Santagostino Rewrite protocol-httpclient

[jira] [Updated] (NUTCH-1949) Dump out the Nutch data into the Common Crawl format

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1949: Assignee: Giuseppe Totaro Dump out the Nutch data into the Common Crawl format

Re: GSoC 2015

2015-02-24 Thread Alfonso Nishikawa
Hi! How about the forgotten gora-sql? Alfonso Nishikawa 2015-02-07 5:20 GMT-01:00 Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov: I am very much for figuring out how to do a Nutch + Spark - +1! ++ Chris Mattmann,

[jira] [Created] (NUTCH-1950) File name too long when bin/nutch dump

2015-02-24 Thread Chong Li (JIRA)
Chong Li created NUTCH-1950: --- Summary: File name too long when bin/nutch dump Key: NUTCH-1950 URL: https://issues.apache.org/jira/browse/NUTCH-1950 Project: Nutch Issue Type: Improvement

Re: questions about the webui packages

2015-02-24 Thread Sebastian Nagel
Hi, yes, there is a Nutch server providing a REST API and a web app client to run Nutch (a result of our participation in GSoC 2014, by Fjodor Vershinin). There are some limitations: - only 2.x for now (please follow NUTCH-1040 for a 1.x port) - not complete (e.g., cannot configure a crawl) For

[jira] [Updated] (NUTCH-1946) Upgrade to Gora 0.6

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1946: Attachment: NUTCH-1946.patch Patch for 2.x, not currently 100%. Issues are described

[jira] [Commented] (NUTCH-1933) nutch-selenium plugin

2015-02-24 Thread Mohammad Al-Mohsin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335641#comment-14335641 ] Mohammad Al-Mohsin commented on NUTCH-1933: --- Thanks for your comments

[jira] [Commented] (NUTCH-1933) nutch-selenium plugin

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335538#comment-14335538 ] Lewis John McGibbney commented on NUTCH-1933: - Hi [~almohsin], GREAT :)

[jira] [Created] (NUTCH-1952) Add a timezone to the Nutch log4j.properties configuration

2015-02-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-1952: --- Summary: Add a timezone to the Nutch log4j.properties configuration Key: NUTCH-1952 URL: https://issues.apache.org/jira/browse/NUTCH-1952 Project: Nutch

[jira] [Commented] (NUTCH-1933) nutch-selenium plugin

2015-02-24 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335644#comment-14335644 ] Lewis John McGibbney commented on NUTCH-1933: - Yes, please read

[jira] [Updated] (NUTCH-1933) nutch-selenium plugin

2015-02-24 Thread Mohammad Al-Mohsin (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Al-Mohsin updated NUTCH-1933: -- Attachment: NUTCH-selenium-trunk.v2.1.patch Hi Lewis, Patch updated with comment files

[Nutch Wiki] Trivial Update of bin/nutch parse by LewisJohnMcgibbney

2015-02-24 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on Nutch Wiki for change notification. The bin/nutch parse page has been changed by LewisJohnMcgibbney: https://wiki.apache.org/nutch/bin/nutch%20parse?action=diff&rev1=2&rev2=3 Check Fetcher.java and FetcherOutput.java for