[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117024#comment-15117024 ] Markus Jelsma commented on NUTCH-961: Yes! :) > Expose Tika's boilerpipe support

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117020#comment-15117020 ] Tien Nguyen Manh commented on NUTCH-961: Can NUTCH-1233: use tika to extract outlink solve that

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116975#comment-15116975 ] Markus Jelsma commented on NUTCH-961: With boilerpipe, you get only a very few outlinks, those found

[jira] [Updated] (NUTCH-1465) Support sitemaps in Nutch

2016-01-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1465: Fix Version/s: 1.13 > Support sitemaps in Nutch

[jira] [Commented] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118689#comment-15118689 ] Chris A. Mattmann commented on NUTCH-2206: +1 please commit > Provide example

[jira] [Updated] (NUTCH-1741) Support of Sitemaps in Nutch 2.x

2016-01-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1741: Attachment: NUTCH-1741v7.patch Managed to update this at the weekend and forgot to

[jira] [Created] (NUTCH-2208) Fix 4 skipped tests in TestGenerator

2016-01-26 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-2208: Summary: Fix 4 skipped tests in TestGenerator Key: NUTCH-2208 URL: https://issues.apache.org/jira/browse/NUTCH-2208 Project: Nutch Issue Type:

[jira] [Updated] (NUTCH-2208) Fix 4 skipped tests in TestGenerator

2016-01-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2208: Attachment: TEST-org.apache.nutch.crawl.TestGenerator.txt Attached is full test log

[jira] [Resolved] (NUTCH-1741) Support of Sitemaps in Nutch 2.x

2016-01-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1741. Resolution: Fixed. Committed revision 1726853 in 2.X. Thank you to everyone that

[jira] [Updated] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2206: Attachment: NUTCH-2206.patch Hey [~lewismc], here's the patch providing an example for the stopword

[jira] [Commented] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117800#comment-15117800 ] Lewis John McGibbney commented on NUTCH-2206: We should most likely also provide the

[jira] [Commented] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117810#comment-15117810 ] Sujen Shah commented on NUTCH-2206: Ohh yes, will do it now, missed it in the patch. > Provide

[jira] [Commented] (NUTCH-1741) Support of Sitemaps in Nutch 2.x

2016-01-26 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117839#comment-15117839 ] Hudson commented on NUTCH-1741: SUCCESS: Integrated in Nutch-nutchgora #1548 (See

[jira] [Updated] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujen Shah updated NUTCH-2206: Attachment: NUTCH-2206.patch Added example for the property in nutch-default.xml > Provide example

[jira] [Commented] (NUTCH-1712) Use MultipleInputs in Injector to make it a single mapreduce job

2016-01-26 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117963#comment-15117963 ] ASF GitHub Bot commented on NUTCH-1712: GitHub user sebastian-nagel opened a pull request:

[jira] [Commented] (NUTCH-2206) Provide example scoring.similarity.stopword.file

2016-01-26 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118286#comment-15118286 ] Lewis John McGibbney commented on NUTCH-2206: +1 [~sujenshah], thanks > Provide example

Re: need suggestion for GSoC 2016

2016-01-26 Thread Lewis John Mcgibbney
Hi Ammar, I've given you write permissions for the wiki. Feel free to create a page for your proposed work at the URL below https://wiki.apache.org/nutch/GoogleSummerOfCode#A2016 On Fri, Jan 22, 2016 at 4:49 PM, Lewis John Mcgibbney < lewis.mcgibb...@gmail.com> wrote: > Hi Ammar, > CC dev@ >

[Nutch Wiki] Trivial Update of "ContributorsGroup" by LewisJohnMcgibbney

2016-01-26 Thread Apache Wiki
Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification. The "ContributorsGroup" page has been changed by LewisJohnMcgibbney: https://wiki.apache.org/nutch/ContributorsGroup?action=diff=36=37 * PeterCiuffetti * ayeshahasan * Kshamak