[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-03-19 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203025#comment-15203025 ] ASF GitHub Bot commented on NUTCH-961: -- Github user asfgit closed the pull request at:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170557#comment-15170557 ] ASF GitHub Bot commented on NUTCH-961: -- Github user lewismc commented on a diff in the pull request:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170556#comment-15170556 ] ASF GitHub Bot commented on NUTCH-961: -- Github user lewismc commented on a diff in the pull request:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170555#comment-15170555 ] ASF GitHub Bot commented on NUTCH-961: -- Github user lewismc commented on a diff in the pull request:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170554#comment-15170554 ] ASF GitHub Bot commented on NUTCH-961: -- Github user lewismc commented on a diff in the pull request:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-26 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168821#comment-15168821 ] ASF GitHub Bot commented on NUTCH-961: -- GitHub user jeremie70 opened a pull request:

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148758#comment-15148758 ] Hudson commented on NUTCH-961: -- SUCCESS: Integrated in Nutch-trunk #3347 (See

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-16 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148642#comment-15148642 ] Markus Jelsma commented on NUTCH-961: - Tests pass as expected and Boilerpipe as well. Will commit

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117024#comment-15117024 ] Markus Jelsma commented on NUTCH-961: - Yes! :) > Expose Tika's boilerpipe support >

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117020#comment-15117020 ] Tien Nguyen Manh commented on NUTCH-961: Can NUTCH-1233: use tika to extract outlink solve that

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116975#comment-15116975 ] Markus Jelsma commented on NUTCH-961: - With boilerpipe, you get only very few outlinks, those found

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-25 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116772#comment-15116772 ] Tien Nguyen Manh commented on NUTCH-961: Ah yes, could you explain why we need to parse it twice?

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-25 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114989#comment-15114989 ] Markus Jelsma commented on NUTCH-961: - That is probably due to the patch parsing twice. Once with BP

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-24 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114658#comment-15114658 ] Tien Nguyen Manh commented on NUTCH-961: One note on boilerpipe support: it is significantly slower

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-21 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110373#comment-15110373 ] Markus Jelsma commented on NUTCH-961: - Hello - that doesn't seem related to this issue as it doesn't

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-21 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111292#comment-15111292 ] Markus Jelsma commented on NUTCH-961: - Some news, the upstream Tika issue has been committed and

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-20 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110217#comment-15110217 ] Tien Nguyen Manh commented on NUTCH-961: I'm using this patch, NUTCH-961-1.11-1.patch, it works fine

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-19 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106570#comment-15106570 ] Markus Jelsma commented on NUTCH-961: - Yes but it requires NUTCH-1233. > Expose Tika's boilerpipe

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-19 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106783#comment-15106783 ] Markus Jelsma commented on NUTCH-961: - Update: I've updated NUTCH-1233 for current trunk as well as a

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-01-18 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106152#comment-15106152 ] Otis Gospodnetic commented on NUTCH-961: Any chance we could commit this,

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2015-04-01 Thread Alexander Kingson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391558#comment-14391558 ] Alexander Kingson commented on NUTCH-961: - Hello, Since I was not getting

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2014-02-13 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13900180#comment-13900180 ] Markus Jelsma commented on NUTCH-961: - I am sorry, I did not mean to speak for the

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2014-02-12 Thread Matzz (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13899044#comment-13899044 ] Matzz commented on NUTCH-961: - {quote}We don't use it BP anymore {quote} BP integration will

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-10-08 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13789686#comment-13789686 ] Otis Gospodnetic commented on NUTCH-961: Looks like [~kkrugler] is offering to help

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-10-08 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13789735#comment-13789735 ] Markus Jelsma commented on NUTCH-961: - Hi Otis - there are no significant improvements

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-10-08 Thread Otis Gospodnetic (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13789894#comment-13789894 ] Otis Gospodnetic commented on NUTCH-961: bq. We don't use it BP anymore What do

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-10-07 Thread Nguyen Manh Tien (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788911#comment-13788911 ] Nguyen Manh Tien commented on NUTCH-961: I used patch NUTCH-961-2.1-v2.patch for

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-03-29 Thread Miles Rowland (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13617739#comment-13617739 ] Miles Rowland commented on NUTCH-961: - Roland, thanks for porting to 2.1. I'm having an

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-03-04 Thread Roland (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592056#comment-13592056 ] Roland commented on NUTCH-961: -- Kiran, did you already start porting it to 2.x?

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-03-04 Thread kiran (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592179#comment-13592179 ] kiran commented on NUTCH-961: - No, Roland, not yet. I just switched to using the 1.x series, but I

Re: [jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-03-04 Thread Roland
Hey Kiran, drop me a line prior to starting, I will give it a try tomorrow (I hope). --Roland On 04.03.2013 14:13, kiran (JIRA) wrote: [

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-02-19 Thread kiran (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581459#comment-13581459 ] kiran commented on NUTCH-961: - Markus, do you think this patch can also work for the 2.x series?

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2013-02-19 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581530#comment-13581530 ] Markus Jelsma commented on NUTCH-961: - Should work fine, parse plugins have not changed

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-12-27 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176194#comment-13176194 ] Markus Jelsma commented on NUTCH-961: - Fixed already. See NUTCH-1233 for a patch!

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-10 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047130#comment-13047130 ] Gabriele Kahlout commented on NUTCH-961: {quote}it needs to use a different

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-10 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047490#comment-13047490 ] Ken Krugler commented on NUTCH-961: --- The way that Boilerpipe in Tika works is that it

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-10 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13047501#comment-13047501 ] Markus Jelsma commented on NUTCH-961: - Ah, that's great! Is this in 0.9 or trunk? We

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-04-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025286#comment-13025286 ] Gabriele Kahlout commented on NUTCH-961: @Markus - Thank you. Watch out for [1] in

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-04-26 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025295#comment-13025295 ] Markus Jelsma commented on NUTCH-961: - Not safely, there are still issues regarding

[jira] Commented: (NUTCH-961) Expose Tika's boilerpipe support

2011-01-27 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987575#action_12987575 ] Markus Jelsma commented on NUTCH-961: - Boilerpipe comes with several algorithms for