[jira] [Commented] (NUTCH-2222) re-fetch deletes all metadata except _csh_ and _rs_

2016-02-27 Thread Adnane B. (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170580#comment-15170580 ] Adnane B. commented on NUTCH-: Please let me know if this issue does not exist with any other

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170557#comment-15170557 ] ASF GitHub Bot commented on NUTCH-961: Github user lewismc commented on a diff in the pull request:

[GitHub] nutch pull request: Add the boilerpipe parsing adapted from NUTCH-...

2016-02-27 Thread lewismc
Github user lewismc commented on a diff in the pull request: https://github.com/apache/nutch/pull/92#discussion_r54332201 --- Diff: src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaParser.java --- @@ -109,7 +114,18 @@ public Parse getParse(String url, WebPage page) {

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170556#comment-15170556 ] ASF GitHub Bot commented on NUTCH-961: Github user lewismc commented on a diff in the pull request:

[GitHub] nutch pull request: Add the boilerpipe parsing adapted from NUTCH-...

2016-02-27 Thread lewismc
Github user lewismc commented on a diff in the pull request: https://github.com/apache/nutch/pull/92#discussion_r54332193 --- Diff: src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/BoilerpipeExtractorRepository.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170555#comment-15170555 ] ASF GitHub Bot commented on NUTCH-961: Github user lewismc commented on a diff in the pull request:

[GitHub] nutch pull request: Add the boilerpipe parsing adapted from NUTCH-...

2016-02-27 Thread lewismc
Github user lewismc commented on a diff in the pull request: https://github.com/apache/nutch/pull/92#discussion_r54332155 --- Diff: src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/BoilerpipeExtractorRepository.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2016-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170554#comment-15170554 ] ASF GitHub Bot commented on NUTCH-961: Github user lewismc commented on a diff in the pull request:

[GitHub] nutch pull request: Add the boilerpipe parsing adapted from NUTCH-...

2016-02-27 Thread lewismc
Github user lewismc commented on a diff in the pull request: https://github.com/apache/nutch/pull/92#discussion_r54332145 --- Diff: conf/nutch-default.xml --- @@ -876,6 +876,19 @@ + + + + tika.boilerpipe + false --- End diff --