[jira] [Commented] (NUTCH-2362) Upgrade MaxMind GeoIP version in index-geoip

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293312#comment-16293312 ] Hudson commented on NUTCH-2362: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3481 (See

[jira] [Commented] (NUTCH-2362) Upgrade MaxMind GeoIP version in index-geoip

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293223#comment-16293223 ] ASF GitHub Bot commented on NUTCH-2362: --- sebastian-nagel closed pull request #262: N

[jira] [Resolved] (NUTCH-2362) Upgrade MaxMind GeoIP version in index-geoip

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2362. Resolution: Fixed > Upgrade MaxMind GeoIP version in index-geoip > -

[jira] [Commented] (NUTCH-2478) // is not a valid base URL

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293219#comment-16293219 ] Sebastian Nagel commented on NUTCH-2478: Ok, pull request [#263|https://github.com

[jira] [Commented] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.4

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293116#comment-16293116 ] Hudson commented on NUTCH-2354: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3480 (See

[jira] [Commented] (NUTCH-2480) Upgrade crawler-commons dependency to 0.9

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293115#comment-16293115 ] Hudson commented on NUTCH-2480: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3480 (See

[jira] [Resolved] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.4

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2354. Resolution: Fixed Thanks, everyone! > Upgrade Hadoop dependencies to 2.7.4 > --

[jira] [Commented] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.4

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293082#comment-16293082 ] ASF GitHub Bot commented on NUTCH-2354: --- sebastian-nagel closed pull request #261: N

[jira] [Resolved] (NUTCH-2480) Upgrade crawler-commons dependency to 0.9

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2480. Resolution: Fixed Assignee: Sebastian Nagel > Upgrade crawler-commons dependency to 0.

[jira] [Commented] (NUTCH-2480) Upgrade crawler-commons dependency to 0.9

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293080#comment-16293080 ] ASF GitHub Bot commented on NUTCH-2480: --- sebastian-nagel closed pull request #260: N

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292924#comment-16292924 ] Hudson commented on NUTCH-2439: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3479 (See

[jira] [Commented] (NUTCH-2035) Regex filter using case sensitive rules.

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292898#comment-16292898 ] Hudson commented on NUTCH-2035: --- SUCCESS: Integrated in Jenkins build Nutch-nutchgora #1599

[jira] [Resolved] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2439. Resolution: Fixed Merged into 1.x, thanks! > Upgrade to Apache Tika 1.17 >

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292837#comment-16292837 ] ASF GitHub Bot commented on NUTCH-2439: --- sebastian-nagel closed pull request #259: N

[jira] [Commented] (NUTCH-2157) Parent Issue for Addressing Miredot REST API Warnings

2017-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292830#comment-16292830 ] Lewis John McGibbney commented on NUTCH-2157: - There are still many warnings.

[jira] [Updated] (NUTCH-2157) Parent Issue for Addressing Miredot REST API Warnings

2017-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2157: Fix Version/s: (was: 1.14) 1.15 > Parent Issue for Addressing

[jira] [Resolved] (NUTCH-2181) Add Webpage for 3rd Party Connectors/Libraries to Apache Nutch

2017-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2181. - Resolution: Won't Fix Fix Version/s: 1.14 These are never kept up-to-date

[jira] [Updated] (NUTCH-2185) protocol-soda-consumer plugin

2017-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2185: Fix Version/s: (was: 1.15) 1.14 > protocol-soda-consumer plug

[jira] [Resolved] (NUTCH-2185) protocol-soda-consumer plugin

2017-12-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2185. - Resolution: Won't Fix This was a very limited use case and is not worth integratio

[jira] [Resolved] (NUTCH-2035) Regex filter using case sensitive rules.

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2035. Resolution: Fixed Assignee: Sebastian Nagel (was: Lewis John McGibbney) Fix

[jira] [Commented] (NUTCH-2035) Regex filter using case sensitive rules.

2017-12-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292802#comment-16292802 ] Hudson commented on NUTCH-2035: --- SUCCESS: Integrated in Jenkins build Nutch-trunk #3478 (See

[jira] [Updated] (NUTCH-2334) Extension point for schedulers

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2334: --- Fix Version/s: (was: 1.14) 1.15 > Extension point for schedulers >

[jira] [Updated] (NUTCH-2261) ParseSegment job does not pass metadata for content-level redirects

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2261: --- Fix Version/s: (was: 1.14) 1.15 > ParseSegment job does not pass metada

[jira] [Updated] (NUTCH-2419) Domain blacklist URL filter does not respect command-line override for file

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2419: --- Fix Version/s: (was: 1.14) 1.15 > Domain blacklist URL filter does not

[jira] [Updated] (NUTCH-2309) Scoring-Similarity Plugin raises NullPointerException when error occurs in fetching URL

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2309: --- Fix Version/s: (was: 1.14) 1.15 > Scoring-Similarity Plugin raises Null

[jira] [Updated] (NUTCH-2030) ParseZip plugin is not able to extract language from zip document,this could solve that problem.

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2030: --- Fix Version/s: (was: 1.14) 1.15 > ParseZip plugin is not able to extrac

[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1228: --- Fix Version/s: (was: 1.14) 1.15 > Change mapred.task.timeout to mapredu

[jira] [Updated] (NUTCH-1228) Change mapred.task.timeout to mapreduce.task.timeout in fetcher

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1228: --- Fix Version/s: 2.4 > Change mapred.task.timeout to mapreduce.task.timeout in fetcher > ---

[jira] [Updated] (NUTCH-2312) Support PhantomJS as a WebDriver in protocol-selenium

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2312: --- Fix Version/s: (was: 1.14) 1.15 > Support PhantomJS as a WebDriver in p

[jira] [Updated] (NUTCH-2247) Protocol resolver

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2247: --- Fix Version/s: (was: 1.14) 1.15 > Protocol resolver > -

[jira] [Updated] (NUTCH-2133) Transfer Selenium Documentation to WIki

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2133: --- Fix Version/s: (was: 1.14) 1.15 > Transfer Selenium Documentation to WI

[jira] [Updated] (NUTCH-2188) While crawling with solr url (kerberos enabled) Error: org.apache.solr.common.SolrException: Unauthorized

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2188: --- Fix Version/s: (was: 1.14) 1.15 > While crawling with solr url (kerbero

[jira] [Updated] (NUTCH-2033) parse-tika skips valid documents.

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2033: --- Fix Version/s: (was: 1.14) 1.15 > parse-tika skips valid documents. > -

[jira] [Updated] (NUTCH-2369) Create a new GraphGenerator Tool for writing Nutch Records as a Full Web Graph

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2369: --- Fix Version/s: (was: 1.14) 1.15 > Create a new GraphGenerator Tool for

[jira] [Commented] (NUTCH-2157) Parent Issue for Addressing Miredot REST API Warnings

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292692#comment-16292692 ] Sebastian Nagel commented on NUTCH-2157: There is a successful commit. Is this fix

[jira] [Updated] (NUTCH-2156) Dump via Services end point

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2156: --- Fix Version/s: (was: 1.14) 1.15 > Dump via Services end point > --

[jira] [Updated] (NUTCH-2151) Service endpoint for REST API

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2151: --- Fix Version/s: (was: 1.14) 1.15 > Service endpoint for REST API > -

[jira] [Updated] (NUTCH-2147) MetadataScoringFilter for Nutch

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2147: --- Fix Version/s: (was: 1.14) 1.15 > MetadataScoringFilter for Nutch > ---

[jira] [Updated] (NUTCH-1943) Form authentication should not be global and ignore

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1943: --- Fix Version/s: (was: 1.14) 1.15 > Form authentication should not be glo

[jira] [Updated] (NUTCH-2032) Plugin to index the raw content of a readable document.

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2032: --- Fix Version/s: (was: 1.14) 1.15 > Plugin to index the raw content of a

[jira] [Updated] (NUTCH-2363) Fetcher support for reading and setting cookies

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2363: --- Fix Version/s: (was: 1.14) 1.15 > Fetcher support for reading and setti

[jira] [Updated] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2162: --- Fix Version/s: (was: 1.14) 1.15 > Nutch Webapp Crawl fails as it tries

[jira] [Updated] (NUTCH-2181) Add Webpage for 3rd Party Connectors/Libraries to Apache Nutch

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2181: --- Fix Version/s: (was: 1.14) > Add Webpage for 3rd Party Connectors/Libraries to Apache Nutc

[jira] [Updated] (NUTCH-2214) Index clean to be flexible on what it deletes

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2214: --- Fix Version/s: (was: 1.14) 1.15 > Index clean to be flexible on what it

[jira] [Updated] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2209: --- Fix Version/s: (was: 1.14) 1.15 > Improved Tokenization for Similarity

[jira] [Updated] (NUTCH-2185) protocol-soda-consumer plugin

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2185: --- Fix Version/s: (was: 1.14) 1.15 > protocol-soda-consumer plugin > -

[jira] [Updated] (NUTCH-2265) Write A Test Package for Scoring Similarity

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2265: --- Fix Version/s: (was: 1.14) 1.15 > Write A Test Package for Scoring Simi

[jira] [Updated] (NUTCH-2292) Mavenize the build for nutch-core and nutch-plugins

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2292: --- Fix Version/s: (was: 1.14) 1.15 > Mavenize the build for nutch-core and

[jira] [Commented] (NUTCH-2362) Upgrade MaxMind GeoIP version in index-geoip

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292680#comment-16292680 ] ASF GitHub Bot commented on NUTCH-2362: --- sebastian-nagel opened a new pull request #

[jira] [Created] (NUTCH-2482) index-geoip not to add null values to document fields

2017-12-15 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-2482: -- Summary: index-geoip not to add null values to document fields Key: NUTCH-2482 URL: https://issues.apache.org/jira/browse/NUTCH-2482 Project: Nutch Issue

[jira] [Updated] (NUTCH-2412) Exchange component for indexing job

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2412: --- Fix Version/s: (was: 1.14) 1.15 > Exchange component for indexing job >

[jira] [Updated] (NUTCH-2249) WordNet Integration for Cosine Similarity

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2249: --- Fix Version/s: (was: 1.14) 1.15 > WordNet Integration for Cosine Simila

[jira] [Updated] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.4

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2354: --- Summary: Upgrade Hadoop dependencies to 2.7.4 (was: Upgrade Hadoop dependencies to 2.7.3) >

[jira] [Updated] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.4

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2354: --- Patch Info: Patch Available > Upgrade Hadoop dependencies to 2.7.4 > -

[jira] [Commented] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.3

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292630#comment-16292630 ] ASF GitHub Bot commented on NUTCH-2354: --- sebastian-nagel opened a new pull request #

[jira] [Commented] (NUTCH-2480) Upgrade crawler-commons dependency to 0.9

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292556#comment-16292556 ] ASF GitHub Bot commented on NUTCH-2480: --- sebastian-nagel opened a new pull request #

Re: [DISCUSS] Release 1.14?

2017-12-15 Thread Sebastian Nagel
Ok, the pull request for the upgrade to Tika 1.17 is ready: https://issues.apache.org/jira/browse/NUTCH-2439 https://github.com/apache/nutch/pull/259 Thanks, Sebastian On 12/14/2017 10:44 AM, Julien Nioche wrote: > FYI Tika 1.17 has just been released > http://www.apache.org/dist/tika/CHANG

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292496#comment-16292496 ] ASF GitHub Bot commented on NUTCH-2439: --- lewismc commented on issue #259: NUTCH-2439

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292486#comment-16292486 ] Sebastian Nagel commented on NUTCH-2439: Ok, got it: of course, I have to add a ti

[jira] [Updated] (NUTCH-2481) HostDatum deltas(previous step statistics)

2017-12-15 Thread Semyon Semyonov (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Semyon Semyonov updated NUTCH-2481: --- Description: To allow the usage of previous step statistics (deltas of fetched, unfetched, etc.) i

[jira] [Updated] (NUTCH-2481) HostDatum deltas(previous step statistics)

2017-12-15 Thread Semyon Semyonov (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Semyon Semyonov updated NUTCH-2481: --- Description: To allow the usage of previous step statistics (deltas of fetched, unfetched, etc.) i

[jira] [Created] (NUTCH-2481) HostDatum deltas(previous step statistics)

2017-12-15 Thread Semyon Semyonov (JIRA)
Semyon Semyonov created NUTCH-2481: -- Summary: HostDatum deltas(previous step statistics) Key: NUTCH-2481 URL: https://issues.apache.org/jira/browse/NUTCH-2481 Project: Nutch Issue Type: Impr

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292476#comment-16292476 ] Sebastian Nagel commented on NUTCH-2439: Of course, I get the warning about Tesser

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292474#comment-16292474 ] ASF GitHub Bot commented on NUTCH-2439: --- sebastian-nagel opened a new pull request #

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292469#comment-16292469 ] Markus Jelsma commented on NUTCH-2439: -- Weird, I only got: Dec 15, 2017 1:45:42 PM

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292450#comment-16292450 ] Sebastian Nagel commented on NUTCH-2439: Really? I'm almost done with a PR for th

[jira] [Commented] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292421#comment-16292421 ] Markus Jelsma commented on NUTCH-2439: -- Note, since 1.17, all but one of the warnings

[jira] [Commented] (NUTCH-2478) // is not a valid base URL

2017-12-15 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292419#comment-16292419 ] Markus Jelsma commented on NUTCH-2478: -- I prefer your patch, it also carries a test.

[jira] [Created] (NUTCH-2480) Upgrade crawler-commons dependency to 0.9

2017-12-15 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-2480: -- Summary: Upgrade crawler-commons dependency to 0.9 Key: NUTCH-2480 URL: https://issues.apache.org/jira/browse/NUTCH-2480 Project: Nutch Issue Type: Impro

[jira] [Comment Edited] (NUTCH-2439) Upgrade to Apache Tika 1.17

2017-12-15 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208309#comment-16208309 ] Sebastian Nagel edited comment on NUTCH-2439 at 12/15/17 10:45 AM: -

[jira] [Commented] (NUTCH-2415) Create a JEXL based IndexingFilter

2017-12-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292231#comment-16292231 ] ASF GitHub Bot commented on NUTCH-2415: --- sebastian-nagel commented on a change in pu