[
https://issues.apache.org/jira/browse/NUTCH-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12930588#action_12930588
]
Sebastian Nagel commented on NUTCH-933:
---
The modifiedTime stored in a CrawlDatum
Sebastian Nagel created NUTCH-1344:
--
Summary: BasicURLNormalizer to normalize https same as http
Key: NUTCH-1344
URL: https://issues.apache.org/jira/browse/NUTCH-1344
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1344:
---
Attachment: NUTCH-1344.patch
BasicURLNormalizer to normalize https same as http
[
https://issues.apache.org/jira/browse/NUTCH-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258827#comment-13258827
]
Sebastian Nagel commented on NUTCH-1339:
BasicURLNormalizer does not remove the
[
https://issues.apache.org/jira/browse/NUTCH-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263124#comment-13263124
]
Sebastian Nagel commented on NUTCH-1293:
The content type should be added to
[
https://issues.apache.org/jira/browse/NUTCH-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273954#comment-13273954
]
Sebastian Nagel commented on NUTCH-1323:
After a small test crawl on
Sebastian Nagel created NUTCH-1383:
--
Summary: IndexingFiltersChecker to show error message instead of null pointer exception
Key: NUTCH-1383
URL: https://issues.apache.org/jira/browse/NUTCH-1383
[
https://issues.apache.org/jira/browse/NUTCH-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1383:
---
Attachment: NUTCH-1383.patch
patch for both null pointer exceptions
Sebastian Nagel created NUTCH-1389:
--
Summary: parsechecker and indexchecker to report truncated content
Key: NUTCH-1389
URL: https://issues.apache.org/jira/browse/NUTCH-1389
Project: Nutch
Sebastian Nagel created NUTCH-1415:
--
Summary: release packages to contain top level folder apache-nutch-x.x
Key: NUTCH-1415
URL: https://issues.apache.org/jira/browse/NUTCH-1415
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1415:
---
Attachment: NUTCH-1415.patch
Fix ant targets tar-src, tar-bin, zip-src, zip-bin
Also set
[
https://issues.apache.org/jira/browse/NUTCH-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1415:
---
Attachment: NUTCH-1415-2.patch
Hi Lewis, you are completely right:
the tarfileset /
[
https://issues.apache.org/jira/browse/NUTCH-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1421:
---
Attachment: NUTCH-1421-1.patch
RegexURLNormalizer to only skip rules with invalid
Sebastian Nagel created NUTCH-1422:
--
Summary: reset signature for redirects
Key: NUTCH-1422
URL: https://issues.apache.org/jira/browse/NUTCH-1422
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1422:
---
Attachment: NUTCH-1422_redir_notmodified_log.txt
reset signature for redirects
[
https://issues.apache.org/jira/browse/NUTCH-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13410905#comment-13410905
]
Sebastian Nagel commented on NUTCH-1328:
Duplicate of NUTCH-706
Sebastian Nagel created NUTCH-1436:
--
Summary: bin/nutch absent in zip package
Key: NUTCH-1436
URL: https://issues.apache.org/jira/browse/NUTCH-1436
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1436:
---
Attachment: NUTCH-1436.patch
Patch for branch-1.5.1 (if a new bin package is desired). For
Sebastian Nagel created NUTCH-1454:
--
Summary: parsing chm failed
Key: NUTCH-1454
URL: https://issues.apache.org/jira/browse/NUTCH-1454
Project: Nutch
Issue Type: Bug
Components:
[
https://issues.apache.org/jira/browse/NUTCH-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454282#comment-13454282
]
Sebastian Nagel commented on NUTCH-1467:
Since nutch.metadata.Metadata,
[
https://issues.apache.org/jira/browse/NUTCH-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-1415:
--
Assignee: Sebastian Nagel
release packages to contain top level folder
[
https://issues.apache.org/jira/browse/NUTCH-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13457753#comment-13457753
]
Sebastian Nagel commented on NUTCH-1415:
This has been fixed only for 1.5.1 and
[
https://issues.apache.org/jira/browse/NUTCH-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1415.
Resolution: Fixed
Fix Version/s: 2.1, 1.6
committed to trunk
[
https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467990#comment-13467990
]
Sebastian Nagel commented on NUTCH-706:
---
Are there objections to applying and committing the
Sebastian Nagel created NUTCH-1476:
--
Summary: SegmentReader getStats should set parsed = -1 if no parsing took place
Key: NUTCH-1476
URL: https://issues.apache.org/jira/browse/NUTCH-1476
Project:
[
https://issues.apache.org/jira/browse/NUTCH-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1476:
---
Attachment: NUTCH-1476.patch
SegmentReader getStats should set parsed = -1 if no
[
https://issues.apache.org/jira/browse/NUTCH-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel reassigned NUTCH-1252:
--
Assignee: Sebastian Nagel
SegmentReader -get shows wrong data
[
https://issues.apache.org/jira/browse/NUTCH-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471915#comment-13471915
]
Sebastian Nagel commented on NUTCH-1344:
Is there any reason why https should be
[
https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-706:
--
Fix Version/s: 2.2
Summary: Url regex normalizer: default pattern for session id
[
https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-706.
---
Resolution: Fixed
committed to trunk (revision 1396796) and 2.x (revision 1396795)
[
https://issues.apache.org/jira/browse/NUTCH-1344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1344.
Resolution: Fixed
Fix Version/s: 2.2, 1.6
committed to trunk
[
https://issues.apache.org/jira/browse/NUTCH-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13473599#comment-13473599
]
Sebastian Nagel commented on NUTCH-706:
---
The first commit was erroneously made with the wrong patch.
[
https://issues.apache.org/jira/browse/NUTCH-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1252.
Resolution: Fixed
committed to trunk (revision 1397281)
SegmentReader
[
https://issues.apache.org/jira/browse/NUTCH-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1476.
Resolution: Fixed
committed to trunk (revision 1397298)
SegmentReader
[
https://issues.apache.org/jira/browse/NUTCH-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1383.
Resolution: Fixed
committed to trunk (revision 1397308)
[
https://issues.apache.org/jira/browse/NUTCH-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13482644#comment-13482644
]
Sebastian Nagel commented on NUTCH-1467:
Hi Kiran,
thanks for the patch. After a
[
https://issues.apache.org/jira/browse/NUTCH-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1467:
---
Attachment: NUTCH-1467-TEST-1.patch
nutch 1.5.1 not able to parse multiValued metatags
[
https://issues.apache.org/jira/browse/NUTCH-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1421.
Resolution: Fixed
Fix Version/s: 2.2, 1.6
committed to trunk
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-578-TEST-1.patch
JUnit test to catch this problem and NUTCH-578: a
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-1.patch
FetchSchedule.setPageGoneSchedule is called exclusively for a
[
https://issues.apache.org/jira/browse/NUTCH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486290#comment-13486290
]
Sebastian Nagel commented on NUTCH-1482:
Markus, you are right: I remember the API
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-2.patch, NUTCH-1245-578-TEST-2.patch
Improved patches
[
https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13486484#comment-13486484
]
Sebastian Nagel commented on NUTCH-578:
---
NUTCH-1245 provides a test to catch this
[
https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-578:
--
Attachment: NUTCH-578_v5.patch
URL fetched with 403 is generated over and over again
[
https://issues.apache.org/jira/browse/NUTCH-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13487316#comment-13487316
]
Sebastian Nagel commented on NUTCH-1370:
+1
Would be nice to see also the number
[
https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13487318#comment-13487318
]
Sebastian Nagel commented on NUTCH-578:
---
Resetting the retry counter in
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488146#comment-13488146
]
Sebastian Nagel commented on NUTCH-1483:
Confirmed.
The problem is caused by the
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1483:
---
Affects Version/s: 1.6
Can't crawl filesystem with protocol-file plugin
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488200#comment-13488200
]
Sebastian Nagel commented on NUTCH-1483:
I tried with 1.x/trunk.
For 2.x URLs with
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1483:
---
Attachment: NUTCH-1483.patch
StringUtils.split(String, char) does not preserve empty parts:
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488254#comment-13488254
]
Sebastian Nagel commented on NUTCH-1483:
Rogério, can you apply the patch,
Sebastian Nagel created NUTCH-1484:
--
Summary: TableUtil unreverseURL fails on file:// URLs
Key: NUTCH-1484
URL: https://issues.apache.org/jira/browse/NUTCH-1484
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488558#comment-13488558
]
Sebastian Nagel commented on NUTCH-1483:
Thanks!
Issue with un-reversing URLs
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488558#comment-13488558
]
Sebastian Nagel edited comment on NUTCH-1483 at 11/1/12 8:55 AM:
Sebastian Nagel created NUTCH-1485:
--
Summary: TableUtil reverseURL to keep userinfo part
Key: NUTCH-1485
URL: https://issues.apache.org/jira/browse/NUTCH-1485
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488585#comment-13488585
]
Sebastian Nagel commented on NUTCH-1461:
Cf. NUTCH-1484: same error with file://
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13488935#comment-13488935
]
Sebastian Nagel commented on NUTCH-1245:
They are not duplicates but the effects
Sebastian Nagel created NUTCH-1488:
--
Summary: bin/nutch to run junit from any directory
Key: NUTCH-1488
URL: https://issues.apache.org/jira/browse/NUTCH-1488
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1488:
---
Attachment: NUTCH-1488.patch
bin/nutch to run junit from any directory
[
https://issues.apache.org/jira/browse/NUTCH-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1484:
---
Attachment: NUTCH-1484.patch
Revised patch: replaced
[
https://issues.apache.org/jira/browse/NUTCH-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13494952#comment-13494952
]
Sebastian Nagel edited comment on NUTCH-1484 at 11/11/12 7:56 PM:
[
https://issues.apache.org/jira/browse/NUTCH-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1484.
Resolution: Fixed
Committed to 2.x (rev. 1408465)
TableUtil unreverseURL
[
https://issues.apache.org/jira/browse/NUTCH-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1370:
---
Attachment: NUTCH-1370-1.x.patch
Ferdy is right: custom counters are more transparent.
Patch
[
https://issues.apache.org/jira/browse/NUTCH-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1370:
---
Attachment: NUTCH-1370-2.x-v3.patch
Hi Lewis, yes, the 1.x patch is not easily transferred
[
https://issues.apache.org/jira/browse/NUTCH-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504136#comment-13504136
]
Sebastian Nagel commented on NUTCH-1499:
Short and precise patch. However, is
Sebastian Nagel created NUTCH-1500:
--
Summary: bin/crawl fails on step solrindex with wrong path to segment
Key: NUTCH-1500
URL: https://issues.apache.org/jira/browse/NUTCH-1500
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1500:
---
Attachment: NUTCH-1500.patch
bin/crawl fails on step solrindex with wrong path to
[
https://issues.apache.org/jira/browse/NUTCH-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507944#comment-13507944
]
Sebastian Nagel commented on NUTCH-1499:
Thanks! That's a plausible reason: (let's
[
https://issues.apache.org/jira/browse/NUTCH-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1038:
---
Attachment: NUTCH-1038.patch
Port IndexingFiltersChecker to 2.0
Sebastian Nagel created NUTCH-1501:
--
Summary: Harmonize behavior of parsechecker and indexchecker
Key: NUTCH-1501
URL: https://issues.apache.org/jira/browse/NUTCH-1501
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13525439#comment-13525439
]
Sebastian Nagel commented on NUTCH-1245:
@kiran: yes, 2.x is affected since fetch
[
https://issues.apache.org/jira/browse/NUTCH-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529497#comment-13529497
]
Sebastian Nagel commented on NUTCH-1503:
Hi Lewis,
both time limit properties are
[
https://issues.apache.org/jira/browse/NUTCH-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1038:
---
Attachment: NUTCH-1038v2.patch
Hi Lewis, it's a problem of the patch: the fetch time of a
[
https://issues.apache.org/jira/browse/NUTCH-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13545480#comment-13545480
]
Sebastian Nagel commented on NUTCH-1514:
+1
But do we need a reference to the
[
https://issues.apache.org/jira/browse/NUTCH-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13552028#comment-13552028
]
Sebastian Nagel commented on NUTCH-1499:
So, a vote for won't fix. Comments?
[
https://issues.apache.org/jira/browse/NUTCH-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-813.
---
Resolution: Duplicate
The described problem is identical to that of NUTCH-578. The provided
[
https://issues.apache.org/jira/browse/NUTCH-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13552082#comment-13552082
]
Sebastian Nagel commented on NUTCH-1345:
JAVA_HOME (or NUTCH_JAVA_HOME) is
[
https://issues.apache.org/jira/browse/NUTCH-1087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13554353#comment-13554353
]
Sebastian Nagel commented on NUTCH-1087:
Hi Tristan,
thanks for the patch! The
[
https://issues.apache.org/jira/browse/NUTCH-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1500.
Resolution: Fixed
committed to trunk (rev. 1433658)
bin/crawl fails on
[
https://issues.apache.org/jira/browse/NUTCH-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556093#comment-13556093
]
Sebastian Nagel commented on NUTCH-1520:
Hi Markus,
have a look at NUTCH-1113. An
[
https://issues.apache.org/jira/browse/NUTCH-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564274#comment-13564274
]
Sebastian Nagel commented on NUTCH-1465:
Hi Tejas,
thanks and a few comments on
[
https://issues.apache.org/jira/browse/NUTCH-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564768#comment-13564768
]
Sebastian Nagel commented on NUTCH-1465:
Yes, SitemapInjector is a map-reduce
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564827#comment-13564827
]
Sebastian Nagel commented on NUTCH-1047:
As some test for the interface started to
[
https://issues.apache.org/jira/browse/NUTCH-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584550#comment-13584550
]
Sebastian Nagel commented on NUTCH-1535:
Presumably, this is caused by
[
https://issues.apache.org/jira/browse/NUTCH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584824#comment-13584824
]
Sebastian Nagel commented on NUTCH-1031:
Hi Tejas, a test of
[
https://issues.apache.org/jira/browse/NUTCH-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1535.
Resolution: Not A Problem
Great!
Crawl crashes with java.io.exception
[
https://issues.apache.org/jira/browse/NUTCH-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel closed NUTCH-1535.
--
Crawl crashes with java.io.exception
[
https://issues.apache.org/jira/browse/NUTCH-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591045#comment-13591045
]
Sebastian Nagel commented on NUTCH-1537:
Removing stuff could be done in a few
[
https://issues.apache.org/jira/browse/NUTCH-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591062#comment-13591062
]
Sebastian Nagel commented on NUTCH-1467:
Hi Kiran, any updates regarding the unit
[
https://issues.apache.org/jira/browse/NUTCH-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591773#comment-13591773
]
Sebastian Nagel commented on NUTCH-1467:
Hi Kiran, my suggestion was only about
Sebastian Nagel created NUTCH-1541:
--
Summary: Indexer plugin to write CSV
Key: NUTCH-1541
URL: https://issues.apache.org/jira/browse/NUTCH-1541
Project: Nutch
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/NUTCH-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1541:
---
Attachment: NUTCH-1541-v1.patch
First version.
NOTE: NUTCH-1047 is required, the targets for
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595252#comment-13595252
]
Sebastian Nagel commented on NUTCH-1047:
Hi Julien,
overall, all looks good. A
[
https://issues.apache.org/jira/browse/NUTCH-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595263#comment-13595263
]
Sebastian Nagel commented on NUTCH-1541:
Yes, the fields dumped are configurable.
[
https://issues.apache.org/jira/browse/NUTCH-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1541:
---
Attachment: NUTCH-1541-v2.patch
new patch including unit test
Indexer
[
https://issues.apache.org/jira/browse/NUTCH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603406#comment-13603406
]
Sebastian Nagel commented on NUTCH-1031:
+1 (nothing to complain about)
P.S.: see
[
https://issues.apache.org/jira/browse/NUTCH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603482#comment-13603482
]
Sebastian Nagel commented on NUTCH-1031:
There are differences between trunk and
[
https://issues.apache.org/jira/browse/NUTCH-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613264#comment-13613264
]
Sebastian Nagel commented on NUTCH-1501:
2.x does not log to stdout. Add to 2.x
[
https://issues.apache.org/jira/browse/NUTCH-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613277#comment-13613277
]
Sebastian Nagel commented on NUTCH-1419:
Hi Lewis,
+1 for NUTCH-1419-trunk.patch
[
https://issues.apache.org/jira/browse/NUTCH-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1419.
Resolution: Fixed
Thanks Lewis!
parsechecker and indexchecker to report