[
https://issues.apache.org/jira/browse/NUTCH-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1640.
--
Resolution: Fixed
Committed revision 1529802.
Thanks Mitesh.
OOM in ParseSegment Phase
[
https://issues.apache.org/jira/browse/NUTCH-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788039#comment-13788039
]
Julien Nioche commented on NUTCH-1562:
--
Hi Seb
You are right about the order from
[
https://issues.apache.org/jira/browse/NUTCH-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1562.
--
Resolution: Fixed
Committed revision 1529813.
Order of execution for scoring filters
[
https://issues.apache.org/jira/browse/NUTCH-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1606:
-
Attachment: NUTCH-1606.patch
Synchronized methods on ObjectCache + calls from
Julien Nioche created NUTCH-1652:
Summary: Avoid instanciation of MimeUtil for each Content object
created
Key: NUTCH-1652
URL: https://issues.apache.org/jira/browse/NUTCH-1652
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1653:
-
Attachment: NUTCH-1653.patch
AbstractScoringFilter
-
Key
Julien Nioche created NUTCH-1653:
Summary: AbstractScoringFilter
Key: NUTCH-1653
URL: https://issues.apache.org/jira/browse/NUTCH-1653
Project: Nutch
Issue Type: Improvement
Affects
[
https://issues.apache.org/jira/browse/NUTCH-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1653:
-
Priority: Minor (was: Major)
AbstractScoringFilter
[
https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791396#comment-13791396
]
Julien Nioche commented on NUTCH-1568:
--
It would probably be simpler to first port
[
https://issues.apache.org/jira/browse/NUTCH-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1653.
--
Resolution: Fixed
Committed revision 1530979.
thanks Seb and Markus
AbstractScoringFilter
[
https://issues.apache.org/jira/browse/NUTCH-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792635#comment-13792635
]
Julien Nioche commented on NUTCH-1606:
--
Will commit shortly unless someone objects
[
https://issues.apache.org/jira/browse/NUTCH-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1606.
--
Resolution: Fixed
Committed revision 1531833.
Check that Factory classes use the cache
[
https://issues.apache.org/jira/browse/NUTCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796626#comment-13796626
]
Julien Nioche commented on NUTCH-1371:
--
Does anyone have a bit of time to test
[
https://issues.apache.org/jira/browse/NUTCH-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796642#comment-13796642
]
Julien Nioche commented on NUTCH-1377:
--
Hi,
What about having SOLR 4 as a separate
[
https://issues.apache.org/jira/browse/NUTCH-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796650#comment-13796650
]
Julien Nioche commented on NUTCH-1656:
--
nice one. +1
ParseMeta not passed
[
https://issues.apache.org/jira/browse/NUTCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797729#comment-13797729
]
Julien Nioche commented on NUTCH-1371:
--
Hi Talat. That would be great, the latest one
[
https://issues.apache.org/jira/browse/NUTCH-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797768#comment-13797768
]
Julien Nioche commented on NUTCH-1541:
--
Hi
line 342 needs to be
{code}
while
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-656:
Attachment: NUTCH-656.v2.patch
Attached is a new patch which creates a new db status
[
https://issues.apache.org/jira/browse/NUTCH-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802733#comment-13802733
]
Julien Nioche commented on NUTCH-1640:
--
Can't quite believe I'd managed to screw what
[
https://issues.apache.org/jira/browse/NUTCH-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811182#comment-13811182
]
Julien Nioche commented on NUTCH-1640:
--
Ian,
Branches such as 1.7 are snapshots done
[
https://issues.apache.org/jira/browse/NUTCH-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche closed NUTCH-1664.
Resolution: Invalid
Ask the mailing list if you have any specific issues when running Nutch. You
Julien Nioche created NUTCH-1666:
Summary: Optimisation for BasicURLNormalizer
Key: NUTCH-1666
URL: https://issues.apache.org/jira/browse/NUTCH-1666
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1666:
-
Attachment: NUTCH-1666.patch
Optimisation for BasicURLNormalizer
[
https://issues.apache.org/jira/browse/NUTCH-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1666.
--
Resolution: Fixed
Committed revision 1540654.
Thanks Markus!
Optimisation
[
https://issues.apache.org/jira/browse/NUTCH-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1100.
--
Resolution: Fixed
Committed revision 1540758.
We'll probably move to a more generic approach
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-656:
Attachment: NUTCH-656.v3.patch
Thanks for your comments Seb. This new patch addresses some
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-656:
Attachment: (was: NUTCH-656.v3.patch)
DeleteDuplicates based on crawlDB only
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-656:
Attachment: NUTCH-656.v3.patch
correct attachment
DeleteDuplicates based on crawlDB only
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-656.
-
Resolution: Fixed
Committed revision 1541883.
Committed with a few minor changes compared
Julien Nioche created NUTCH-1668:
Summary: Remove package org.apache.nutch.indexer.solr
Key: NUTCH-1668
URL: https://issues.apache.org/jira/browse/NUTCH-1668
Project: Nutch
Issue Type: Task
[
https://issues.apache.org/jira/browse/NUTCH-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1621.
--
Resolution: Fixed
Trunk : Committed revision 1541885.
2.x : Committed revision 1541886.
I
[
https://issues.apache.org/jira/browse/NUTCH-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1668:
-
Attachment: NUTCH-1668.patch
Patch which removes the indexer.solr subpackage and deprecates
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823716#comment-13823716
]
Julien Nioche commented on NUTCH-656:
-
[~wastl-nagel] yep, I did that as part of NUTCH
[
https://issues.apache.org/jira/browse/NUTCH-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-828.
-
Resolution: Won't Fix
A better approach is to operate within the parsing step, as explained
[
https://issues.apache.org/jira/browse/NUTCH-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1607.
--
Resolution: Not A Problem
Sorry for the late reply. A simple workaround is to modify
[
https://issues.apache.org/jira/browse/NUTCH-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1558.
--
Resolution: Won't Fix
see comments
CharEncodingForConversion in ParseData's ParseMeta
[
https://issues.apache.org/jira/browse/NUTCH-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1382.
--
Resolution: Won't Fix
The SOLR indexer has been replaced with a generic indexing mechanism
[
https://issues.apache.org/jira/browse/NUTCH-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1668.
--
Resolution: Fixed
Committed revision 1543010.
Remove package org.apache.nutch.indexer.solr
[
https://issues.apache.org/jira/browse/NUTCH-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1309.
--
Resolution: Incomplete
Not clear what the problem or improvement is. Please reopen
[
https://issues.apache.org/jira/browse/NUTCH-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833588#comment-13833588
]
Julien Nioche commented on NUTCH-1297:
--
I think it was long as in 'has many URLs
[
https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833653#comment-13833653
]
Julien Nioche commented on NUTCH-1630:
--
This is a large patch which seems to affect
Julien Nioche created NUTCH-1676:
Summary: Add rudimentary SSL support to protocol-http
Key: NUTCH-1676
URL: https://issues.apache.org/jira/browse/NUTCH-1676
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1676:
-
Attachment: NUTCH-1676.patch
Add rudimentary SSL support to protocol-http
[
https://issues.apache.org/jira/browse/NUTCH-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842518#comment-13842518
]
Julien Nioche commented on NUTCH-656:
-
Please open a new issue with your patch for 2.x
[
https://issues.apache.org/jira/browse/NUTCH-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850516#comment-13850516
]
Julien Nioche commented on NUTCH-1676:
--
Have been using this for a few weeks without
[
https://issues.apache.org/jira/browse/NUTCH-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851537#comment-13851537
]
Julien Nioche commented on NUTCH-1676:
--
Thanks for your comments Markus. Shall we
[
https://issues.apache.org/jira/browse/NUTCH-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855745#comment-13855745
]
Julien Nioche commented on NUTCH-1360:
--
Looks good mate, +1 to commit
Suport
[
https://issues.apache.org/jira/browse/NUTCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864190#comment-13864190
]
Julien Nioche commented on NUTCH-1371:
--
Talat,
Moving to Maven altogether won't
[
https://issues.apache.org/jira/browse/NUTCH-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874706#comment-13874706
]
Julien Nioche commented on NUTCH-1707:
--
Doesn't https://issues.apache.org/jira/browse
[
https://issues.apache.org/jira/browse/NUTCH-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874713#comment-13874713
]
Julien Nioche commented on NUTCH-1707:
--
makes sense. We do need a generic way
[
https://issues.apache.org/jira/browse/NUTCH-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880935#comment-13880935
]
Julien Nioche commented on NUTCH-1676:
--
Hi Markus. Isn't this patch for a different
[
https://issues.apache.org/jira/browse/NUTCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890616#comment-13890616
]
Julien Nioche commented on NUTCH-1371:
--
Hi Talat
bq. Actually I have some problems
[
https://issues.apache.org/jira/browse/NUTCH-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894304#comment-13894304
]
Julien Nioche commented on NUTCH-710:
-
Nope. The version tag is more of a reminder
[
https://issues.apache.org/jira/browse/NUTCH-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896403#comment-13896403
]
Julien Nioche commented on NUTCH-1707:
--
looks fine. +1
DummyIndexingWriter
Julien Nioche created NUTCH-1729:
Summary: Upgrade to Tika 1.5
Key: NUTCH-1729
URL: https://issues.apache.org/jira/browse/NUTCH-1729
Project: Nutch
Issue Type: Task
Components
[
https://issues.apache.org/jira/browse/NUTCH-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1729:
-
Attachment: NUTCH-1729-2.x.patch
patch for 2.x
Upgrade to Tika 1.5
[
https://issues.apache.org/jira/browse/NUTCH-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1729.
--
Resolution: Fixed
Upgrade to Tika 1.5
---
Key: NUTCH-1729
[
https://issues.apache.org/jira/browse/NUTCH-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908211#comment-13908211
]
Julien Nioche commented on NUTCH-1729:
--
Thanks Markus
Trunk Committed revision
[
https://issues.apache.org/jira/browse/NUTCH-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908229#comment-13908229
]
Julien Nioche commented on NUTCH-1729:
--
Not sure it was there in the first place
[
https://issues.apache.org/jira/browse/NUTCH-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915917#comment-13915917
]
Julien Nioche commented on NUTCH-1113:
--
Well done, thanks guys!
Merging segments
[
https://issues.apache.org/jira/browse/NUTCH-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949513#comment-13949513
]
Julien Nioche commented on NUTCH-1736:
--
Looks good and seems to have fixed the issue
Julien Nioche created NUTCH-1745:
Summary: Upgrade to ElasticSearch 1.1.0
Key: NUTCH-1745
URL: https://issues.apache.org/jira/browse/NUTCH-1745
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1745:
-
Attachment: NUTCH-1745.trunk.patch
Upgrade to ElasticSearch 1.1.0
[
https://issues.apache.org/jira/browse/NUTCH-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-351.
-
Resolution: Won't Fix
This issue has received no interest in nearly 8 years.
Protocol forward
[
https://issues.apache.org/jira/browse/NUTCH-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1739.
--
Resolution: Not a Problem
Marked as not a problem.
[~yangshangchuan] please close the issue
[
https://issues.apache.org/jira/browse/NUTCH-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1745.
--
Resolution: Fixed
Fix Version/s: 1.9
Trunk = Committed revision 1584722.
Thanks
Julien Nioche created NUTCH-1747:
Summary: Use AtomicInteger as semaphore in Fetcher
Key: NUTCH-1747
URL: https://issues.apache.org/jira/browse/NUTCH-1747
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1747:
-
Attachment: NUTCH-1747-trunk.patch
Use AtomicInteger as semaphore in Fetcher
[
https://issues.apache.org/jira/browse/NUTCH-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche reassigned NUTCH-207:
---
Assignee: Julien Nioche
Will see if I can port this patch to the current version
[
https://issues.apache.org/jira/browse/NUTCH-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961002#comment-13961002
]
Julien Nioche commented on NUTCH-1735:
--
+1 Nice to simplify the code of the Fetcher
[
https://issues.apache.org/jira/browse/NUTCH-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961009#comment-13961009
]
Julien Nioche commented on NUTCH-1687:
--
I like the idea but am a bit concerned
[
https://issues.apache.org/jira/browse/NUTCH-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-385.
-
Resolution: Not a Problem
This is not a problem but a discussion of how things work
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-490:
Component/s: (was: fetcher)
parser
Extension point with filters for Neko HTML
[
https://issues.apache.org/jira/browse/NUTCH-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1297.
--
Resolution: Won't Fix
NUTCH-1687 is a nicer approach + no feedback from original contributor
[
https://issues.apache.org/jira/browse/NUTCH-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1278.
--
Resolution: Won't Fix
No follow up from contributor + solution proposed quite invasive
[
https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-827:
Component/s: (was: fetcher)
protocol
HTTP POST Authentication
[
https://issues.apache.org/jira/browse/NUTCH-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1342:
-
Component/s: (was: fetcher)
protocol
Read time out protocol-http
Julien Nioche created NUTCH-1750:
Summary: Improvement of Fetcher's reportStatus
Key: NUTCH-1750
URL: https://issues.apache.org/jira/browse/NUTCH-1750
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1750:
-
Attachment: NUTCH-1750.patch
Improvement of Fetcher's reportStatus
[
https://issues.apache.org/jira/browse/NUTCH-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961401#comment-13961401
]
Julien Nioche commented on NUTCH-1750:
--
The patch attached improves a few things
[
https://issues.apache.org/jira/browse/NUTCH-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche reopened NUTCH-385:
-
Reopening as per Chris' comments.
Chris, do you want to contribute a better description
[
https://issues.apache.org/jira/browse/NUTCH-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche closed NUTCH-1750.
Improvement of Fetcher's reportStatus
-
Key
[
https://issues.apache.org/jira/browse/NUTCH-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1750.
--
Resolution: Fixed
Thanks Sebastian
Committed revision 1585905.
Improvement of Fetcher's
[
https://issues.apache.org/jira/browse/NUTCH-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971452#comment-13971452
]
Julien Nioche commented on NUTCH-1676:
--
Hi Markus - any progress on this issue? Would
[
https://issues.apache.org/jira/browse/NUTCH-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1720.
--
Resolution: Fixed
Thanks Walter!
Committed revision 1587923.
Duplicate lines
[
https://issues.apache.org/jira/browse/NUTCH-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971471#comment-13971471
]
Julien Nioche commented on NUTCH-1147:
--
Good idea not to force it to 1 but what about
[
https://issues.apache.org/jira/browse/NUTCH-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1603.
--
Resolution: Fixed
Committed revision 1587928.
ZIP parser complains about truncated PDF file
[
https://issues.apache.org/jira/browse/NUTCH-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971486#comment-13971486
]
Julien Nioche commented on NUTCH-1521:
--
Can we close this one?
CrawlDbFilter pass
[
https://issues.apache.org/jira/browse/NUTCH-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971477#comment-13971477
]
Julien Nioche commented on NUTCH-1697:
--
Hi Markus. Actually it does matter and BTW
[
https://issues.apache.org/jira/browse/NUTCH-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1743.
--
Resolution: Fixed
Committed revision 1587935.
parsechecker to show outlinks
[
https://issues.apache.org/jira/browse/NUTCH-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971508#comment-13971508
]
Julien Nioche edited comment on NUTCH-1743 at 4/16/14 2:56 PM
[
https://issues.apache.org/jira/browse/NUTCH-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971511#comment-13971511
]
Julien Nioche commented on NUTCH-1743:
--
2-x : Committed revision 1587936
[
https://issues.apache.org/jira/browse/NUTCH-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1743:
-
Comment: was deleted
(was: Trunk Committed revision 1587935.
)
parsechecker to show outlinks
Julien Nioche created NUTCH-1757:
Summary: ParserChecker to take custom metadata as input
Key: NUTCH-1757
URL: https://issues.apache.org/jira/browse/NUTCH-1757
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1757:
-
Attachment: NUTCH-1757.patch
ParserChecker to take custom metadata as input
Julien Nioche created NUTCH-1758:
Summary: IndexChecker to send document to IndexWriters
Key: NUTCH-1758
URL: https://issues.apache.org/jira/browse/NUTCH-1758
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1758:
-
Attachment: NUTCH-1758.patch
IndexChecker to send document to IndexWriters
[
https://issues.apache.org/jira/browse/NUTCH-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971551#comment-13971551
]
Julien Nioche commented on NUTCH-1758:
--
The parameter -D doIndex=true must be either
[
https://issues.apache.org/jira/browse/NUTCH-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1760.
--
Resolution: Duplicate
Crawl script fails to find job file if called from outside bin dir
[
https://issues.apache.org/jira/browse/NUTCH-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1761.
--
Resolution: Fixed
Fix Version/s: 1.9
2.3
Thanks David. I have