[
https://issues.apache.org/jira/browse/NUTCH-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-454.
---
Closing all resolved issues with a non-fixed status.
> Review Debug Level Log Guards
> --
[
https://issues.apache.org/jira/browse/NUTCH-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-934.
---
Closing all resolved issues with a non-fixed status.
> Upgrade to Tika 0.8
> ---
>
>
[
https://issues.apache.org/jira/browse/NUTCH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-692.
---
Closing all resolved issues with a non-fixed status.
> AlreadyBeingCreatedException with Hadoop 0.19
> --
[
https://issues.apache.org/jira/browse/NUTCH-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-733.
---
Closing all resolved issues with a non-fixed status.
> plain text view of cached files ignores HTML encod
[
https://issues.apache.org/jira/browse/NUTCH-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-778.
---
Closing all resolved issues with a non-fixed status.
> Running Nutch On linux having whoami exception?
>
[
https://issues.apache.org/jira/browse/NUTCH-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-736.
---
Closing all resolved issues with a non-fixed status.
> how long it takes nutch 1.0 to fetch
> ---
[
https://issues.apache.org/jira/browse/NUTCH-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-980.
-
Resolution: Fixed
Committed for trunk in rev. 1092062.
> Fix IllegalAccessError with slf4j used i
[
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-976:
Attachment: NUTCH-976-1.3-2.patch
NUTCH-976-trunk-2.patch
Patches for 1.3 and trunk.
[
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019749#comment-13019749
]
Markus Jelsma commented on NUTCH-976:
-
All seems to be alright now for trunk and 1.3, a
[
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019759#comment-13019759
]
Markus Jelsma commented on NUTCH-975:
-
Great stuff Julien! I'll also add the header for
[
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-975:
Attachment: NUTCH-975-trunk-bin.patch
Here's the patch for bin/nutch in trunk.
> Fix missing/wrong
[
https://issues.apache.org/jira/browse/NUTCH-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-975.
-
Resolution: Fixed
Assignee: Markus Jelsma
Everything builds and runs fine with these patches
[
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-976:
Description:
All Solr properties are now consistently using solr.* instead of solrindex.*.
This has
[
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-976.
-
Resolution: Fixed
Fixed in 1.3 in rev 1092084 and for trunk in rev 1092085. Thanks Julien for
com
[
https://issues.apache.org/jira/browse/NUTCH-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-977.
-
Resolution: Fixed
Committed for trunk in rev. 1092090 and for 1.3 in rev. 1092091.
> SolrMappingReade
[
https://issues.apache.org/jira/browse/NUTCH-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-922.
---
Resolution: Not A Problem
No problem, unmapped fields are written anyway.
> SolrWriter should log sou
[
https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma reopened NUTCH-386:
-
This is one of the closed legacy issues. I reopened it so Richard can actually
attach the patch.
> P
[
https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-961:
Attachment: NUTCH-961-1.3-tikaparser.patch
BoilerpipeExtractorRepository.java
Here's
Parse-tika throws some URL's away
-
Key: NUTCH-984
URL: https://issues.apache.org/jira/browse/NUTCH-984
Project: Nutch
Issue Type: Bug
Components: parser
Affects Versions: 1.3, 2.0
Re
[
https://issues.apache.org/jira/browse/NUTCH-984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-984:
Description:
For some reason using parse-tika a crawl just wouldn't dive into some website
news arc
[
https://issues.apache.org/jira/browse/NUTCH-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021113#comment-13021113
]
Markus Jelsma commented on NUTCH-984:
-
Yes, I can test these URLs with tika-parsers 0.9
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021712#comment-13021712
]
Markus Jelsma commented on NUTCH-985:
-
This is similar to another issue described today
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021739#comment-13021739
]
Markus Jelsma commented on NUTCH-985:
-
Yes, something has to be done. What did you atta
[
https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-961:
Attachment: (was: BoilerpipeExtractorRepository.java)
> Expose Tika's boilerpipe support
> -
[
https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-961:
Attachment: BoilerpipeExtractorRepository.java
Here's the correct file.
> Expose Tika's boilerpipe
Dedup fails due to date format (long)
-
Key: NUTCH-986
URL: https://issues.apache.org/jira/browse/NUTCH-986
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 1.3
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-985:
Affects Version/s: 1.3
> Problems indexing lastModifiedDate in Solr
> --
Support HTTP auth for Solr communication
Key: NUTCH-987
URL: https://issues.apache.org/jira/browse/NUTCH-987
Project: Nutch
Issue Type: Improvement
Components: indexer
Reporter:
[
https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025295#comment-13025295
]
Markus Jelsma commented on NUTCH-961:
-
Not safely, there are still issues regarding HTM
[
https://issues.apache.org/jira/browse/NUTCH-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021113#comment-13021113
]
Markus Jelsma edited comment on NUTCH-984 at 4/26/11 4:02 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-985:
Affects Version/s: 2.0
Fix Version/s: 2.0
1.3
Assignee: M
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-983:
Affects Version/s: 1.3
Fix Version/s: 1.3
> Upgrade SolrJ
> -
>
>
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-986:
Affects Version/s: 2.0
Fix Version/s: 2.0
Assignee: Markus Jelsma
> Dedup fails
index-feed plugin also doesn't use proper date fields
-
Key: NUTCH-988
URL: https://issues.apache.org/jira/browse/NUTCH-988
Project: Nutch
Issue Type: Improvement
Affects Versions: 1.3,
[
https://issues.apache.org/jira/browse/NUTCH-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-987:
Attachment: NUTCH-987-1.3-hack.patch
Attached nasty hack for the sake of not losing it.
> Support H
index-basic plugin also uses invalid date format for Solr
-
Key: NUTCH-989
URL: https://issues.apache.org/jira/browse/NUTCH-989
Project: Nutch
Issue Type: Improvement
Affects Versio
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-985:
Attachment: NUTCH-985.1.3-1.patch
Here's a working patch. It adds a date fieldtype to the schema and
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-985:
Summary: MoreIndexingFilter doesn't use properly formatted date fields for
Solr (was: Problems inde
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-989:
Summary: index-basic plugin doesn (was: index-basic plugin also uses
invalid date format for Solr)
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-989:
Description: The index-basic plugin actually sends over a properly
formatted date with millis but th
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-986:
Patch Info: [Patch Available]
> Dedup fails due to date format (long)
>
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-986:
Attachment: NUTCH-986-1.3-1.patch
Here's a patch. It leaves all code intact but only converts the in
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-986:
Attachment: NUTCH-986-trunk-1.patch
Patch for trunk!
> Dedup fails due to date format (long)
>
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-985:
Attachment: NUTCH-985-trunk-1.patch
Patch for trunk!
> MoreIndexingFilter doesn't use properly form
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025809#comment-13025809
]
Markus Jelsma commented on NUTCH-983:
-
Can someone take a look at this? Updating ivy fr
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025818#comment-13025818
]
Markus Jelsma commented on NUTCH-983:
-
SolrJ itself comes in nicely but it seems it com
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025820#comment-13025820
]
Markus Jelsma commented on NUTCH-990:
-
What does the log say? I tried my home_dir with
SolrDedup must issue a commit
-
Key: NUTCH-991
URL: https://issues.apache.org/jira/browse/NUTCH-991
Project: Nutch
Issue Type: Improvement
Components: indexer
Affects Versions: 1.3, 2.0
R
[
https://issues.apache.org/jira/browse/NUTCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma reassigned NUTCH-991:
---
Assignee: Markus Jelsma
> SolrDedup must issue a commit
> -
>
>
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025835#comment-13025835
]
Markus Jelsma commented on NUTCH-985:
-
You are right. But various index-* plugins write
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025837#comment-13025837
]
Markus Jelsma commented on NUTCH-983:
-
I can give it a try. What does the exclude exact
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025837#comment-13025837
]
Markus Jelsma edited comment on NUTCH-983 at 4/27/11 3:28 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-979:
Attachment: SolrClean.java
Here's a WIP in case I accidentally send it all to the litter bin. Thi
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025869#comment-13025869
]
Markus Jelsma commented on NUTCH-986:
-
If there are no objections I'll commit this one
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025956#comment-13025956
]
Markus Jelsma commented on NUTCH-990:
-
Could you post only the relevant parts of the lo
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-986.
-
Resolution: Fixed
Committed for 1.3 in rev. 1097390 and for trunk in rev. 1097391.
> Dedup fails due
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-986:
Attachment: NUTCH-986-1.3-2.patch
NUTCH-986-trunk-2.patch
Previous patch was incorre
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026255#comment-13026255
]
Markus Jelsma commented on NUTCH-986:
-
Recommitted 1.3 in rev 1097410 and for trunk in
[
https://issues.apache.org/jira/browse/NUTCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-991:
Attachment: NUTCH-991-trunk-1.patch
NUTCH-991-1.3-1.patch
Added the commit operation
[
https://issues.apache.org/jira/browse/NUTCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-991.
-
Resolution: Fixed
Committed in 1.3 rev 1097415 and trunk 1097416.
> SolrDedup must issue a commit
SolrDedup is broken in trunk
Key: NUTCH-992
URL: https://issues.apache.org/jira/browse/NUTCH-992
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 2.0
Reporter: Markus
[
https://issues.apache.org/jira/browse/NUTCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-991:
Patch Info: [Patch Available]
> SolrDedup must issue a commit
> -
>
>
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027033#comment-13027033
]
Markus Jelsma commented on NUTCH-990:
-
guess we can mark it as won't fix then and close
[
https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-990.
-
Resolution: Won't Fix
> protocol-httpclient fails with short pages
> -
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma reassigned NUTCH-989:
---
Assignee: Markus Jelsma
> index-basic plugin doesn't use Solr date fieldType
> ---
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-989:
Fix Version/s: 1.3
The supplied Solr schema must use a date fieldType instead of long. If not,
dedu
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027597#comment-13027597
]
Markus Jelsma commented on NUTCH-983:
-
It works as expected in trunk but I can't seem t
[
https://issues.apache.org/jira/browse/NUTCH-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-710:
Fix Version/s: 2.0
Putting a useful issue back on the radar. Fix for 2.0?
> Support for rel="canoni
[
https://issues.apache.org/jira/browse/NUTCH-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-717:
Fix Version/s: 2.0
Back on the radar for 2.0?
> Make Nutch Solr integration easier
> -
[
https://issues.apache.org/jira/browse/NUTCH-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027604#comment-13027604
]
Markus Jelsma commented on NUTCH-783:
-
What's this? Shouldn't it be closed?
> IndexerC
[
https://issues.apache.org/jira/browse/NUTCH-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027613#comment-13027613
]
Markus Jelsma commented on NUTCH-783:
-
You're right. Shouldn't it be marked for a versi
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027597#comment-13027597
]
Markus Jelsma edited comment on NUTCH-983 at 5/2/11 12:54 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-987:
Fix Version/s: 2.0
> Support HTTP auth for Solr communication
>
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-989.
-
Resolution: Fixed
Date fieldType added and updated tstamp field to use the new fieldType.
Committ
[
https://issues.apache.org/jira/browse/NUTCH-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-989.
---
> index-basic plugin doesn't use Solr date fieldType
> --
Fine tune Solr schema
-
Key: NUTCH-994
URL: https://issues.apache.org/jira/browse/NUTCH-994
Project: Nutch
Issue Type: Improvement
Components: indexer
Affects Versions: 1.3, 2.0
Reporter: Markus
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029365#comment-13029365
]
Markus Jelsma commented on NUTCH-983:
-
That works indeed (seems the exclusions must not
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029399#comment-13029399
]
Markus Jelsma commented on NUTCH-983:
-
I'll take a look at it for trunk, hopefully tomo
[
https://issues.apache.org/jira/browse/NUTCH-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029470#comment-13029470
]
Markus Jelsma commented on NUTCH-983:
-
Great stuff! I just got my first svn conflict in
Indexer adds solr.commit.size+1 docs
Key: NUTCH-996
URL: https://issues.apache.org/jira/browse/NUTCH-996
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 1.3, 2.0
[
https://issues.apache.org/jira/browse/NUTCH-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030549#comment-13030549
]
Markus Jelsma commented on NUTCH-887:
-
Julien committed NUTCH-888 for 1.3 and trunk. I
[
https://issues.apache.org/jira/browse/NUTCH-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-977.
---
> SolrMappingReader uses hardcoded configuration parameter name for mapping file
> ---
[
https://issues.apache.org/jira/browse/NUTCH-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-991.
---
> SolrDedup must issue a commit
> -
>
> Key: NUTCH-991
>
[
https://issues.apache.org/jira/browse/NUTCH-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-980.
---
> Fix IllegalAccessError with slf4j used in Solrj.
>
>
>
[
https://issues.apache.org/jira/browse/NUTCH-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-912.
---
> MoreIndexingFilter does not parse docx and xlsx date formats
> -
[
https://issues.apache.org/jira/browse/NUTCH-963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-963.
---
> Add support for deleting Solr documents with STATUS_DB_GONE in CrawlDB (404
> urls)
> -
[
https://issues.apache.org/jira/browse/NUTCH-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-935.
---
> remove unnecessary /./ in basic urlnormalizer
> -
>
>
[
https://issues.apache.org/jira/browse/NUTCH-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-976.
---
> Rename properties solrindex.* to solr.*
>
>
>
[
https://issues.apache.org/jira/browse/NUTCH-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-986.
---
> Dedup fails due to date format (long)
> -
>
> Key: N
[
https://issues.apache.org/jira/browse/NUTCH-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-897.
---
> Subcollection requires blacklist element
>
>
>
[
https://issues.apache.org/jira/browse/NUTCH-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-964.
---
> ERROR conf.Configuration - Failed to set setXIncludeAware(true)
> --
[
https://issues.apache.org/jira/browse/NUTCH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030660#comment-13030660
]
Markus Jelsma commented on NUTCH-585:
-
Thanks for mentioning Wim. This patch can be use
[
https://issues.apache.org/jira/browse/NUTCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma reassigned NUTCH-994:
---
Assignee: Markus Jelsma
> Fine tune Solr schema
> -
>
> Ke
[
https://issues.apache.org/jira/browse/NUTCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-994:
Attachment: NUTCH-994-all.patch
This patch changes:
* non-analyzed field types to their Trie-based
[
https://issues.apache.org/jira/browse/NUTCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-994:
Patch Info: [Patch Available]
> Fine tune Solr schema
> -
>
> Ke
[
https://issues.apache.org/jira/browse/NUTCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030969#comment-13030969
]
Markus Jelsma edited comment on NUTCH-994 at 5/10/11 12:34 AM:
--
[
https://issues.apache.org/jira/browse/NUTCH-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030969#comment-13030969
]
Markus Jelsma edited comment on NUTCH-994 at 5/10/11 12:35 AM:
--
[
https://issues.apache.org/jira/browse/NUTCH-996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-996.
-
Resolution: Fixed
Committed for trunk in rev. 1101279 and for 1.3 in 1101280.
Commit.size might be
[
https://issues.apache.org/jira/browse/NUTCH-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031682#comment-13031682
]
Markus Jelsma commented on NUTCH-985:
-
From dev@nutch
> For now a quick fix for the mo
[
https://issues.apache.org/jira/browse/NUTCH-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035297#comment-13035297
]
Markus Jelsma commented on NUTCH-997:
-
Good work, especially that it supersedes the oth