Tien Nguyen Manh created NUTCH-1702:
---
Summary: Port HostNormalizer to 2.x
Key: NUTCH-1702
URL: https://issues.apache.org/jira/browse/NUTCH-1702
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1702:
Attachment: NUTCH-1702.patch
Port HostNormalizer to 2.x
--
[
https://issues.apache.org/jira/browse/NUTCH-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1702:
Fix Version/s: 2.3
Port HostNormalizer to 2.x
--
[
https://issues.apache.org/jira/browse/NUTCH-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1702:
Attachment: NUTCH-1702.patch
Port HostNormalizer to 2.x
--
[
https://issues.apache.org/jira/browse/NUTCH-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1702:
Attachment: (was: NUTCH-1702.patch)
Port HostNormalizer to 2.x
Canan Girgin created NUTCH-1703:
---
Summary: Nutch ignores alt text of images
Key: NUTCH-1703
URL: https://issues.apache.org/jira/browse/NUTCH-1703
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Canan Girgin updated NUTCH-1703:
Attachment: NUTCH_1703.patch
Nutch ignores alt text of images
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1703:
-
Fix Version/s: 1.8
Nutch ignores alt text of images
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871904#comment-13871904
]
Markus Jelsma commented on NUTCH-1703:
--
Can you provide a test for
[
https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-1568.
-
Resolution: Fixed
Committed @revision 1558349 in 2.x
[~talat], thank you for
[
https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1655:
Attachment: NUTCH-1655-v3.patch
Updated patch to correct formatting in confi
[
https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871998#comment-13871998
]
Markus Jelsma commented on NUTCH-1655:
--
Hi, I haven't read the code, but incorporating
[
https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872009#comment-13872009
]
Talat UYARER commented on NUTCH-1655:
-
Hi [~markus17],
I have already included
[
https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872010#comment-13872010
]
Hudson commented on NUTCH-1568:
---
SUCCESS: Integrated in Nutch-nutchgora #887 (See
[
https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872015#comment-13872015
]
Markus Jelsma commented on NUTCH-1655:
--
Nice :)
Indexer Plugin for Elastic Search
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872106#comment-13872106
]
Canan Girgin edited comment on NUTCH-1703 at 1/15/14 2:18 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872106#comment-13872106
]
Canan Girgin commented on NUTCH-1703:
-
OK. A new patch has been added which
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Canan Girgin updated NUTCH-1703:
Attachment: NUTCH_1703_v2.patch
Nutch ignores alt text of images
[
https://issues.apache.org/jira/browse/NUTCH-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872116#comment-13872116
]
Markus Jelsma commented on NUTCH-1703:
--
How is this patch made? I cannot patch the
[
https://issues.apache.org/jira/browse/NUTCH-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872137#comment-13872137
]
Lewis John McGibbney commented on NUTCH-1701:
-
Configurable sounds good. I'm
[
https://issues.apache.org/jira/browse/NUTCH-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1704:
Attachment: NUTCH-1704.patch
Port DomainBlacklist urlfilter to 2.x
[
https://issues.apache.org/jira/browse/NUTCH-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1699:
Attachment: NUTCH-1699v2-2.x.patch
Patch for 2.x
Tika Parser - Image Parse Bug
[
https://issues.apache.org/jira/browse/NUTCH-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yasin Kılınç updated NUTCH-1662:
Attachment: NUTCH-1662.patch
I created an indexer plugin for SolrCloud. This patch can be applied after
[
https://issues.apache.org/jira/browse/NUTCH-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1478:
Attachment: NUTCH-1478-parse-v2.patch
I ported parse-metatags to 2.x; this patch supports
[
https://issues.apache.org/jira/browse/NUTCH-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872198#comment-13872198
]
Hudson commented on NUTCH-1699:
---
SUCCESS: Integrated in Nutch-nutchgora #888 (See
[
https://issues.apache.org/jira/browse/NUTCH-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872199#comment-13872199
]
Alparslan Avcı commented on NUTCH-1674:
---
Hi [~memnoh], the patch is prepared for 2.x
[
https://issues.apache.org/jira/browse/NUTCH-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872205#comment-13872205
]
Hudson commented on NUTCH-1699:
---
SUCCESS: Integrated in Nutch-trunk #2491 (See
[
https://issues.apache.org/jira/browse/NUTCH-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tien Nguyen Manh updated NUTCH-1705:
Attachment: NUTCH-1705.patch
Make configuration option for HtmlParser TikaParser to
Tien Nguyen Manh created NUTCH-1705:
---
Summary: Make configuration option for HtmlParser TikaParser to
extract text or title for noIndex page
Key: NUTCH-1705
URL: