Julien Nioche created NUTCH-1888:
Summary: Specify HTMLMapper to use in TikaParser
Key: NUTCH-1888
URL: https://issues.apache.org/jira/browse/NUTCH-1888
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-1887.
--
Resolution: Fixed
Thanks. Committed in revision 1637325.
Opened [NUTCH-1888] for 2.x port
Julien Nioche created NUTCH-1889:
Summary: Store all values from Tika metadata in Nutch metadata
Key: NUTCH-1889
URL: https://issues.apache.org/jira/browse/NUTCH-1889
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1889:
-
Attachment: NUTCH-1889.patch
Store all values from Tika metadata in Nutch metadata
[
https://issues.apache.org/jira/browse/NUTCH-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201916#comment-14201916
]
Hudson commented on NUTCH-1887:
---
SUCCESS: Integrated in Nutch-trunk #2853 (See
[
https://issues.apache.org/jira/browse/NUTCH-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1592:
-
Attachment: NUTCH-1592.patch
Can force DOM element names to be uppercased by parse-tika
XPath
[
https://issues.apache.org/jira/browse/NUTCH-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202103#comment-14202103
]
Julien Nioche commented on NUTCH-1883:
--
The Generator returns 1 when there aren't any
[
https://issues.apache.org/jira/browse/NUTCH-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
kaveh minooie updated NUTCH-1140:
-
Attachment: 0001-NUTCH-1140-trunk.patch
0001-NUTCH-1140-2.x.patch
so this is