[jira] [Created] (NUTCH-1888) Specify HTMLMapper to use in TikaParser

2014-11-07 Thread Julien Nioche (JIRA)
Julien Nioche created NUTCH-1888: Summary: Specify HTMLMapper to use in TikaParser Key: NUTCH-1888 URL: https://issues.apache.org/jira/browse/NUTCH-1888 Project: Nutch Issue Type:

[jira] [Resolved] (NUTCH-1887) Specify HTMLMapper to use in TikaParser

2014-11-07 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche resolved NUTCH-1887. Resolution: Fixed. Thanks. Committed in revision 1637325. Opened [NUTCH-1888] for 2.x port

[jira] [Created] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata

2014-11-07 Thread Julien Nioche (JIRA)
Julien Nioche created NUTCH-1889: Summary: Store all values from Tika metadata in Nutch metadata Key: NUTCH-1889 URL: https://issues.apache.org/jira/browse/NUTCH-1889 Project: Nutch Issue

[jira] [Updated] (NUTCH-1889) Store all values from Tika metadata in Nutch metadata

2014-11-07 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1889. Attachment: NUTCH-1889.patch. Store all values from Tika metadata in Nutch metadata

[jira] [Commented] (NUTCH-1887) Specify HTMLMapper to use in TikaParser

2014-11-07 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201916#comment-14201916 ] Hudson commented on NUTCH-1887: SUCCESS: Integrated in Nutch-trunk #2853 (See

[jira] [Updated] (NUTCH-1592) XPath works on documents parsed with parse-html but not parse-tika

2014-11-07 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1592. Attachment: NUTCH-1592.patch. Can force DOM element names to be uppercased by parse-tika XPath

[jira] [Commented] (NUTCH-1883) bin/crawl: use function to run bin/nutch and check exit value

2014-11-07 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202103#comment-14202103 ] Julien Nioche commented on NUTCH-1883: The Generator returns 1 when there aren't any

[jira] [Updated] (NUTCH-1140) index-more plugin, resetTitle method creates multiple values in the Title field

2014-11-07 Thread kaveh minooie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kaveh minooie updated NUTCH-1140. Attachment: 0001-NUTCH-1140-trunk.patch, 0001-NUTCH-1140-2.x.patch. so this is