[jira] [Updated] (NUTCH-956) solrindex issues

2011-07-12 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-956: - Attachment: solr.patch2 - NPE related to content-type field - tld field in Solr schema - string comparison in

[jira] Updated: (NUTCH-965) Skip parsing for truncated documents

2011-02-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-965: - Summary: Skip parsing for truncated documents (was: Parsing takes up 100% CPU) Skip parsing for truncated

[jira] Updated: (NUTCH-965) Parsing takes up 100% CPU

2011-02-08 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-965: - Attachment: parserJob.patch In the parser mapper, compare Content-Length header to the size of the content
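The check described in this snippet, comparing the Content-Length header to the size of the fetched content, could look roughly like the sketch below. This is not the code from parserJob.patch: the class name `TruncationCheck` and the method `isTruncated` are hypothetical, and the choice to treat a missing or unparsable header as "not truncated" is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;

public class TruncationCheck {

    /**
     * Hypothetical helper: returns true when the declared Content-Length
     * is larger than the number of bytes actually stored for the page.
     * A missing or malformed header is treated as "not truncated" (assumption).
     */
    public static boolean isTruncated(String contentLengthHeader, byte[] content) {
        if (contentLengthHeader == null || content == null) {
            return false;
        }
        try {
            long declared = Long.parseLong(contentLengthHeader.trim());
            return declared > content.length;
        } catch (NumberFormatException e) {
            // Header is present but not a number; nothing to compare against.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] body = "partial body".getBytes(StandardCharsets.UTF_8);
        // Declared size far exceeds the bytes we hold, so parsing would be skipped.
        System.out.println(isTruncated("65536", body));
        // Declared size matches the bytes we hold, so the document is complete.
        System.out.println(isTruncated("12", body));
    }
}
```

A parser mapper could call such a helper before handing the content to the parser and skip the document when it returns true.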

[jira] Commented: (NUTCH-955) Ivy configuration

2011-01-18 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983125#action_12983125 ] Alexis commented on NUTCH-955: -- Sorry please disregard the nutch.root first bullet in the

[jira] Created: (NUTCH-956) soldindex issues

2011-01-13 Thread Alexis (JIRA)
soldindex issues Key: NUTCH-956 URL: https://issues.apache.org/jira/browse/NUTCH-956 Project: Nutch Issue Type: Bug Components: indexer Affects Versions: 2.0 Reporter: Alexis I ran into a few

[jira] Updated: (NUTCH-956) soldindex issues

2011-01-13 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-956: - Attachment: solr.patch Here are the changes: - Avoid multiple values for id field. (NUTCH-819) - Allow multiple

[jira] Created: (NUTCH-955) Ivy configuration

2011-01-10 Thread Alexis (JIRA)
Ivy configuration Key: NUTCH-955 URL: https://issues.apache.org/jira/browse/NUTCH-955 Project: Nutch Issue Type: Improvement Components: build Affects Versions: 2.0 Reporter: Alexis As mentioned

[jira] Updated: (NUTCH-955) Ivy configuration

2011-01-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-955: - Attachment: ivy.patch In the patch, the required dependencies for MySQL and HBase are included in the Ivy

[jira] Created: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-01 Thread Alexis (JIRA)
Content-Length limit, URL filter and few minor issues Key: NUTCH-950 URL: https://issues.apache.org/jira/browse/NUTCH-950 Project: Nutch Issue Type: Bug Affects Versions: 2.0

[jira] Updated: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-01 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-950: - Attachment: nutch4.patch Content-Length limit, URL filter and few minor issues

[jira] Updated: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1

2010-12-18 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-899: - Attachment: httpContentLimit.patch We stick with the default gora schema for the MySQL backend, which says

[jira] Commented: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1

2010-12-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970336#action_12970336 ] Alexis commented on NUTCH-899: -- I ran into the exact same issue, with MySQL. The blob column