[jira] [Commented] (NUTCH-1445) Add ElasticIndexerJob that indexes to elasticsearch

2012-08-06 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429058#comment-13429058 ] Julien Nioche commented on NUTCH-1445: -- Ferdy - just to reiterate what was said on a

[jira] [Resolved] (NUTCH-1159) Write JUnit tests for index-anchor

2012-08-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1159. - Resolution: Fixed Assignee: Lewis John McGibbney Committed @revision

[jira] [Commented] (NUTCH-1151) Index-anchor to add numInlinks count

2012-08-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429164#comment-13429164 ] Lewis John McGibbney commented on NUTCH-1151: - Hi Markus, I am happy for this

[jira] [Commented] (NUTCH-1151) Index-anchor to add numInlinks count

2012-08-06 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429165#comment-13429165 ] Lewis John McGibbney commented on NUTCH-1151: - I can also add to 2.1 unless

[jira] [Created] (NUTCH-1446) Port NUTCH-1444 to trunk (Indexing should not create temporary files)

2012-08-06 Thread Ferdy Galema (JIRA)
Ferdy Galema created NUTCH-1446: --- Summary: Port NUTCH-1444 to trunk (Indexing should not create temporary files) Key: NUTCH-1446 URL: https://issues.apache.org/jira/browse/NUTCH-1446 Project: Nutch

Re: Understanding mapping of field characteristics to index structure

2012-08-06 Thread Lewis John Mcgibbney
Mmmm... I think I opened a small can of worms here regarding consistency between schema.xml and schema-solr4.xml. There are discrepancies between some fields as to their structural characteristics. This is something which I think we should make consistent between schemas... no? An example would

RE: Understanding mapping of field characteristics to index structure

2012-08-06 Thread Markus Jelsma
Hi, Tokenization depends on whether an analyzer is used for the field (non-primitive types) and the tokenization depends on which tokenizer is defined. Tokenizing a hostname doesn't really make sense with the default available tokenizers but you can use a KeywordTokenizer with a WordDelimiterFilter to

Build failed in Jenkins: Nutch-trunk #1920

2012-08-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/1920/ -- Started by timer Building remotely on solaris1 in workspace https://builds.apache.org/job/Nutch-trunk/ws/ hudson.util.IOException2: remote file operation failed: