[jira] Created: (NUTCH-743) Site search powered by Lucene/Solr

2009-06-23 Thread Sami Siren (JIRA)
Site search powered by Lucene/Solr
--

 Key: NUTCH-743
 URL: https://issues.apache.org/jira/browse/NUTCH-743
 Project: Nutch
  Issue Type: New Feature
  Components: documentation
Reporter: Sami Siren
Assignee: Sami Siren
Priority: Minor


Replace the current Nutch site search with the Lucene/Solr-powered search 
hosted by Lucid Imagination (http://www.lucidimagination.com/search). It allows 
one to search all Nutch content from a single place, including web, wiki, JIRA 
and mail archives; content from other parts of the Lucene ecosystem is also 
available. Lucid has a fault-tolerant setup with replication and failover, as 
well as monitoring services, in place. 

A preview of the site with the new search enabled is available at 
http://people.apache.org/~siren/site/


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (NUTCH-743) Site search powered by Lucene/Solr

2009-06-23 Thread Sami Siren (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated NUTCH-743:
-

Attachment: NUTCH-743.patch

If there are no objections I will commit this within a week or so.




[jira] Commented: (NUTCH-743) Site search powered by Lucene/Solr

2009-06-23 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723176#action_12723176
 ] 

Andrzej Bialecki  commented on NUTCH-743:
-

+1, based on the outcome of a thorough discussion of pros/cons of the same 
subject on the Lucene lists.




Per-host fetch-interval

2009-06-23 Thread Sandeep Tata
Hi,

I was wondering what would be the best way to configure per-host
re-crawl intervals. The default db.fetch.interval applies to all URLs,
but I'd like some hosts to be recrawled more frequently. Is there
a JIRA ticket open on this? I haven't been able to find one.

Sandeep


[jira] Commented: (NUTCH-729) NPE in FieldIndexer when BasicFields url doesn't exist

2009-06-23 Thread Tadesse Sefer (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723286#action_12723286
 ] 

Tadesse Sefer commented on NUTCH-729:
-

Where do you change the logging to use a url key?

 NPE in FieldIndexer when BasicFields url doesn't exist
 --

 Key: NUTCH-729
 URL: https://issues.apache.org/jira/browse/NUTCH-729
 Project: Nutch
  Issue Type: Bug
  Components: indexer
Affects Versions: 0.9.0, 1.0.0
 Environment: All
Reporter: Dennis Kubes
Assignee: Dennis Kubes
 Fix For: 1.1

 Attachments: NUTCH-729-1-20090235.patch


 There is a NullPointerException during a logging call in FieldIndexer when 
 there isn't a url for a document.  Documents shouldn't be without urls but 
 since the FieldIndexer doesn't validate fields it is possible for it to 
 occur.  Most often this happens when BasicFields is run with the wrong 
 segments directory and doesn't complain.  It could also occur if using the 
 FieldIndexer to index things other than basic fields.
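
 For illustration only, here is a minimal sketch of the kind of guard the 
 description implies. It is not the attached patch and does not use the real 
 FieldIndexer API. The class UrlLogGuard, the method describeUrl and the 
 Map-based document are all hypothetical names, and it assumes the failure 
 comes from dereferencing a missing url value while building the log message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: shows a null-safe way to obtain
// a url value for logging when a document may lack the "url" field.
class UrlLogGuard {

    // Returns the document's url, or a placeholder when the field map
    // or the url entry is missing, so callers never dereference null.
    static String describeUrl(Map<String, String> fields) {
        String url = (fields == null) ? null : fields.get("url");
        return (url == null || url.isEmpty()) ? "<no url>" : url;
    }

    public static void main(String[] args) {
        // A document without a url field (assumed shape, for illustration).
        Map<String, String> doc = new HashMap<String, String>();
        doc.put("title", "example");
        System.out.println("Indexing " + describeUrl(doc));
    }
}
```

 A guard like this only keeps the logging call from failing; whether documents 
 without a url should be skipped or rejected is a separate decision.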




Build failed in Hudson: Nutch-trunk #854

2009-06-23 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/854/

--
[...truncated 4676 lines...]

deploy:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-regex
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-regex

copy-generated-lib:
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-regex

init:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/test

init-plugin:

deps-jar:

compile:
 [echo] Compiling plugin: urlfilter-suffix
[javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes
[javac] Note: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/src/plugin/urlfilter-suffix/src/java/org/apache/nutch/urlfilter/suffix/SuffixURLFilter.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

jar:
  [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/urlfilter-suffix.jar

deps-test:

deploy:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix

copy-generated-lib:
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix

init:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/test

init-plugin:

deps-jar:

compile:
 [echo] Compiling plugin: urlfilter-validator
[javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes

jar:
  [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/urlfilter-validator.jar

deps-test:

deploy:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator

copy-generated-lib:
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator

init:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/test

init-plugin:

deps-jar:

compile:
 [echo] Compiling plugin: urlnormalizer-basic
[javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes

jar:
  [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/urlnormalizer-basic.jar

deps-test:

deploy:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic

copy-generated-lib:
 [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic

init:
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes
[mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/test

init-plugin:

deps-jar:

compile:
 [echo] Compiling plugin: urlnormalizer-pass
[javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes

jar:
  [jar] Building jar: