[jira] Created: (NUTCH-435) Synonym-Editor that creates OWL for the ontology plugin

2007-01-26 Thread Urs Krebs (JIRA)
Synonym-Editor that creates OWL for the ontology plugin --- Key: NUTCH-435 URL: https://issues.apache.org/jira/browse/NUTCH-435 Project: Nutch Issue Type: New Feature Reporter:

[jira] Created: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty

2007-01-26 Thread Andrew Groh (JIRA)
Incorrect handling of relative paths when the embedded URL path is empty Key: NUTCH-436 URL: https://issues.apache.org/jira/browse/NUTCH-436 Project: Nutch Issue Type:

[jira] Updated: (NUTCH-435) Synonym-Editor that creates OWL for the ontology plugin

2007-01-26 Thread Urs Krebs (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Urs Krebs updated NUTCH-435: Attachment: SynonymEditor-0.9.zip Synonym-Editor that creates OWL for the ontology plugin

[jira] Updated: (NUTCH-436) Incorrect handling of relative paths when the embedded URL path is empty

2007-01-26 Thread Andrew Groh (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Groh updated NUTCH-436: -- Description: If you have a base URL of the form: http://a/b/c/d;p?q#f Embedded URL: ?y Correct
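For reference, RFC 3986 (section 5.4.1) resolves the query-only reference `?y` against the base `http://a/b/c/d;p?q` by keeping the base path and replacing only the query. The sketch below uses Python's standard `urljoin` to illustrate that expected behaviour; it is not Nutch's own URL handling, and it assumes Python 3.5 or later, where `urljoin` follows RFC 3986.

```python
from urllib.parse import urljoin

# Base and embedded URL taken from the issue description above.
base = "http://a/b/c/d;p?q#f"
embedded = "?y"

# A query-only reference keeps the base path (including its ";p"
# parameters), replaces the query, and drops the base fragment.
resolved = urljoin(base, embedded)
print(resolved)
```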

RE: parse-rss make them items as different pages

2007-01-26 Thread Gal Nitzan
Hi Kauu, The functionality you require doesn't exist in the current parse-rss plugin. I need the same functionality but it doesn't exist and I believe it's not a simple task. The functionality required basically is to create a page in a segment for each item and the URL to the crawldb. Since

record version mismatch occured

2007-01-26 Thread Gal Nitzan
Trying to mergesegs I get the following, any idea? A record version mismatch occured. Expecting v4, found v5 at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:147) at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1175) at

[jira] Assigned: (NUTCH-431) Move plugin specific properties out of nutch-site.xml and into specific conf files for plugins

2007-01-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned NUTCH-431: --- Assignee: Chris A. Mattmann Move plugin specific properties out of nutch-site.xml

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-01-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467887 ] Chris A. Mattmann commented on NUTCH-258: - Guys, From recent conversations on the mailing list where Doug

Re: record version mismatch occured

2007-01-26 Thread Sami Siren
Gal Nitzan wrote: Got it. I used latest trunk for a few hours and it seems that it changed the version of Crawldatum to ver. 5 :( yes, version is updated on write

Re: record version mismatch occured

2007-01-26 Thread Sami Siren
Gal Nitzan wrote: Got it. I used latest trunk for a few hours and it seems that it changed the version of Crawldatum to ver. 5 :( The earlier message was sent too early. One (or more) of your segments has data written with the newer version. If you haven't updated the crawldb then you just need to redo that (those)
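The failure mode discussed in this thread, a reader built for one record version meeting data written by a newer one, can be sketched as follows. This is a hypothetical illustration, not the actual `CrawlDatum` code: the record layout, the `VersionMismatch` class, and the `write_record`/`read_record` helpers are all assumptions made for the example.

```python
import io
import struct

CURRENT_VERSION = 4  # hypothetical reader version, mirroring "Expecting v4"


class VersionMismatch(Exception):
    """Raised when a record was written by a newer format version."""


def write_record(out, version, score):
    # The writer stamps every record with its own version byte,
    # followed by a 4-byte big-endian float payload.
    out.write(struct.pack(">bf", version, score))


def read_record(inp):
    # Read the version byte first and refuse records newer than
    # this reader understands.
    version, score = struct.unpack(">bf", inp.read(5))
    if version > CURRENT_VERSION:
        raise VersionMismatch(
            f"Expecting v{CURRENT_VERSION}, found v{version}"
        )
    return score
```

Under these assumptions, data written once by a newer build stays unreadable to the older build, which is why the affected segments have to be regenerated rather than patched.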

RE: record version mismatch occured

2007-01-26 Thread Gal Nitzan
Thanks Sami, By redo do you mean re-parse or re-fetch + re-parse? -Original Message- From: Sami Siren [mailto:[EMAIL PROTECTED] Sent: Friday, January 26, 2007 10:49 PM To: nutch-dev@lucene.apache.org Subject: Re: record version mismatch occured Gal Nitzan wrote: Got it. I used latest

[jira] Commented: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable

2007-01-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467912 ] Sami Siren commented on NUTCH-434: -- It's only half way if we get the Configuration into our subclass, there's no

Re: record version mismatch occured

2007-01-26 Thread Sami Siren
Gal Nitzan wrote: Thanks Sami, By redo do you mean re-parse or re-fetch + re-parse generate - fetch - parse -- Sami Siren

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-01-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467916 ] Sami Siren commented on NUTCH-258: -- I haven't noticed this being a problem for me, so no objections from here.

java.io.FileNotFoundException: / (Is a directory)

2007-01-26 Thread Gal Nitzan
Just installed latest from trunk. I ran mergesegs and I get the following error in all task log files (I use the default log4j.properties): log4j:ERROR setFile(null,true) call failed. java.io.FileNotFoundException: / (Is a directory) at java.io.FileOutputStream.openAppend(Native Method)

[jira] Commented: (NUTCH-434) Replace usage of ObjectWritable with something based on GenericWritable

2007-01-26 Thread Sami Siren (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467927 ] Sami Siren commented on NUTCH-434: -- I can see the light, overriding readFields is sufficient. Replace usage of

[jira] Commented: (NUTCH-258) Once Nutch logs a SEVERE log item, Nutch fails forevermore

2007-01-26 Thread Scott Ganyo (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467931 ] Scott Ganyo commented on NUTCH-258: --- Chris, I originally opened the issue... but unfortunately I can neither

Trunk version and NUTCH-251(Administration gui)

2007-01-26 Thread karthik085
I would like to use NUTCH-251 (administration GUI). How stable is it? How easy is it to set up and make it work with Nutch? Since it seems to work with the trunk version, how stable is the trunk version of Nutch? Is there a tentative schedule for the next Nutch release? Will it include the admin GUI? I

Re: parse-rss make them items as different pages

2007-01-26 Thread kauu
That's right. I think we should do something when Nutch fetches a page successfully: check whether it is an RSS feed, then create as many pages as there are items. I don't know whether it would work. On the other hand, we can do something in the segment, just like you say. I don't know that

Re: parse-rss make them items as different pages

2007-01-26 Thread kauu
Who can tell me where and how to build a Nutch document in nutch-0.8.1? For example, one HTML page is one document, but I want to split a document into several. On 1/27/07, kauu [EMAIL PROTECTED] wrote: that's the right thing. i think we should to do some thing when nutch fetch a page

Re: parse-rss make them items as different pages

2007-01-26 Thread sishen
On 1/26/07, Gal Nitzan [EMAIL PROTECTED] wrote: Hi Kauu, The functionality you require doesn't exist in the current parse-rss plugin. I need the same functionality but it doesn't exist and I believe it's not a simple task. The functionality required basically is to create a page in a segment

Re: parse-rss make them items as different pages

2007-01-26 Thread kauu
That's right, but in other words, I just need to index the exact information in a page. In reality, real-world pages contain a lot of spam, so I just want to index the description. On 1/27/07, sishen [EMAIL PROTECTED] wrote: On 1/26/07, Gal Nitzan [EMAIL PROTECTED] wrote: Hi Kauu,
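The idea running through the parse-rss thread above, treating each feed item as its own page and indexing only its description, can be sketched outside Nutch as follows. This is a minimal illustration assuming a plain RSS 2.0 feed without namespaces; `split_feed` and the dictionary keys are hypothetical names, and nothing here reflects the parse-rss plugin's actual API.

```python
import xml.etree.ElementTree as ET


def split_feed(rss_xml):
    """Return one page-like record per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    pages = []
    for item in root.iter("item"):
        pages.append({
            # The item's link would become the URL of the new page.
            "url": item.findtext("link", default=""),
            "title": item.findtext("title", default=""),
            # Only the description is kept as indexable text.
            "text": item.findtext("description", default=""),
        })
    return pages
```

Splitting the feed is the easy part; as noted in the thread, the harder work in Nutch is writing each item into a segment and adding its URL to the crawldb.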