Re: RSS-fecter and index individul-how can i realize this function

2009-01-05 Thread Doğacan Güney
On Mon, Jan 5, 2009 at 7:00 AM, Vlad Cananau vlad...@gmail.com wrote: Hello I'm trying to make RSSParser do something similar to FeedParser (which doesn't work quite right) - that is, instead of indexing the whole contents Why doesn't FeedParser work? Let's fix whatever is broken in it :D

Re: RSS-fecter and index individul-how can i realize this function

2009-01-05 Thread Vlad Cananau
On Mon, Jan 5, 2009 at 12:32 PM, Doğacan Güney doga...@gmail.com wrote: On Mon, Jan 5, 2009 at 7:00 AM, Vlad Cananau vlad...@gmail.com wrote: Hello I'm trying to make RSSParser do something similar to FeedParser (which doesn't work quite right) - that is, instead of indexing the whole

Re: RSS-fecter and index individul-how can i realize this function

2009-01-05 Thread Vlad Cananau
Doğacan Güney wrote: On Mon, Jan 5, 2009 at 7:00 AM, Vlad Cananau vlad...@gmail.com wrote: Hello I'm trying to make RSSParser do something similar to FeedParser (which doesn't work quite right) - that is, instead of indexing the whole contents Why doesn't FeedParser work? Let's fix

Re: RSS-fecter and index individul-how can i realize this function

2009-01-04 Thread Vlad Cananau
Hello, I'm trying to make RSSParser do something similar to FeedParser (which doesn't work quite right) - that is, instead of indexing the whole contents of the feed, I want it to show individual items, with their respective title and a proper link to the article. I realize that I could index 1

Re: RSS-fecter and index individul-how can i realize this function

2008-12-03 Thread mirkes
-fecter-and-index-individul-how-can-i-realize-this-function-tp8722009p20815016.html Sent from the Nutch - Dev mailing list archive at Nabble.com.

RE: RSS-fecter and index individul-how can i realize this function

2007-02-08 Thread Alan Tanaman
2. It sounds like a pretty fundamental API shift in Nutch, to support a single type of content, RSS. Even if there are more content types that follow this model, as Doug and Renaud both pointed out, there aren't a multitude of them (perhaps archive files, but can you think of any others)?

Re: RSS-fecter and index individul-how can i realize this function

2007-02-08 Thread Chris Mattmann
Hi Doug, Okay, I see your points. It seems like this would be really useful for some current folks, and for Nutch going forward. I see that there has been some initial work today and preparing patches. I'd be happy to shepherd this into the sources. I will begin reviewing what's required, and

FW: RSS-fecter and index individul-how can i realize this function

2007-02-08 Thread HUYLEBROECK Jeremy RD-ILAB-SSF
[mailto:[EMAIL PROTECTED] Sent: Friday, February 02, 2007 10:19 AM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Warning: your correspondent is still writing to you at your old @orange-ft.com address, which will be deactivated in early

Re: FW: RSS-fecter and index individul-how can i realize this function

2007-02-08 Thread Renaud Richardet
02, 2007 10:19 AM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Warning: your correspondent is still writing to you at your old @orange-ft.com address, which will be deactivated in early April. Please ask them to update

Re: RSS-fecter and index individul-how can i realize this function

2007-02-08 Thread sdeck
. Doug -- View this message in context: http://www.nabble.com/RSS-fecter-and-index-individul-how-can-i-realize-this-function-tf3146271.html#a8876127 Sent from the Nutch - Dev mailing list archive at Nabble.com.

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Doug Cutting
Renaud Richardet wrote: I see. I was thinking that I could index the feed items without having to fetch them individually. Okay, so if Parser#parse returned a Map<String,Parse>, then the URL for each parse should be that of its link, since you don't want to fetch that separately. Right? So
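The idea in this message, one parse per feed item keyed by the URL of the item's link, could be sketched roughly as follows. This is only an illustration under stated assumptions: `FeedItemMapSketch`, `ItemParse` and `parseFeed` are hypothetical names, not part of the Nutch API, and the feed is read with the JDK's built-in DOM parser rather than any Nutch plugin.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FeedItemMapSketch {

    /** Hypothetical stand-in for a parse result; NOT Nutch's Parse interface. */
    public static final class ItemParse {
        public final String title;
        public final String text;

        public ItemParse(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /** Returns one ItemParse per RSS item, keyed by the item's link URL. */
    public static Map<String, ItemParse> parseFeed(String rssXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
        // LinkedHashMap keeps the items in feed order.
        Map<String, ItemParse> parses = new LinkedHashMap<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            String link = childText(item, "link");
            if (link.isEmpty()) {
                continue; // nothing to use as a key, so skip this item
            }
            parses.put(link, new ItemParse(childText(item, "title"),
                                           childText(item, "description")));
        }
        return parses;
    }

    /** Text of the first descendant element with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String feed = "<rss><channel><title>Feed</title>"
                + "<item><title>First</title><link>http://example.com/1</link>"
                + "<description>One</description></item>"
                + "<item><title>Second</title><link>http://example.com/2</link>"
                + "<description>Two</description></item>"
                + "</channel></rss>";
        for (Map.Entry<String, ItemParse> e : parseFeed(feed).entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue().title);
        }
    }
}
```

Keying on the link means each item can be stored under the URL a searcher would click, without fetching that URL separately, which is the point being discussed here.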

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Chris Mattmann
Guys, Sorry to be so thick-headed, but could someone explain to me in really simple language what this change is requesting that is different from the current Nutch API? I still don't get it, sorry... Cheers, Chris On 2/7/07 9:58 AM, Doug Cutting [EMAIL PROTECTED] wrote: Renaud Richardet

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Renaud Richardet
Doug Cutting wrote: Renaud Richardet wrote: I see. I was thinking that I could index the feed items without having to fetch them individually. Okay, so if Parser#parse returned a Map<String,Parse>, then the URL for each parse should be that of its link, since you don't want to fetch that

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Doug Cutting
Chris Mattmann wrote: Sorry to be so thick-headed, but could someone explain to me in really simple language what this change is requesting that is different from the current Nutch API? I still don't get it, sorry... A Content would no longer generate a single Parse. Instead, a Content

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Sami Siren
Also true. On the other hand, Nutch provides 98% of an RSS search engine. It'd be a shame to have to re-invent everything else and it would be great if Nutch could evolve to support RSS well. Could image search also benefit from this? One could generate a Parse for each image on a

Re: RSS-fecter and index individul-how can i realize this function

2007-02-07 Thread Doğacan Güney
Renaud Richardet wrote: Doug Cutting wrote: Renaud Richardet wrote: I see. I was thinking that I could index the feed items without having to fetch them individually. Okay, so if Parser#parse returned a Map<String,Parse>, then the URL for each parse should be that of its link, since you don't

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Doğacan Güney
Hi, Doug Cutting wrote: Doğacan Güney wrote: I think it would make much more sense to change parse plugins to take content and return Parse[] instead of Parse. You're right. That does make more sense. OK, then should I go forward with this and implement something? This should be pretty

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Gal Nitzan
[EMAIL PROTECTED] To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi, Doug Cutting wrote: Doğacan Güney wrote: I think it would make much more sense to change parse plugins to take content and return Parse[] instead of Parse

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Doug Cutting
Doğacan Güney wrote: OK, then should I go forward with this and implement something? This should be pretty easy, though I am not sure what to give as keys to a Parse[]. I mean, when getParse returned a single Parse, ParseSegment output them as <url, Parse>. But, if getParse returns an array,

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Chris Mattmann
Hi Doug, Since the target of the link must still be indexed separately from the item itself, how much use is all this? If the RSS document is considered a single page that changes frequently, and item's links are considered ordinary outlinks, isn't much the same effect achieved? IMHO, yes.

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Renaud Richardet
Hi Chris, Doug, Chris Mattmann wrote: Hi Doug, Since the target of the link must still be indexed separately from the item itself, how much use is all this? If the RSS document is considered a single page that changes frequently, and item's links are considered ordinary outlinks, isn't

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Doug Cutting
Renaud Richardet wrote: The use case is that you index RSS feeds, but your users can search each feed entry as a single document. Does it make sense? But each feed item also contains a link whose content will be indexed and that's generally a superset of the item. So should there be two

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Renaud Richardet
Doug Cutting wrote: Renaud Richardet wrote: The use case is that you index RSS feeds, but your users can search each feed entry as a single document. Does it make sense? But each feed item also contains a link whose content will be indexed and that's generally a superset of the item.

Re: RSS-fecter and index individul-how can i realize this function

2007-02-06 Thread Doğacan Güney
Renaud Richardet wrote: Doug Cutting wrote: Renaud Richardet wrote: The use case is that you index RSS feeds, but your users can search each feed entry as a single document. Does it make sense? But each feed item also contains a link whose content will be indexed and that's generally a

Re: RSS-fecter and index individul-how can i realize this function

2007-02-05 Thread Doğacan Güney
Doug Cutting wrote: Gal Nitzan wrote: IMHO the data that is needed i.e. the data that will be fetched in the next fetch process is already available in the item element. Each item element represents one web resource. And there is no reason to go to the server and re-fetch that resource.

Re: RSS-fecter and index individul-how can i realize this function

2007-02-05 Thread Doug Cutting
Doğacan Güney wrote: I think it would make much more sense to change parse plugins to take content and return Parse[] instead of Parse. You're right. That does make more sense. Doug
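The change agreed on in this message, parse plugins that take a content and return Parse[] instead of a single Parse, might look something like the sketch below. Every type here (`Content`, `Parse`, `SingleParser`, `MultiParser`, `LineParser`, `adapt`) is a simplified, hypothetical stand-in invented for illustration; none of them is the actual Nutch interface, and the toy "title|text" line format is an assumption made only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiParseSketch {

    /** Hypothetical stand-in for fetched content; NOT Nutch's Content class. */
    public static final class Content {
        public final String url;
        public final String body;

        public Content(String url, String body) {
            this.url = url;
            this.body = body;
        }
    }

    /** Hypothetical stand-in for one parse result; NOT Nutch's Parse interface. */
    public static final class Parse {
        public final String title;
        public final String text;

        public Parse(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /** Old shape: one Content yields exactly one Parse. */
    public interface SingleParser {
        Parse getParse(Content content);
    }

    /** Proposed shape: one Content may yield several Parse objects. */
    public interface MultiParser {
        Parse[] getParse(Content content);
    }

    /** Toy parser: every line of the form "title|text" becomes its own Parse. */
    public static final class LineParser implements MultiParser {
        @Override
        public Parse[] getParse(Content content) {
            List<Parse> out = new ArrayList<>();
            for (String line : content.body.split("\n")) {
                String[] parts = line.split("\\|", 2);
                if (parts.length == 2) {
                    out.add(new Parse(parts[0].trim(), parts[1].trim()));
                }
            }
            return out.toArray(new Parse[0]);
        }
    }

    /** Wraps an old-style parser so it fits the array-returning shape. */
    public static MultiParser adapt(SingleParser parser) {
        return content -> new Parse[] { parser.getParse(content) };
    }

    public static void main(String[] args) {
        Content feed = new Content("http://example.com/feed",
                "First|one\nSecond|two");
        for (Parse p : new LineParser().getParse(feed)) {
            System.out.println(p.title + " -> " + p.text);
        }
    }
}
```

The `adapt` helper hints at why such a change could stay manageable: existing single-document parsers can be wrapped to return a one-element array, so only parsers for multi-item formats such as RSS would need new logic.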

Re: RSS-fecter and index individul-how can i realize this function

2007-02-04 Thread kauu
: Wednesday, January 31, 2007 8:44 AM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi there, With the explanation that you give below, it seems like parse-rss as it exists would address what you are trying to do. parse-rss parses

Re: RSS-fecter and index individul-how can i realize this function

2007-02-02 Thread Renaud Richardet
[mailto:[EMAIL PROTECTED] Sent: Thursday, February 01, 2007 7:01 PM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi Gal, et al., I'd like to be explicit when we talk about what the issue with the RSS parsing plugin is here; I think

Re: RSS-fecter and index individul-how can i realize this function

2007-02-02 Thread Doug Cutting
Gal Nitzan wrote: IMHO the data that is needed i.e. the data that will be fetched in the next fetch process is already available in the item element. Each item element represents one web resource. And there is no reason to go to the server and re-fetch that resource. Perhaps ProtocolOutput

Re: RSS-fecter and index individul-how can i realize this function

2007-02-01 Thread Chris Mattmann
@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi there, With the explanation that you give below, it seems like parse-rss as it exists would address what you are trying to do. parse-rss parses an RSS channel as a set of items, and indexes

RE: RSS-fecter and index individul-how can i realize this function

2007-02-01 Thread Gal Nitzan
Mattmann [mailto:[EMAIL PROTECTED] Sent: Thursday, February 01, 2007 7:01 PM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi Gal, et al., I'd like to be explicit when we talk about what the issue with the RSS parsing plugin is here; I

Re: RSS-fecter and index individul-how can i realize this function

2007-02-01 Thread kauu
AM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi there, With the explanation that you give below, it seems like parse-rss as it exists would address what you are trying to do. parse-rss parses an RSS channel as a set of items

Re: RSS-fecter and index individul-how can i realize this function

2007-01-31 Thread kauu
Hi, thanks anyway, but I don't think I explained it clearly enough. What I want is for Nutch to fetch only the RSS seeds, to a depth of 1. So Nutch should fetch just some XML pages. I don't want to fetch the pages behind the items' outlinks, because there is too much spam in those pages. So I just need to parse the rss

RE: RSS-fecter and index individul-how can i realize this function

2007-01-31 Thread Gal Nitzan
] Sent: Wednesday, January 31, 2007 8:44 AM To: nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi there, With the explanation that you give below, it seems like parse-rss as it exists would address what you are trying to do. parse-rss parses

RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread kauu
Hi folks: What I want to do is separate an RSS file into several pages, just as has been discussed before. I want to fetch an RSS page and index it as different documents in the index, so the searcher can find each item's info as an individual hit. My idea is to create a

Re: RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread Chris Mattmann
Hi there, I could most likely be of assistance, if you gave me some more information. For instance: I'm wondering if the use case you describe below is already supported by the current RSS parse plugin? The current RSS parser, parse-rss, does in fact index individual items that are pointed to

Re: RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread kauu
Thanks for your reply. Maybe I didn't explain clearly. I want to index each item as an individual page. Then, when I search for something, for example "nutch-open source", Nutch returns a hit which contains title: nutch-open source description: nutch nutch nutch nutch nutch url:

Re: RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread Chris Mattmann
: RSS-fecter and index individul-how can i realize this function Hi there, I could most likely be of assistance, if you gave me some more information. For instance: I'm wondering if the use case you describe below is already supported by the current RSS parse plugin? The current RSS

Re: RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread pdecrem
- From: Chris Mattmann [EMAIL PROTECTED] Date: Tue, 30 Jan 2007 19:34:49 To:nutch-dev@lucene.apache.org Subject: Re: RSS-fecter and index individul-how can i realize this function Hi there, On 1/30/07 7:00 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Chris, I saw your name associated

Re: RSS-fecter and index individul-how can i realize this function

2007-01-30 Thread Chris Mattmann
Hi there, With the explanation that you give below, it seems like parse-rss as it exists would address what you are trying to do. parse-rss parses an RSS channel as a set of items, and indexes overall metadata about the RSS file, including parse text, and index data, but it also adds each item