HUYLEBROECK Jeremy RD-ILAB-SSF wrote:
> I am resending this message, as it apparently didn't go through.
> (I keep mixing up my email addresses on the mailing list...)
>
> -----Original Message-----
> Sent: Friday, February 02, 2007 10:29 AM
>
> Using Nutch 0.8, we modified the code from the fetching/parsing steps 
> onward.
> We have a different implementation of the Parse object and OutputFormat, 
> including an additional list of ParseData objects saved in an additional 
> subfolder in the DFS.
> We changed the indexing step a lot too, so we don't use the Nutch code there.
>   
Is your implementation similar to what we started at 
https://issues.apache.org/jira/browse/NUTCH-443? If you think some of 
your changes could be integrated, please post a patch there.

Thanks for sharing,
Renaud
>
> -----Original Message-----
> From: Doug Cutting [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 02, 2007 10:19 AM
> To: nutch-dev@lucene.apache.org
> Subject: Re: RSS-fecter and index individul-how can i realize this function
>
> Gal Nitzan wrote:
>   
>> IMHO the data that is needed i.e. the data that will be fetched in the next 
>> fetch process is already available in the <item> element. Each <item> 
>> element represents one web resource. And there is no reason to go to the 
>> server and re-fetch that resource.
>>     
>
> Perhaps ProtocolOutput should change.  The method:
>
>    Content getContent();
>
> could be deprecated and replaced with:
>
>    Content[] getContents();
>
> This would require changes to the indexing pipeline.  I can't think of
> any severe complications, but I haven't looked closely.
>
> Could something like that work?
>
> Doug
>
>
>   

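To make the getContents() suggestion quoted above concrete, here is a minimal
sketch of what a multi-content ProtocolOutput could look like. It is only an
illustration: the Content and ProtocolOutput classes below are simplified,
hypothetical stand-ins and not the actual Nutch classes, and both the
constructor and the choice to have the deprecated getContent() return the
first entry are assumptions.

```java
/** Hypothetical, simplified stand-in for Nutch's Content (URL plus raw bytes). */
final class Content {
    private final String url;
    private final byte[] data;

    Content(String url, byte[] data) {
        this.url = url;
        this.data = data.clone();
    }

    String getUrl() { return url; }

    byte[] getData() { return data.clone(); }
}

/** Hypothetical stand-in for ProtocolOutput that can carry several Content objects. */
final class ProtocolOutput {
    private final Content[] contents;

    ProtocolOutput(Content... contents) {
        if (contents == null || contents.length == 0) {
            throw new IllegalArgumentException("at least one Content is required");
        }
        this.contents = contents.clone();
    }

    /**
     * Old single-result accessor, kept so existing callers still compile.
     * Returning the first entry is an assumption of this sketch.
     */
    @Deprecated
    Content getContent() {
        return contents[0];
    }

    /** New accessor: one Content per resource, for example one per RSS item. */
    Content[] getContents() {
        return contents.clone();
    }
}
```

With this shape, a feed fetch could return one Content per <item>, and the
downstream steps would iterate over getContents() instead of assuming a
single document per fetched URL.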

-- 
Renaud Richardet                                      +1 617 230 9112
my email is my first name at apache.org      http://www.oslutions.com


