Sorry to just jump in.
We have a doc id associated with each document when we index. We could store 
that doc id in a MySQL table and later use it to query the Nutch database.
When parsing, capture whatever you need as part of the "metadata", then
index the metadata. The associated doc id is stored in MySQL.
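
A minimal sketch of that flow, assuming a hypothetical table doc_metadata(doc_id, meta_name, meta_value) and a JDBC connection that is already open. The class, method, table and column names are only illustrative; none of them come from Nutch.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class DocMetadataStore {
    // Hypothetical table: doc_metadata(doc_id, meta_name, meta_value), all VARCHAR.
    static final String INSERT_SQL =
        "INSERT INTO doc_metadata (doc_id, meta_name, meta_value) VALUES (?, ?, ?)";

    /** Flattens the metadata captured at parse time into (docId, name, value) rows. */
    static List<String[]> toRows(String docId, Map<String, String> metadata) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            rows.add(new String[] { docId, e.getKey(), e.getValue() });
        }
        return rows;
    }

    /** Writes all rows in one JDBC batch over an already-open connection. */
    static void save(Connection conn, List<String[]> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setString(3, row[2]);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

Sending the rows as one batch keeps the number of round trips to the database small, which should help with the performance concern raised in the original question.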

Does that give you any ideas?
Please do share your concerns. I am working on something similar, where we 
will eventually have to adopt a database.

Thanks



John Reidy <[EMAIL PROTECTED]> wrote:

I am looking at something similar.

I would guess the place to put it is the indexer. As I understand it, the 
parser runs for just about everything fetched, whereas the indexer is 
only run for pages you want to index.
I am also looking at having static objects (e.g. a connection) that are 
initialised when the plugin is loaded, ideally through the startup method.
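
The "initialise once, then reuse" idea could be sketched as below. SharedResource and IndexerDb are hypothetical names, the JDBC URL and credentials are placeholders, and how this would be wired into a plugin's startup is left open. Note also that this sketch creates the resource on first use rather than at load time.

```java
import java.util.concurrent.Callable;

/** Holds one shared instance that is created on first use and then reused. */
final class SharedResource<T> {
    private final Callable<T> factory;
    private T instance;

    SharedResource(Callable<T> factory) {
        this.factory = factory;
    }

    /** Returns the shared instance, running the factory on the first call only. */
    synchronized T get() {
        if (instance == null) {
            try {
                instance = factory.call();
            } catch (Exception e) {
                throw new IllegalStateException("could not create shared resource", e);
            }
        }
        return instance;
    }
}

class IndexerDb {
    // Placeholder URL and credentials (assumptions); nothing connects until get() is called.
    static final SharedResource<java.sql.Connection> CONNECTION =
        new SharedResource<>(() -> java.sql.DriverManager.getConnection(
            "jdbc:mysql://localhost/nutch_meta", "user", "password"));
}
```

Code in the indexer would then call IndexerDb.CONNECTION.get() instead of opening a new connection per document. A single connection shared between threads needs care, so a connection pool may be a better fit if the indexer runs multi-threaded.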

Regards

John

>Hey all,
>I have written a custom HTML parser and indexer.  I would like to save some
>information that I have gathered during the parse in a MySQL DB.  I imagine
>there could be some performance hit here (e.g. connecting to db).  What's
>the best place to add code to save this information - the parser or the
>indexer?
>
>-Mike
>--
>View this message in context: 
>http://www.nabble.com/Saving-Metadata-to-Mysql-t1389216.html#a3732992
>Sent from the Nutch - User forum at Nabble.com.
>
>  
>




  Sudhi Seshachala
  http://sudhilogs.blogspot.com/
   


                