Sorry to just jump in. A doc ID is associated with each document when we index. We could store that doc ID in a MySQL table and later use it to query the Nutch database. When parsing, capture whatever you need as part of the "metadata", then index the metadata. The associated doc ID is what gets stored in MySQL.
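
To make that a bit more concrete, here is a rough sketch of the idea in Python. It is not Nutch code and does not use any Nutch API. Python's built-in sqlite3 module stands in for MySQL so the example runs on its own, and the table name `doc_metadata` and the helper names `get_connection`, `save_metadata` and `load_metadata` are made up for illustration. It only shows the shape: metadata rows keyed by doc ID, written through one connection that is opened once and reused.

```python
import sqlite3

# Shared connection, opened on first use and reused afterwards.
# Assumption: sqlite3 (in memory) is a stand-in for a real MySQL connection.
_conn = None


def get_connection():
    """Return the shared connection, creating it and the table on first call."""
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(":memory:")
        # Hypothetical schema: one row per (doc_id, metadata name) pair.
        _conn.execute(
            "CREATE TABLE IF NOT EXISTS doc_metadata ("
            "doc_id INTEGER NOT NULL, "
            "name TEXT NOT NULL, "
            "value TEXT, "
            "PRIMARY KEY (doc_id, name))"
        )
    return _conn


def save_metadata(doc_id, metadata):
    """Store the metadata captured at parse time under the given doc ID."""
    conn = get_connection()
    # INSERT OR REPLACE is SQLite syntax; MySQL would need its own
    # equivalent (for example REPLACE INTO).
    conn.executemany(
        "INSERT OR REPLACE INTO doc_metadata (doc_id, name, value) "
        "VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in metadata.items()],
    )
    conn.commit()


def load_metadata(doc_id):
    """Look up everything stored for a doc ID; empty dict if nothing is stored."""
    rows = get_connection().execute(
        "SELECT name, value FROM doc_metadata WHERE doc_id = ?", (doc_id,)
    ).fetchall()
    return dict(rows)
```

Calling `save_metadata(doc_id, {...})` once per indexed document and `load_metadata(doc_id)` at query time would be the whole round trip. Reusing one connection is what keeps the per-document cost down to the insert itself.
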
Does that give you any ideas? Please do share your concerns. I am working on something similar, where we will eventually have to adopt a database.

Thanks

John Reidy <[EMAIL PROTECTED]> wrote:

I am looking at something similar. I would guess the place to put it is the indexer. As I understand it, the parser runs for just about everything fetched, whereas the indexer runs only for pages you want to index.

I am also looking at having static objects (e.g. a connection) that are initialised when the plugin is loaded, ideally through the startup method.

Regards
John

>Hey all,
>I have written a custom HTML parser and indexer. I would like to save some
>information that I have gathered during the parse in a MySQL DB. I imagine
>there could be some performance hit here (e.g. connecting to the DB). What's
>the best place to add code to save this information - the parser or the
>indexer?
>
>-Mike
>--
>View this message in context:
>http://www.nabble.com/Saving-Metadata-to-Mysql-t1389216.html#a3732992
>Sent from the Nutch - User forum at Nabble.com.

Sudhi Seshachala
http://sudhilogs.blogspot.com/
