Thanks again. Another simple question: what config files are necessary to get the NutchBean working?

Kindly

//Marcus

On 8/9/07, Renaud Richardet <[EMAIL PROTECTED]> wrote:
>
> Marcus Herou wrote:
> > Hi.
> >
> > I might be total knucklehead but how do I use the NutchBean or LinkDbReader?
> > Are there any tutorial howto integrate this with apps?
> NutchBean is used in the main jsp of the webapp, use it as an example.
> LinkDbReader is used by NutchBean.
> > Do I need to learn
> > Hadoop and the MapReduce algorithm ?
> No, that should not be necessary
>
> HTH,
> Renaud
> > I don't find the
> > OpenSearchServlet.getServiceLocator.getNutchBean init process so easy to
> > comprehend. What files are necessary to get a ServiceLocator ? Just a
> > nutch-site.xml in classpath or all the files in WEB-INF/classes ? Please
> > give me a pointer where to look for integrating with nutch..
> >
> > Yes I have used GraphML before when I do my own crawling and will use that
> > this time again while using Nutch, but thanks for refreshing my mind!
> >
> > Let's say I create my own IndexReader what fields are available in the link
> > index and what dir should I point the IndexReader to for analyzing links ?
> > I guess this link says something about the structure...
> > http://wiki.apache.org/nutch/IndexStructure
> >
> > Kindly
> >
> > //Marcus
> >
> > On 8/8/07, Renaud Richardet <[EMAIL PROTECTED]> wrote:
> >
> >> hi Marcus,
> >>
> >>> Hi.
> >>>
> >>> I have now crawled all the sites I want and is about to create an undirected
> >>> unweighted graph of vertices and edges with Prefuse.
> >>>
> >>> So I have a question:
> >>>
> >>> How do I extract the in/out link info from Nutch
> >>>
> >> you could use NutchBean, or LinkDbReader, or a custom Lucene searcher...
> >>
> >> what will be the semantic for vertices and a links?
> >>
> >>> and on what format ?
> >>>
> >> Prefuse has a built-in GraphMLReader (and writer), so I would
> >> definitively go with that. Check the sample GraphML in prefuse/data.
> >>
> >> HTH,
> >> Renaud
> >>

--
Marcus Herou
Solution Architect & Core Java developer
Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com
