Hi,

I have the first version of my Sitemap Visualization Tool working. It is still very buggy and not very helpful yet, because the graph is not filtered enough. It can be visited at http://server01.pool.ifis.uni-luebeck.de:8080/cscrawler/login.html, BUT PLEASE USE ONLY DEPTH 1 OR 2 (3 for small sites). The graph visualization is also buggy; you may have to click on a light blue node first to get the graph rebuilt correctly.

Anyway, my question: I'd like to show the user a top ten of words/topics from the index of a crawled site. So I need to get a list of words (filtered by a stop word list) out of the index. Can you tell me the easiest way to get a list of the words in the index, together with a count of how often each word occurs in the index? If you have a look at Luke, you'll see that feature, but unfiltered.

Thanks for your help.

Nils

P.S.: I'd like to include a picture like "powered by nutch". Is that OK, or whom do I have to ask? (Everything is just research, not commercial.)

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
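
To make the question more concrete, here is a minimal sketch of the filtering and ranking step only. It assumes the term counts have already been read from the index into a plain map; how to read them is exactly what I am asking about and is not shown. The names `TopTerms` and `topTerms`, the map, and the stop word set are all hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class TopTerms {

    // Hypothetical helper: returns up to `limit` entries with the highest
    // counts, skipping every term that appears in the stop word set.
    // Assumes the stop words are stored in lower case.
    static List<Map.Entry<String, Integer>> topTerms(
            Map<String, Integer> termCounts, Set<String> stopWords, int limit) {
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : termCounts.entrySet()) {
            String term = entry.getKey().toLowerCase(Locale.ROOT);
            if (!stopWords.contains(term)) {
                kept.add(entry);
            }
        }
        // Highest count first; equal counts are ordered alphabetically
        // so that the result is deterministic.
        kept.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(kept.subList(0, Math.min(limit, kept.size())));
    }
}
```

Calling `TopTerms.topTerms(counts, stopWords, 10)` would then give the top ten for display. Sorting the whole list is the simplest approach; for a very large index a bounded priority queue might be preferable, but I have not measured that.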
