Hi,
I want to use Nutch as an environment to test my proposed algorithm for web mining.

1- Where exactly does Nutch scoring take place? In which packages or files?
Check the latest sources: there is a new Scoring API and a default plugin-based implementation of OPIC.
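
To give a feel for what an OPIC-style score computes, here is a minimal standalone sketch. It is not the Nutch plugin code: the class name OpicSketch and all of its methods are hypothetical, and it assumes a small closed in-memory link graph in which every link target is also a key of the map. Each page holds some "cash"; visiting a page moves its cash into its history and splits it evenly among its outlinks.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the OPIC idea; not taken from the Nutch sources.
final class OpicSketch {
    private final Map<String, List<String>> outlinks;
    final Map<String, Double> cash = new HashMap<>();
    final Map<String, Double> history = new HashMap<>();

    // Assumes a closed graph: every link target also appears as a key.
    OpicSketch(Map<String, List<String>> outlinks) {
        this.outlinks = outlinks;
        double initial = 1.0 / outlinks.size();
        for (String page : outlinks.keySet()) {
            cash.put(page, initial);   // every page starts with an equal share
            history.put(page, 0.0);
        }
    }

    // "Fetch" one page: record its cash in the history, then hand it to its outlinks.
    void visit(String page) {
        double c = cash.get(page);
        history.merge(page, c, Double::sum);
        cash.put(page, 0.0);
        List<String> targets = outlinks.get(page);
        if (targets.isEmpty()) {
            return; // simplification: cash of a page without outlinks is dropped
        }
        double share = c / targets.size();
        for (String target : targets) {
            cash.merge(target, share, Double::sum);
        }
    }

    // Importance estimate: this page's share of all cash seen so far.
    double importance(String page) {
        double total = 0.0;
        for (String p : outlinks.keySet()) {
            total += history.get(p) + cash.get(p);
        }
        return (history.get(page) + cash.get(page)) / total;
    }
}
```

The order in which pages are visited is up to the crawler; the estimate is refined as pages are visited again and again.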

2- Can the LinkAnalysisTool be run at the intranet level? Some documents mention that it can only be run at the whole-web crawling level.
The question is not entirely clear to me, but you can easily hack the code to process only the pages you are interested in. There are other workarounds as well, e.g. fetching only intranet pages, if that works for you.
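
As a rough illustration of restricting processing to the pages you care about, the sketch below keeps only URLs whose host belongs to one domain. It is plain Java and does not use any Nutch API; the class IntranetFilterSketch, its method names, and the example domain are assumptions made up for this example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; not part of Nutch.
final class IntranetFilterSketch {

    // True if the URL's host is the given domain or one of its subdomains.
    static boolean isIntranet(String url, String domain) {
        try {
            String host = URI.create(url).getHost();
            if (host == null) {
                return false; // relative or opaque URL: no host to compare
            }
            host = host.toLowerCase(Locale.ROOT);
            return host.equals(domain) || host.endsWith("." + domain);
        } catch (IllegalArgumentException e) {
            return false; // not parseable as a URI, so skip it
        }
    }

    // Keep only the URLs that belong to the intranet domain.
    static List<String> keepIntranet(List<String> urls, String domain) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            if (isIntranet(url, domain)) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

A check like this could sit in front of whatever code walks the pages, so that anything outside the chosen domain is skipped.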

3- What technologies and concepts must I be familiar with to get into Nutch development?

You should be able to write MapReduce jobs and understand the Hadoop io package if you want to do custom analysis.
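
To show the shape of such a job without pulling in Hadoop, here is a small in-memory sketch that counts inlinks per page. Everything in it is hypothetical (the class InlinkCountSketch, the Count type, the run method); it only mimics the pattern. The Count type has write and readFields methods because Hadoop's Writable interface serializes values through java.io.DataOutput and java.io.DataInput in the same way; a real job would implement the Hadoop interfaces instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of a map/reduce job; no Hadoop classes are used.
final class InlinkCountSketch {

    // Value type with the same write/readFields shape as a Hadoop Writable.
    static final class Count {
        int value;
        Count() {}
        Count(int value) { this.value = value; }
        void write(DataOutput out) throws IOException { out.writeInt(value); }
        void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    // Map step: emit (target, 1) for every outlink of one page.
    static void map(List<String> outlinks, Map<String, List<Count>> shuffle) {
        for (String target : outlinks) {
            shuffle.computeIfAbsent(target, k -> new ArrayList<>()).add(new Count(1));
        }
    }

    // Reduce step: sum all counts collected for one target.
    static Count reduce(List<Count> counts) {
        int sum = 0;
        for (Count c : counts) {
            sum += c.value;
        }
        return new Count(sum);
    }

    // Drive both steps over a small graph of page -> outlinks.
    static Map<String, Integer> run(Map<String, List<String>> graph) {
        Map<String, List<Count>> shuffle = new TreeMap<>();
        for (List<String> outlinks : graph.values()) {
            map(outlinks, shuffle);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Count>> e : shuffle.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()).value);
        }
        return result;
    }

    // Serialize a Count to bytes and read it back, as a framework would between steps.
    static Count roundTrip(Count c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            c.write(new DataOutputStream(bytes));
            Count copy = new Count();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Pages that nobody links to do not appear in the result, since the map step never emits a key for them.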

HTH
Stefan



_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
