Hi,
I want to use Nutch as an environment to test my proposed algorithm
for web mining.
1- Where exactly does Nutch scoring take place? In which
packages or files?
Check the latest sources: there is a new Scoring API and a default
plugin-based implementation of OPIC.
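In case it helps to see the general idea behind OPIC before reading the plugin, here is a minimal sketch in plain Java. It is not the Nutch plugin code and does not use the Scoring API. The class name OpicSketch, its methods, and the handling of pages without outlinks are my own assumptions for illustration. Each page holds some "cash"; visiting a page moves its cash into its history and splits it equally among its outlinks, and importance is estimated from history plus cash.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy OPIC-style scoring. Illustrative only; NOT Nutch's plugin code.
class OpicSketch {
    private final Map<String, List<String>> outlinks; // page -> pages it links to
    private final Map<String, Double> cash = new HashMap<>();
    private final Map<String, Double> history = new HashMap<>();

    OpicSketch(Map<String, List<String>> outlinks) {
        this.outlinks = outlinks;
        // Assumption: one unit of cash is split equally over all known pages.
        double initial = 1.0 / outlinks.size();
        for (String page : outlinks.keySet()) {
            cash.put(page, initial);
            history.put(page, 0.0);
        }
    }

    // Visit a page: record its cash in the history, then hand it to its outlinks.
    void visit(String page) {
        double c = cash.getOrDefault(page, 0.0);
        history.merge(page, c, Double::sum);
        cash.put(page, 0.0);

        List<String> targets = outlinks.getOrDefault(page, Collections.emptyList());
        if (targets.isEmpty()) {
            // Assumption: a page without outlinks spreads its cash over all pages.
            double share = c / outlinks.size();
            for (String p : outlinks.keySet()) {
                cash.merge(p, share, Double::sum);
            }
            return;
        }
        double share = c / targets.size();
        for (String target : targets) {
            cash.merge(target, share, Double::sum);
        }
    }

    // Importance estimate: (history + cash) of the page over the global total.
    double importance(String page) {
        double total = 0.0;
        for (double h : history.values()) total += h;
        for (double c : cash.values()) total += c;
        double own = history.getOrDefault(page, 0.0) + cash.getOrDefault(page, 0.0);
        return own / total;
    }
}
```

With two pages that link to each other, visiting the first page once leaves it with one third of the total and the other page with two thirds; repeated visits in any fair order even this out.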
2- Can the LinkAnalysisTool be run at the intranet level? Some
documents mention that it can only be run at the whole-web
crawling level.
That question is not clear to me; however, you can easily hack the
code to only process the pages you are interested in.
There are other workarounds as well, e.g. just fetch intranet
pages, if that works for you.
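As a rough illustration of "only process pages you are interested in", a host check like the following could be put in front of whatever code does the processing. This is a plain Java sketch, not a Nutch URL filter plugin; the class IntranetFilterSketch, the method isIntranet, and the domain example.com are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: decides whether a URL belongs to a given intranet domain.
class IntranetFilterSketch {
    static boolean isIntranet(String url, String domain) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false; // no host part, e.g. a relative or malformed URL
            }
            String h = host.toLowerCase();
            String d = domain.toLowerCase();
            // Accept the domain itself and any of its subdomains.
            return h.equals(d) || h.endsWith("." + d);
        } catch (URISyntaxException e) {
            return false; // unparsable URLs are skipped
        }
    }
}
```

Matching on "." + domain rather than a bare suffix avoids accepting hosts such as example.com.evil.org.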
3- What technologies and concepts must I be familiar with to
get into Nutch development?
You should be able to write MapReduce jobs and understand the Hadoop
io package if you want to do some custom analyses.
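To give a feel for the shape of such a job, here is a sketch of the map and reduce steps for counting inlinks per page. It is written in plain Java and deliberately does not use the Hadoop classes, so the names InlinkCountSketch, map, reduce and run are assumptions for illustration only. A real job would express the same two steps through Hadoop's own interfaces and Writable types.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Map/reduce shape in plain Java, without Hadoop. Illustrative only.
class InlinkCountSketch {
    // Map step: for one page, emit (target, 1) for every outlink.
    static List<Map.Entry<String, Integer>> map(String page, List<String> outlinks) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String target : outlinks) {
            out.add(new AbstractMap.SimpleEntry<>(target, 1));
        }
        return out;
    }

    // Reduce step: sum all values emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Tiny driver: runs map over every page, groups by key, then reduces.
    static Map<String, Integer> run(Map<String, List<String>> graph) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : graph.entrySet()) {
            for (Map.Entry<String, Integer> pair : map(e.getKey(), e.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }
}
```

The grouping done by the driver here is what the framework does for you between the map and reduce phases.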
HTH
Stefan