On Tue, Jul 22, 2003 at 09:12:04AM -0700, Ian Clarke wrote:
> Thanks for the info.  I would be very curious, given your understanding
> of what is available in the ML field, how you think it might compare to
> our current "home grown" solution?  Are there specific advantages you
> can identify?

Let me preface this whole reply with a caveat: only testing and
measuring can tell us whether it will really work better; ML is complex
and varied.  That said, SVMs usually compare favorably to Kohonen maps
(SOMs) in terms of accuracy.  Furthermore, SVMs achieve the highest
accuracy of any learning algorithm in many test cases; on the libsvm
web page I linked earlier, you can see that libsvm was used to win two
different ML contests.  Also, SVMs have outperformed neural nets in OCR
and face recognition tasks.  I would also not be surprised if an SVM
learned more rapidly than Kohonen maps, in the sense of needing less
sample data before it figures out what we'd like it to.  But we will
have to wait for numbers to find out whether libsvm is enough of an
improvement to justify its weight.


> Certainly, one of the areas we have been discussing and where there is 
> almost certainly room for improvement in our current solution is how 
> quickly the current algorithm "forgets" past data.  At present we are 
> really operating on the basis of guess-work in this regard, and it would 
> definitely be interesting if there was some algorithmic way to determine 
> how forgetful our continuous averages should be.

For starters, I think it's simple enough to do something like this:

  Each time a new training sample x arrives:
    Add x to the training data set
    If sizeof(training data) > 1000, then remove a random point from the set

Then do continuous retraining in a separate thread (this is explained
in more detail in my other post).
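
To make the bookkeeping concrete, here is a minimal sketch of that
bounded training set in Python.  The class name, the `capacity`
parameter and the `snapshot` helper are my own placeholders, not
anything in the existing code, and the retraining thread is left out
entirely; this only illustrates the add-then-evict step.

```python
import random


class BoundedSampleSet:
    """Keeps at most `capacity` training samples, evicting one at random."""

    def __init__(self, capacity=1000, rng=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.samples = []
        # An injectable RNG makes the eviction reproducible in tests.
        self._rng = rng if rng is not None else random.Random()

    def add(self, x):
        """Add a sample; if the set is now over capacity, drop a random one."""
        self.samples.append(x)
        if len(self.samples) > self.capacity:
            # Pick any point (possibly the new one), overwrite it with the
            # last element and pop the tail, so removal is O(1).
            i = self._rng.randrange(len(self.samples))
            self.samples[i] = self.samples[-1]
            self.samples.pop()

    def snapshot(self):
        """Return a copy that a retraining thread could work on safely."""
        return list(self.samples)
```

One consequence of random eviction: once the set is full, each stored
sample has a 1-in-1001 chance of being dropped per new arrival, so old
data fades out geometrically, with a half-life of roughly 700 new
samples.  The capacity is therefore the single knob that controls how
forgetful the set is.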

> The nice thing about NG is that once we have a working implementation, 
> we can empirically test improvements to it (and/or completely different 
> implementations) to see how successful each is at accurately estimating 
> routing times.  This will lead to a much more orderly development 
> process.  Matthew (our full-time developer and therefore primary code 
> monkey ;) is going to modify the NG routing code to encapsulate the 
> estimation stuff in a single class, so that third-parties can easily 
> provide their own implementations to test out new ideas.

Yes, I agree 100%: this is the most important first step.

Rudi

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
