Rudi, this is very interesting. What were the CPU and memory requirements for this experiment?

It is also rather surprising that the BDA algorithm without useNearest performed *better* than BDA with useNearest - do you have any hypothesis as to why this might be? It might indicate a problem with the BDA implementation.

Also, a decay rate of 0.5 is quite high; it would be interesting to see what happens with a lower decay rate and more data. In a typical implementation, how many bins were there, and what were the ranges of document sizes? If there were too few bins, the BDA could have been suffering from treating document size too coarsely.

It would be very interesting to see how well our RoutingTimeEstimator class performs with the same data (perhaps using the document size instead of the "key"). Since it doesn't use a "binned" approach, its performance is likely to be superior to BDA.

If I recall correctly, Hui Zhang <[EMAIL PROTECTED]>, a PhD student at the University of Southern California, did some testing of our ResponseTimeEstimator class using response time data collected from Freenet. More interesting still would be to measure performance using Hui's data, as it will be much closer to the actual data our simulator is likely to collect. In particular, it will be *much* more noisy than I suspect the data was in this experiment, and it has been suggested that SVM might not be as good with very noisy data.

Ian.

On Thu, Jul 24, 2003 at 08:26:36AM -0700, Rudi Cilibrasi wrote:
> Hi everybody,
>
> As usual, there are many differing predictions on whether or not SVMs will
> work better than other algorithms. I hate speculating without data, so
> I've put together a simple experiment involving predicting webserver
> fetch times for 30 different RFCs of varying lengths. You can see
> the details at
>
> http://homepages.cwi.nl/~cilibrar/ngrouting/
>
> I invite you all to play around with these programs, particularly trying to
> tweak the decaying averaging parameters to get better results, or even
> rerun the whole experiment with data samples from your location. My
> interpretation is that SVMs outperform two different obvious "binned"
> decaying average techniques considerably, by 40% or more. I think
> these results suggest that the SVM approach may actually be very much more
> powerful than what is currently in use. I can't wait to integrate this
> prediction system into the Freenet code proper, as I suspect it will
> increase overall throughput measurably.
>
> Cheers,
>
> Rudi
>
> _______________________________________________
> devl mailing list
> [EMAIL PROTECTED]
> http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl

--
Ian Clarke [EMAIL PROTECTED]
Coordinator, The Freenet Project http://freenetproject.org/
Founder, Locutus http://locut.us/
Personal Homepage http://locut.us/ian/
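
For anyone following the thread, here is a minimal sketch of what a "binned" decaying average over document size might look like, so the questions about bin count, decay rate and useNearest are concrete. It is not Rudi's code and not any Freenet class: the class name, the bin edges, the update rule (newest sample weighted by the decay rate) and the reading of useNearest as "fall back to the closest non-empty bin" are all assumptions made for illustration.

```python
from bisect import bisect_right


class BinnedDecayingAverage:
    """Hypothetical sketch only; names and behaviour are assumptions."""

    def __init__(self, bin_edges, decay=0.5, use_nearest=False):
        self.edges = sorted(bin_edges)      # size boundaries between bins
        self.decay = decay                  # weight given to the newest sample
        self.use_nearest = use_nearest      # assumed meaning: borrow from closest filled bin
        self.averages = [None] * (len(self.edges) + 1)

    def _bin(self, size):
        # Index of the bin that a document of this size falls into.
        return bisect_right(self.edges, size)

    def report(self, size, seconds):
        # Fold one observed fetch time into the bin's running average.
        i = self._bin(size)
        old = self.averages[i]
        if old is None:
            self.averages[i] = seconds
        else:
            self.averages[i] = self.decay * seconds + (1 - self.decay) * old

    def estimate(self, size):
        # Predicted fetch time, or None if nothing usable has been seen.
        i = self._bin(size)
        if self.averages[i] is not None:
            return self.averages[i]
        if not self.use_nearest:
            return None
        filled = [j for j, a in enumerate(self.averages) if a is not None]
        if not filled:
            return None
        nearest = min(filled, key=lambda j: abs(j - i))
        return self.averages[nearest]


bda = BinnedDecayingAverage([1000, 10000], decay=0.5, use_nearest=True)
bda.report(500, 2.0)
bda.report(600, 4.0)
print(bda.estimate(700), bda.estimate(50000))
```

With only three bins, every document above 10000 bytes shares one estimate, which is the coarseness concern raised above. With a decay rate of 0.5, each new sample contributes half of the bin's average, so a single noisy fetch moves the estimate a long way.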
