Rudi Cilibrasi <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 22, 2003 at 12:27:33PM -0400, Zlatin Balevsky wrote:
> > Also, what would the CPU requirements be for the current case
> > (dimensional vectors) if the algorithm was used on between 10 to 100
> > learning samples up to 500 times per second? Is adding new samples
> > expensive? Roughly one-tenth of the queries on freenet result in data
> > being found and we may want to adjust for that.
>
> SVMs work in two modes: regression or classification. In both cases,
> they receive a sequence of training samples, and are then asked to
> make predictions. Prediction is fast. Training can be slower, but
> you can make it less expensive by training on batches of input data
> instead of on each new sample separately.
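[Editor's note: the batch-training point above can be sketched with a minimal online linear classifier. This is a perceptron-style stand-in, not an SVM and not any actual Freenet code; the `partial_fit` name and the SUCCESS/FAIL labels are illustrative assumptions. It shows why prediction is cheap (one dot product) while training can be deferred and run over accumulated batches.]

```python
# Minimal sketch (assumed, illustrative only): an online linear classifier
# with a partial_fit-style interface, so new routing samples can be
# accumulated and trained on in batches instead of one at a time.

class OnlineLinearClassifier:
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # weight vector
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate

    def predict(self, x):
        """Fast: a single dot product per query."""
        s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if s >= 0 else -1   # 1 = SUCCESS, -1 = FAIL

    def partial_fit(self, batch):
        """Perceptron-style update over a batch of (x, label) samples."""
        for x, y in batch:
            if self.predict(x) != y:          # update only on mistakes
                for i, xi in enumerate(x):
                    self.w[i] += self.lr * y * xi
                self.b += self.lr * y

# Accumulate samples, then train in batches (several passes here):
batch = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
clf = OnlineLinearClassifier(n_features=2)
for _ in range(10):
    clf.partial_fit(batch)
print(clf.predict([1.0, 0.0]))  # → 1
```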
Is there any way of doing the training incrementally? That is, instead of
retraining the model from scratch each time, is there an algorithm that can
incorporate new observations as they come in? For the classification mode in
particular, it seems that many samples may not change the model at all if
they are sufficiently far away from the hyperplane dividing SUCCESS from
FAIL. Similarly for forgetting old data: if a sample is not a support
vector, deleting it will have no effect, right?

I guess there's a question of how fast the training algorithm is relative to
the rate of requests (i.e. samples).

theo

p.s. Yes, I'm back - I successfully passed my thesis defense a couple of
weeks ago, so now I'm all done with the PhD!

--
Theodore Hong
Dept. of Computing, Imperial College London
[EMAIL PROTECTED]
180 Queen's Gate, London SW7 2BZ
PGP key: http://www.doc.ic.ac.uk/~twh1/
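[Editor's note: the support-vector observation above can be made concrete with a toy 1-D example. This is not a full SVM; it is a hypothetical illustration in which the max-margin threshold between two separable classes depends only on the two boundary samples, so deleting interior (non-support-vector) samples leaves the model unchanged, exactly as the message suggests. All sample values below are made up.]

```python
# Toy 1-D illustration (assumed, not a real SVM): with linearly separable
# classes, the max-margin threshold depends only on the closest sample from
# each side (the "support vectors" of this toy model). Deleting any other
# sample cannot move the threshold.

def max_margin_threshold(success, fail):
    """Midpoint between the closest SUCCESS sample and the closest FAIL sample."""
    return (max(fail) + min(success)) / 2.0

fail = [0.1, 0.3, 0.4]        # hypothetical routing-metric values, class FAIL
success = [0.7, 0.8, 0.95]    # class SUCCESS

t_full = max_margin_threshold(success, fail)

# Keep only the two boundary samples; the threshold is identical.
t_pruned = max_margin_threshold([min(success)], [max(fail)])

print(t_full == t_pruned)  # → True
```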
