Rudi Cilibrasi <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 22, 2003 at 12:27:33PM -0400, Zlatin Balevsky wrote:
> > Also, what would the cpu requirements be for the current case 
> > (dimensional vectors) if the algorithm was used on between 10 to 100 
> > learning samples up to 500 times per second?  Is adding new samples 
> expensive?  Roughly one-tenth of the queries on freenet result in data 
> > being found and we may want to adjust for that.
> 
> SVMs work in two modes: Regression or Classification.  In both cases,
> they receive a sequence of training samples, and are then asked to
> make predictions.  Prediction is fast.  Training can be slower, but
> you can make it less expensive by training on batches of input data
> instead of on each new sample separately.

Is there any way of doing the training incrementally?  That is,
instead of retraining the model from scratch each time, is there an
algorithm that can incorporate new observations as they come in?  For
the classification mode in particular, it seems that many samples may
not change the model at all if they are sufficiently far away from the
hyperplane dividing SUCCESS from FAIL.  Similarly for forgetting old
data - if a sample is not a support vector, deleting it will have no
effect, right?
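
To make the intuition concrete: one incremental option is stochastic
subgradient descent on the hinge loss, which updates a linear model one
sample at a time; a sample whose margin is already >= 1 (i.e. well away from
the hyperplane) contributes no update at all, which matches the point above.
This is only an illustrative sketch, not any particular SVM package's
algorithm - the function names and constants here are made up:

```python
import random

def train_incremental(samples, labels, lam=0.01, epochs=200, seed=0):
    """Online linear SVM: stochastic subgradient descent on the hinge loss.

    Each (x, y) sample updates the weights in O(dim) time, so new
    observations can be absorbed as they arrive instead of retraining
    from scratch.  Labels are +1 (SUCCESS) or -1 (FAIL).
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    t = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:
                # Sample is inside the margin (or misclassified): it pulls
                # the hyperplane toward classifying it correctly.
                w = [wj * (1 - eta * lam) + eta * y * xj
                     for wj, xj in zip(w, x)]
                b += eta * y
            else:
                # Sample is far from the hyperplane: no hinge-loss gradient,
                # only the regularization shrinkage -- the sample itself
                # leaves the model unchanged.
                w = [wj * (1 - eta * lam) for wj in w]
    return w, b

def predict(w, b, x):
    """Classify a point by which side of the hyperplane it falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

For example, training on two well-separated 2-D clusters labelled +1 and -1
yields a model whose predict() recovers every training label.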

I guess there's a question of how fast the training algorithm is
relative to the rate of requests (i.e. samples).

theo

p.s. Yes, I'm back - I successfully passed my thesis defense a couple
of weeks ago, so now I'm all done with the PhD!


-- 
Theodore Hong         Dept. of Computing, Imperial College London
[EMAIL PROTECTED]   180 Queen's Gate, London SW7 2BZ
PGP key: http://www.doc.ic.ac.uk/~twh1/
