The key tricks are:

- do the updates of the averaged model in a sparse fashion.  This will
require doubling the space kept by the model

- determine when to switch to averaging

In addition, we should bring in at the same time:

- more flexibility in the loss function (so that the code can implement an SVM)
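
To make the tricks above concrete, here is a rough sketch of how the sparse
averaged update could look. It keeps two copies of the model state (the
doubled space): the live weights and a lazily maintained running sum of them,
touched only for the features an example actually uses, with a per-feature
"last synced" step. The switch-to-averaging point is the t0 parameter, and a
hinge option stands in for the SVM loss. All names here are illustrative, not
anything from the existing code.

```python
import math

def asgd_sparse(examples, lr=0.1, t0=1, loss="hinge"):
    """Averaged SGD with lazy sparse averaging (a sketch).

    w    : live SGD weights
    wsum : running sum of w over steps t0..T (the doubled state)
    last : step through which wsum[j] is up to date for feature j
    """
    w, wsum, last = {}, {}, {}
    t = 0
    for x, y in examples:            # x: {feature: value}, y in {-1, +1}
        t += 1
        if t >= t0:
            for j in x:
                # w[j] has not changed since step last[j], so steps
                # last[j]+1 .. t-1 each contributed the current w[j]
                wsum[j] = wsum.get(j, 0.0) + \
                    w.get(j, 0.0) * (t - 1 - last.get(j, t0 - 1))
        margin = y * sum(w.get(j, 0.0) * v for j, v in x.items())
        if loss == "hinge":          # hinge loss gives a linear SVM
            g = -y if margin < 1.0 else 0.0
        else:                        # logistic loss
            g = -y / (1.0 + math.exp(margin))
        for j, v in x.items():
            if g != 0.0:
                w[j] = w.get(j, 0.0) - lr * g * v
            if t >= t0:
                wsum[j] = wsum.get(j, 0.0) + w.get(j, 0.0)
                last[j] = t
    # fold in the steps since each feature was last touched
    for j in w:
        wsum[j] = wsum.get(j, 0.0) + w[j] * (t - last.get(j, t0 - 1))
    avg = {j: s / max(t - t0 + 1, 1) for j, s in wsum.items()}
    return w, avg
```

The point is that per example the cost is proportional to the number of
nonzero features, not the model size, while the averaged weights stay exact
because each feature's sum is caught up whenever (and only when) that feature
is touched.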


On Thu, Nov 17, 2011 at 2:26 AM, urun dogan <[email protected]> wrote:

> Hi Ted;
>
> I have started reading the paper and I think I will finish it today. It is
> quite a nice approach, and thanks for the supervision.
>
> Cheers
> Ürün
>
> On Wed, Nov 16, 2011 at 8:14 PM, Ted Dunning <[email protected]>
> wrote:
>
> > On Wed, Nov 16, 2011 at 9:50 AM, urun dogan <[email protected]> wrote:
> >
> > >
> > > I wrote the previous email before reading Josh's email. Are there any
> > > objections if I conclude that implementation of SGD/ASGD-based methods
> > > has priority, and that I will therefore start implementing these
> > > methods soon?
> > >
> >
> > I think that they are important.  But I haven't been able to partition
> > off enough time to actually do it, so my vote is degraded somewhat.  I
> > do know that people I have worked with would benefit from the results
> > shown in the Xu paper.
> >
> > > @Ted: If this is the case, I am looking forward to having your
> > > supervision on this issue.
> > >
> >
> > Excellent.
> >
> > Have you looked at the Xu paper?
> >
>
