Peng, I'm not sure if you were referring to what I wrote:

> >>      1.    The Kernel projections

if so-  I was talking about parallelizing the computation of eg. the RBF Kernals

> Date: Wed, 22 Oct 2014 13:42:45 -0400
> From: [email protected]
> To: [email protected]
> CC: [email protected]
> Subject: Re: Any idea which approaches to non-liniear svm are easily 
> parallelizable?
> 
> Is the kernel projection referring to online/incremental incomplete 
> Cholesky decomposition? Sorry I haven't used SVM for a long time and 
> didn't keep up with SotA.
> 
> If that's true, I haven't find an out-of-the-box implementation, but 
> this should be easy.
> 
> Yours Peng
> 
> On 10/22/2014 01:32 PM, Dmitriy Lyubimov wrote:
> > Andrew, thanks a bunch for the pointers!
> >
> >
> >
> > On Wed, Oct 22, 2014 at 10:14 AM, Andrew Palumbo <[email protected]> wrote:
> >
> >> If you do want to stick with SVM-
> >>
> >> This is a question that I keep coming back myself to and unfortunately
> >> have forgotten more (and lost more literature) than I’ve retained.
> >> I believe that the most easily parallelizable sections of libSVM for small
> >> datasets are (for C-SVC(R), RBF and polynomial kernels):
> >>

> >>      2.    The Hyper-Paramater grid search for C,  \gamma (I believe this
> >> is now included in LibSVM- I havent looked at it in a while)
> >>      3.    For multi-class SVC:  the concurrent computation of each SVM for
> >> each one-against-one class vote.
> >>
> >> I’m unfamiliar with any easily parallizable method for QP itself.
> >> Unfortunately for (2), (3) this involves broadcasting the entire dataset
> >> out to each node of a cluster (or working in a shared memory environment)
> >> so may not be practical depending on the size of your data set.  I’ve only
> >> ever implemented (2) for relatively small datasets using MPI and and a with
> >> pure java socket implementation.
> >>
> >> Other approaches (further from simple LibSVM), which are more applicable
> >> to large datasets (I’m less familiar with these):
> >>
> >>      4.    Divide and conquer the QP/SMO problem and solve (As I’ve said,
> >> I’m unfamiliar with this and  I don’t know of any standard)
> >>
> >>      5.    Break the training set into subsets and solve.
> >>
> >> For (5) there are several approaches, two that I know of are ensemble
> >> approaches and those that accumulate Support Vectors from each partition
> >> and heuristically keep/reject them until the model converges.  As well I’ve
> >> read some just read some research on implementing this in map a MapReduce
> >> style[2].
> >>
> >> I came across this paper [1] last night which you may find interesting as
> >> well which is an interesting comparison of some SVM parallelization
> >> strategies, particularly it discusses (1) for a shared memory environment
> >> and for offloading work to GPUs (using OpenMP  and CUDA). It also cites
> >> several other nice papers discussing SVM parallelization strategies
> >> especially for (5).  Also then goes on to discuss more purely linear
> >> algebra approach to optimizing SVMs (sec. 5)
> >>
> >> Also regarding (5) you may be interested in [2] (something I’ve only
> >> looked over briefly).
> >>
> >> [1] http://arxiv.org/pdf/1404.1066v1.pdf
> >> [2] http://arxiv.org/pdf/1301.0082.pdf
> >>
> >>
> >>> From: [email protected]
> >>> Date: Tue, 21 Oct 2014 17:32:22 -0700
> >>> Subject: Re: Any idea which approaches to non-liniear svm are easily
> >> parallelizable?
> >>> To: [email protected]
> >>>
> >>> Last I heard, the best methods pre-project and do linear SVM.
> >>>
> >>> Beyond that, I would guess that deep learning techniques would subsume
> >>> non-linear SVM pretty easily.  The best parallel implementation I know
> >> for
> >>> that is in H2O.
> >>>
> >>>
> >>>
> >>> On Tue, Oct 21, 2014 at 4:12 PM, Dmitriy Lyubimov <[email protected]>
> >> wrote:
> >>>> in particular, from libSVM --
> >>>> http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf ?
> >>>>
> >>>> thanks.
> >>>> -d
> >>>>
> 
                                          

Reply via email to