Jakub,

     This is great, thanks for the information. I've added links from the PETSc 
main webpage to your work.

   Barry

> On Sep 23, 2017, at 9:26 AM, Jakub Kruzik <[email protected]> wrote:
> 
> Dear all,
> 
> I would just like to note that we are also developing an SVM implementation. 
> It is intended for large-scale datasets and makes use of PETSc parallel linear 
> algebra. Currently, it supports only linear kernels: the Hessian is, in fact, 
> a MATNORMAL with an arbitrary underlying data matrix, so it is possible to use, 
> e.g., MATDENSE or MATAIJ depending on the problem. For the solution of the arising 
> quadratic program (QP), it uses solvers from our PermonQP package. Both 
> PermonSVM and PermonQP are libraries depending on PETSc. They are written in 
> the PETSc coding style, pretty much like SLEPc.
> 
> http://permon.it4i.cz/permonqp.htm
> http://permon.it4i.cz/permonsvm.htm
> 
> https://github.com/it4innovations/permon
> https://github.com/it4innovations/permonsvm
> 
> So far, PermonQP implements only an Augmented Lagrangian type algorithm, which 
> can be combined with any solver for box-constrained QP. PermonQP contains 
> some concrete ones and also a TAO wrapper. However, adding an Interior 
> Point implementation is of interest to us as well.
> 
> PermonSVM is so far a proof-of-concept thing, but it already scales pretty 
> well (almost proportionally to the application of the data matrix to a 
> vector). See, e.g., our PASC poster:
> https://www.researchgate.net/publication/318317204_PERMON_PASC17_Poster
> 
> We'll be grateful for any feedback on this.
> 
> Jakub
> 
> 
> On 22.9.2017 06:06, Richard Tran Mills wrote:
>> Thanks for sharing this, Barry. I haven't had time to read their paper, but 
>> it looks worth a read.
>> 
>> Hong, since many machine-learning or data-mining problems can be cast as 
>> linear algebra problems (several examples involving eigenproblems come to 
>> mind), I'm guessing that there must be several people using PETSc (with 
>> SLEPc, likely) in this area, but I don't think I've come across any 
>> published examples. What have others seen?
>> 
>> Most of the machine-learning and data-mining papers I read seem to employ 
>> sequential algorithms or, at most, algorithms targeted at on-node 
>> parallelism only. With available data sets getting as large and easily 
>> available as they are, I'm surprised that there isn't more focus on doing 
>> things with distributed parallelism. One of my cited papers is on a 
>> distributed parallel k-means implementation I worked on some years ago: we 
>> didn't do anything especially clever with it, but today it is still one of 
>> the *only* parallel clustering publications I've seen.
>> 
>> I'd love to 1) hear about what other machine-learning or data-mining 
>> applications using PETSc others have come across and 2) hear about 
>> applications in this area where people aren't using PETSc but it looks like 
>> they should!
>> 
>> Cheers,
>> Richard
>> 
>> On Thu, Sep 21, 2017 at 12:51 PM, Zhang, Hong <[email protected]> wrote:
>> Great news! According to their papers, MLSVM works only in serial. I am not 
>> sure what is stopping them from using PETSc in parallel.
>> 
>> Btw, are there any other cases that use PETSc for machine learning?
>> 
>> Hong (Mr.)
>> 
>> > On Sep 21, 2017, at 1:02 PM, Barry Smith <[email protected]> wrote:
>> >
>> >
>> > From: Ilya Safro [email protected]
>> > Date: September 17, 2017
>> > Subject: MLSVM 1.0, Multilevel Support Vector Machines
>> >
>> > We are pleased to announce the release of MLSVM 1.0, a library of fast
>> > multilevel algorithms for training nonlinear support vector machine
>> > models on large-scale datasets. The library is developed as an
>> > extension of PETSc to support, among other applications, the analysis
>> > of datasets in scientific computing.
>> >
>> > Highlights:
>> > - The best quality/performance trade-off is achieved with algebraic
>> > multigrid coarsening
>> > - Tested on academic, industrial, and healthcare datasets
>> > - Generates multiple models for each training
>> > - Effective on imbalanced datasets
>> >
>> > Download MLSVM at https://github.com/esadr/mlsvm
>> >
>> > Corresponding paper: Sadrfaridpour, Razzaghi and Safro "Engineering
>> > multilevel support vector machines", 2017,
>> > https://arxiv.org/pdf/1707.07657.pdf
>> >
>> 
>> 
> 
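
A note on the linear-kernel case in Jakub's message above: a MATNORMAL matrix represents the product A^T A without forming it, so applying the Hessian amounts to one product with the data matrix and one with its transpose. The following is a minimal pure-Python sketch of that idea only. It is not PermonSVM or PETSc code, and the function names and the small matrix A are made up for illustration.

```python
def matvec(A, v):
    """Return A v for a dense matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def matvec_transpose(A, v):
    """Return A^T v without building A^T explicitly."""
    out = [0.0] * len(A[0])
    for row, v_i in zip(A, v):
        for j, a_ij in enumerate(row):
            out[j] += a_ij * v_i
    return out


def normal_apply(A, v):
    """Apply H = A^T A to v as A^T (A v); H itself is never formed."""
    return matvec_transpose(A, matvec(A, v))


# Hypothetical 3x2 data matrix and input vector, for illustration only.
A = [[1.0, 2.0],
     [0.0, 1.0],
     [3.0, 0.0]]
v = [1.0, -1.0]

print(normal_apply(A, v))  # [8.0, -3.0]
```

Each Hessian application here costs two passes over A, which fits the remark that the run time follows the cost of applying the data matrix to a vector.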
