Great, file an issue please.

On Jun 17, 2015 7:07 PM, "Sreenivas Raghavan" <[email protected]> wrote:
> I am interested in trying Cholesky issue
>
> On Wed, Jun 17, 2015 at 11:18 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > Thanks for doing this. this is greatly appreciated.
> >
> > What about that Cholesky issue? any takers?
> >
> > On Wed, Jun 17, 2015 at 12:34 AM, Rohit Shinde <[email protected]> wrote:
> >
> > > Okay, so I'll get started with fixing the mahout spark shell. I'll raise issues on the mailing list as and when I encounter them. I'll go slowly though. I have GSoC going on and I will not be able to dedicate much time for the next two months.
> > >
> > > On Wed, Jun 17, 2015 at 2:21 AM, Dmitriy Lyubimov <[email protected]> wrote:
> > >
> > > > Guys, please file a Jira issue for Cholesky. this needs a bit of investigation. I don't really know who wants to pick it.
> > > >
> > > > Mathematical problems -- i assume basic ones -- we need MVN and Wishart multivariate distribution implementations which do not depend on apache-math or any other 3rd party, as well as Gaussian process. I am willing to outsource those to a first taker :-)
> > > >
> > > > for non-basic ones, as i mentioned, please scan the world :-) Topical stuff would be nice to port back, like LDA CVB0 (although i think i read a paper that basically goes back to gibbs sampling technique and now it is somehow more fashionable way than variational bayes again for some reason :)
> > > >
> > > > On Tue, Jun 16, 2015 at 1:34 PM, Nikolis Galerakis <[email protected]> wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > I am really interested in Cholesky Decomposition. Is there any process that I should follow to get assigned this task, or should I just dive into it?
> > > > >
> > > > > Nikos
> > > > >
> > > > > 2015-06-16 20:48 GMT+02:00 Sreenivas Raghavan <[email protected]>:
> > > > >
> > > > > > Sir,
> > > > > > I am interested in such kind of mathematical problems. Can you state a few more?
> > > > > >
> > > > > > On Tue, Jun 16, 2015 at 10:29 PM, Dmitriy Lyubimov <[email protected]> wrote:
> > > > > >
> > > > > > > (1) Yes, making spark shell work with spark 1.3+ on 0.11-snapshot would be an awesome help.
> > > > > > > (2) I was thinking, if you are still into math problem, we have, in my view, a problem in CholeskyDecomposition.
> > > > > > >
> > > > > > > This needs a little research. This involves methods solveRight, solveLeft.
> > > > > > > (2a) solveLeft claims to do forward substitution (which it does), and solveRight claims to do back substitution, which it probably does too. But in reality it solves a different problem than it is supposed to. In classic scheme of things, if A in AX=B is positive (semi)definite, and A=LL' is its Cholesky decomposition, then forward substitution is supposed to solve LY=B for Y and back substitution is supposed to solve L'X=Y, i.e. back substitution is supposed to compute result of L'^-1Y. But current implementation does something that can be shown to be essentially equivalent to solveLeft() rather than solution for L'X=Y. This needs to be looked at more carefully
> > > > > > >
> > > > > > > (2b) I also believe the whole names for solveLeft, solveRight are misleading. In all other cases, solve() methods traditionally denote solution of AX=B or XA=B for X.
> > > > > > > In Cholesky, neither of these methods actually provides a solution for AX=B, but rather provides a part of the solution. Therefore, i think, these methods should be renamed to something like forwardSubs(), backSubs(), or better yet, name exactly what they are doing, e.g. computeLtInvZ(mxZ:Matrix). moreover, it is probably beneficial to have solve methods that actually do compute full solution of Ax=b or xA = b' by combining forward and back substitutions properly.
> > > > > > >
> > > > > > > I hope some of this fits, it takes time to write this.
> > > > > > >
> > > > > > > -Dmitriy
> > > > > > >
> > > > > > > On Tue, Jun 16, 2015 at 4:17 AM, Rohit Shinde <[email protected]> wrote:
> > > > > > >
> > > > > > > > Okay, it seems that methodology is a bit too advanced for me. I would go with framework/engineering tasks. So should I start with fixing the mahout spark shell?
> > > > > > > >
> > > > > > > > On Tue, Jun 16, 2015 at 11:20 AM, Dmitriy Lyubimov <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > As i said, in methodology you can pick _anything_ that you think has merit and not yet in the roadmap or done.
> > > > > > > > >
> > > > > > > > > For example, do you feel like you might research PSVM or interior point SVM? Actually, any flavor of non-linear SVM that is different from a simple hinge loss?
> > > > > > > > > Do you think you can fit it in our algebraic engine?
> > > > > > > > >
> > > > > > > > > I think we also need a fair amount of port of MR methods -- like seq2sparse and cvb0 lda.
> > > > > > > > >
> > > > > > > > > i would still look at framework performance tasks, they are badly needed. Just today listened about flyby matrix multiplication approach for spark for medium-sized matrices which probably beats ours since even though we do not use cartesian (god forbid), our implementation is somewhat closer to what the speaker described as "massively mapside join" -- which eventually, according to him, is supposed to gain over flyby multiply, but there's a fair amount of tasks where it does not.
> > > > > > > > >
> > > > > > > > > similarly bolting on hardware libraries for in-core operations is still a big undecided issue.
> > > > > > > > >
> > > > > > > > > unfortunately a lot of known outstanding issues are still about engineering.
> > > > > > > > >
> > > > > > > > > On Mon, Jun 15, 2015 at 10:27 PM, Rohit Shinde <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > I would prefer some methodology work if it falls within my capabilities. If it doesn't then your suggestion is a good one and I'll take it up. Substantial according to me means a task where I can get quite familiar with as much of the code base as possible.
> > > > > > > > > >
> > > > > > > > > > On Tue, Jun 16, 2015 at 10:49 AM, Dmitriy Lyubimov <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > I gave you 3 types of problems. Define substantial.
> > > > > > > > > > >
> > > > > > > > > > > Say, does fixing mahout spark shell sound substantial enough?
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Jun 15, 2015 at 10:11 PM, Rohit Shinde <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > So do you have any suggestions for getting started? I would like to contribute to something substantial that is going on, after getting familiar with the required part of the codebase.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Jun 15, 2015 at 11:39 PM, Dmitriy Lyubimov <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > i don't think there's a formal list published anywhere.
> > > > > > > > > > > > >
> > > > > > > > > > > > > There is an informal roadmap.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The contributions, the way i see it, can mainly be in 3 areas: (1) project support issues like for example fixing shell compatibility with spark 1.3; (2) framework support problems like for example performance and integrating 3rd party hardware accelerated linalg libraries; (3) methodology work.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We have some pending items for (1) and (2) i think but for methodology items (3) we simply can't compile the list of everything that can possibly be done and contributed. We just don't have that much expertise, combined. No one has [1]. The way it works is usually people would come up with pieces that they were missing on their own for some reason; and they need to propose methodology, parallelization strategy, maybe even a code sketch -- that all will be fine.
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] http://matt.might.net/articles/phd-school-in-pictures/
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sun, Jun 14, 2015 at 11:49 PM, Rohit Shinde <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > But is there a list of projects that new people could take up? Even I am a student interested in contributing to the machine learning and data mining parts of Apache Mahout.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am familiar with Scala and Java, Python and C++.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What can I contribute to?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Jun 15, 2015 at 10:24 AM, Dmitriy Lyubimov <[email protected]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Well we are predominantly Scala shop now. Being fluent in Scala seems like one prerequisite.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sat, Jun 13, 2015 at 1:17 AM, Sreenivas Raghavan <[email protected]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hello everyone,
> > > > > > > > > > > > > > > > I am interested in contributing to mahout project. I am interested in algorithms, machine learning and linear algebra. Please give me some idea as where to start and how to start.
> > > > > > > > > > > > > > > > I know python and some parts of Java, so please tell me is this knowledge of languages enough for writing and optimizing codes
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > *With Regards,*
> > > > > > > > > > > > > > > > *K.S.Sreenivasa Raghavan*
> > > > > >
> > > > > > --
> > > > > > *With Regards,*
> > > > > > *K.S.Sreenivasa Raghavan*
>
> --
> *With Regards,*
> *K.S.Sreenivasa Raghavan*
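
For anyone picking up the Cholesky item, the scheme described in (2a) and (2b) of the thread (factor A=LL', solve LY=B by forward substitution, then solve L'X=Y by back substitution) can be sketched roughly as below. This is a plain-Python illustration only, not Mahout's CholeskyDecomposition code: the function names cholesky, forward_subs, back_subs and solve are hypothetical, matrices are lists of rows, and A is assumed to be symmetric and strictly positive definite (no pivoting or handling of the semidefinite case).

```python
import math


def cholesky(a):
    """Return lower-triangular L such that A = L L' (A assumed symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def forward_subs(l, b):
    """Forward substitution: solve L Y = B for Y, row by row from the top."""
    n, m = len(l), len(b[0])
    y = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for c in range(m):
            s = sum(l[i][k] * y[k][c] for k in range(i))
            y[i][c] = (b[i][c] - s) / l[i][i]
    return y


def back_subs(l, y):
    """Back substitution: solve L' X = Y for X, row by row from the bottom.

    L' is never materialized; element (i, k) of L' is read as l[k][i].
    """
    n, m = len(l), len(y[0])
    x = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for c in range(m):
            s = sum(l[k][i] * x[k][c] for k in range(i + 1, n))
            x[i][c] = (y[i][c] - s) / l[i][i]
    return x


def solve(a, b):
    """Full solution of A X = B by combining forward and back substitution."""
    l = cholesky(a)
    return back_subs(l, forward_subs(l, b))
```

A quick sanity check along these lines would be to call solve(A, B) for a small positive definite A and confirm that multiplying A by the result reproduces B; a back substitution that actually behaves like forward substitution, which is the suspicion raised in (2a), would fail that check.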
