On Sat, Jul 27, 2013 at 9:40 AM, Dmitriy Lyubimov <[email protected]> wrote:

> Jake, this is in-core. I work on similar expressiveness for Spark-backed
> DRMs, and there is indeed a different set of algorithms there; naive
> combinations do not necessarily produce the best outcome. There's no
> doubt the MR stuff will need an amended set of operations and primitives.
>
> As far as associativity is concerned, this is just scala. One cannot
> implement elementwise 5-x as 5:-x or 5-x on a sealed left-hand argument;
> such are the language rules and no amount of discussion on our side can
> change that. You can only do 5 -: x.
>

Can you show me some examples of where I'd *want* to do the "wrong thing"
from an associativity standpoint?  "5 - x", where x is a vector, is kinda
weird.  Maybe you're subtracting off a mean or something, but then I'd
probably write this as "- (x - 5)", because I always associate left to
right. :)
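
For reference, a minimal sketch of the two spellings, pasteable into the
REPL. Vec here is a hypothetical toy class made up for illustration, not
the Mahout Vector API or the proposed DSL; it only shows how Scala
dispatches a method whose name ends in ':' to the right-hand operand.

```scala
// Hypothetical toy vector for illustration only (not Mahout's API).
final case class Vec(values: Vector[Double]) {
  // x - 5: ordinary operator, receiver is the left operand.
  def -(s: Double): Vec = Vec(values.map(_ - s))

  // 5 -: x: the name ends in ':', so Scala invokes it on the RIGHT
  // operand, i.e. x.-:(5). Here it computes the elementwise 5 - x.
  def -:(s: Double): Vec = Vec(values.map(s - _))

  // -(x): unary negation.
  def unary_- : Vec = Vec(values.map(v => -v))
}

val x = Vec(Vector(1.0, 2.0, 3.0))
val a = 5.0 -: x      // parsed as x.-:(5.0)
val b = -(x - 5.0)    // the left-to-right spelling
```

Both a and b hold the same values; the second form just costs an extra
pass over the vector in this naive version.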


> and putting completely different functional meaning into :* and * will
> confuse scala users to no end who got used to things like :/ and /: . This
> all needs striking a subtle balance unfortunately.
>

Ok, then like I said, maybe I'll just defer to your judgement on the
operator syntax, as I've *never* gotten used to the scala :/ and /: uses.
I prefer method calls to method calls masquerading as native operators.
Maybe I should Stop Being Afraid and Learn to Love the DSL, but I'm not
quite there yet: Too Much Magic. :)


>
> as i said before, i am not hung up on %*% syntax, but i don't think doing
> :* or .* for elementwise would work in scala.
>

How often do we really do elementwise matrix operations?  Is this really
a thing we often want to worry about?  addition and subtraction, sure, but
that's the full matrix operation too.  Ditto for multiplication or division
*by scalars*, but Hadamard products on matrices?  I guess it _happens_,
but I'm not sure I've ever done it, or if I have, it's pretty darn rare.
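
On the precedence point quoted above, here is a small sketch of how Scala
groups these operators. Scala picks infix precedence from the operator's
first character, and ':' ranks below '+' and '-', while '*' and '%' rank
above them. Expr is a hypothetical wrapper invented for this example; it
only records the grouping as a string and does no linear algebra.

```scala
// Hypothetical expression wrapper that records how Scala groups operators.
final case class Expr(show: String) {
  def +(that: Expr): Expr   = Expr(s"(${show} + ${that.show})")
  def *(that: Expr): Expr   = Expr(s"(${show} * ${that.show})")
  // Starts with '%', so it shares the precedence of '*' and '/'.
  def %*%(that: Expr): Expr = Expr(s"(${show} %*% ${that.show})")
  // Starts with ':', so it binds LOOSER than '+'.
  def :*(that: Expr): Expr  = Expr(s"(${show} :* ${that.show})")
}

val a = Expr("a")
val b = Expr("b")
val c = Expr("c")

val star    = (a + b * c).show    // "(a + (b * c))"
val percent = (a + b %*% c).show  // "(a + (b %*% c))"
val colon   = (a + b :* c).show   // "((a + b) :* c)"
```

So an elementwise operator spelled ":*" would need explicit parentheses
whenever it is mixed with addition or subtraction.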


>
>
>
> On Sat, Jul 27, 2013 at 8:00 AM, Jake Mannix <[email protected]>
> wrote:
>
> > I think my main concern is one of readability and hidden information: I
> > really _don't_ like having to know _anything_ about associativity rules,
> > and I'm not sure that catering to R users (*or* matlab users) is what we
> > want to do.  Maybe I'm thinking in a different direction with my scala
> > (+scalding) interop work, but I really am not aiming for some totally
> > fluent API for non-programmer analysts.  I'm not one, and guessing their
> > needs will be really hard for me.  I just want more concise syntax,
> > better types, access to a nice REPL, and access to a much more
> > sophisticated yet compact MR pipelining DSL.  For this, scala + scalding
> > serves admirably.
> >
> >
> > On Sat, Jul 27, 2013 at 6:10 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > On Jul 26, 2013 11:56 PM, "Nick Pentreath" <[email protected]>
> > > wrote:
> > > >
> > > > Thanks for the update on that PR I will definitely take a look.
> > > >
> > > >
> > > > I wonder if they will run into the exact same Colt issues as
> > > > Mahout did?!
> > >
> > > Yes, I wondered that too since the day I saw the Spark ALS example.
> > >
> > > JBlas is a far better choice, but as Sebastian has demonstrated, bona
> > > fide improvements are hard to achieve due to high JNI costs, so I would
> > > actually have a specific type of matrix to solve specific problems when
> > > needed rather than sweepingly generalize it as dense vector or matrix
> > > support.
> > >
> > > Aside from that, it seems the LAPACK backend is running up to 5x slower
> > > on AMD hardware that our company unfortunately chose to invest in...
> > > argh!..
> > >
> > > >
> > > >
> > > > This DSL looks great, I'm gonna play around with it as soon as I get
> > > > a chance.
> > > >
> > > >
> > > >
> > > > One question - breeze has quite a similar syntax that is a bit
> > > > simpler in some ways - basically * for matrix multiply and :* for
> > > > elementwise. Would something similar work here?
> > >
> > > As I commented before, it just caters to R syntax, along with a bunch
> > > of other things. If we believe that there is a reason to inherit syntax
> > > vs. devising something new, then there are really few candidates, and I
> > > don't think Breeze is going to cut it based on adoption level.
> > >
> > > In particular, in my company it is hard to convince R users to start
> > > using scala or java as it is, so I am just scoring points here by
> > > making it look familiar to them.
> > >
> > > Also, I want to reserve the colon to command associativity of an
> > > operation, as scala means it, which is important for optimizing
> > > non-commutative operations such as elementwise division or matrix
> > > multiplication. E.g. there are significant performance differences
> > > between saying
> > >
> > >
> > Maybe I should step out of the discussion where it dives into what
> > operators we use, because frankly, I probably won't use them much,
> > *especially* if there are too many magical associativity rules I have to
> > remember - I *hate* stuff like:
> >
> >
> >
> > > A %*% diagonal === A.times(diagonal)
> > >
> > > And
> > >
> > > A %*%: diagonal === diagonal.timesLeft(A).
> > >
> >
> > In particular, pretty much whenever we're going to be doing a map-reduce
> > job in a method call (for the distributed case), being terribly clever in
> > our syntax is going to bite us, because people (esp. typical R users, who
> > aren't super performance focused) will be doing stuff like
> > "(A.t %*% B).t - (A.t %*% A)" without thinking whether this can be
> > reorganized at all to reduce the number of map-reduce passes.  Maybe
> > that's ok, but they're going to super-complain on the list all the time
> > if we give them too much rope to hang themselves with.
> >
> > But yeah, maybe we'll just be looking at two different focuses on this: I
> > really care more about writing nicer MR pipelines for our jobs (I've
> > already played around with a nice replacement for seq2sparse in a single
> > small scalding job with modular components, it's about 1/10th the number
> > of lines of our current one, with most of the functionality), and getting
> > a nice integrated REPL for playing with the results.
> >
> > And maybe getting R (and matlab) users to use our stuff is a good thing,
> > even if it means them hanging themselves a bit.  Heh.
> >
> >
> > > Obviously the latter is n flops and the former is n squared.
> > >
> > > I don't think breeze made a wise decision by putting a special
> > > functional meaning into ':'. It is reserved for associativity in scala.
> > >
> > > >
> > > >
> > > > Would be quite nice to have same syntax but different backends that
> > > > are swappable ;)
> > > > —
> > > > Sent from Mailbox for iPhone
> > > >
> > > > On Sat, Jul 27, 2013 at 2:42 AM, Dmitriy Lyubimov <[email protected]>
> > > > wrote:
> > > >
> > > > > coincidentally, spark mllib just posted a pull request intended to
> > > > > add support for dense and sparse vectors, looks quite similar.
> > > > > https://github.com/mesos/spark/pull/736. They seem to choose JBlas
> > > > > backing for dense stuff (although at a vector level there's probably
> > > > > not much reason to) and as-is Colt for sparse stuff.
> > > > > On Fri, Jul 26, 2013 at 5:20 PM, Dmitriy Lyubimov <[email protected]>
> > > > > wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Fri, Jul 26, 2013 at 5:07 AM, Ted Dunning <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> This sounds great in principle.  I haven't seen any details yet
> > > > >>> (haven't had time to look).
> > > > >>>
> > > > >>> Is there a strong reason to go with the R syntax for multiplication
> > > > >>> instead of the matlab convention that a*b means a.times(b)?
> > > > >>>
> > > > >>
> > > > >> As discussed, but also because matlab-style elementwise operators
> > > > >> are impossible to keep at the proper precedence level in scala. It
> > > > >> kind of has to start with either '*' or '%' to keep proper
> > > > >> precedence; '.*' will not work, unfortunately. And a mix along the
> > > > >> lines of 'some of Matlab, some of perhaps completely something
> > > > >> else' does not seem appealing at all.
> > > > >>
> > > > >>
> > >
> >
> >
> >
> > --
> >
> >   -jake
> >
>



-- 

  -jake
