For anyone following along, the thread went private for a bit, but there were
still issues with the bytecode generation in the 2.0-preview, so this JIRA
was created: https://issues.apache.org/jira/browse/SPARK-15786
On Mon, Jun 6, 2016 at 1:11 PM, Michael Armbrust
wrote:
That kind of stuff is likely fixed in 2.0. If you can get a reproduction
working there it would be very helpful if you could open a JIRA.
On Mon, Jun 6, 2016 at 7:37 AM, Richard Marscher
wrote:
A quick unit test attempt didn't get far replacing map with as[]. I'm only
working against 1.6.1 at the moment, though. I was going to try 2.0, but I'm
having a hard time building a working spark-sql jar from source; the only
ones I've managed to make are intended for the full assembly fat jar.
Option should play nicely with encoders, but it's always possible there are
bugs. I think those function signatures are slightly more expensive (one
extra object allocation), and they're not as Java-friendly, so we probably
don't want them to be the default.
That said, I would like to enable that kind
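To make the trade-off concrete, here is a minimal plain-Scala sketch (the helper names `nullBased` and `optionBased` are hypothetical, not Spark API): the Option-based shape allocates one extra wrapper object per matched row, while the null-based shape pushes null checks onto user code.

```scala
// Hypothetical illustration of the two joined-pair shapes discussed
// above; these helpers are not part of Spark's API.
case class Customer(id: Int)
case class Invoice(customerId: Int, total: Double)

object SignatureCost {
  // Null-based shape: no extra allocation, but the right side may be
  // null for unmatched rows, and user code must check for that.
  def nullBased(c: Customer, i: Invoice): (Customer, Invoice) = (c, i)

  // Option-based shape: Option(i) allocates a Some per matched row
  // (and reuses the None singleton for unmatched rows), but null can
  // never leak into user code.
  def optionBased(c: Customer, i: Invoice): (Customer, Option[Invoice]) =
    (c, Option(i))

  def main(args: Array[String]): Unit = {
    println(nullBased(Customer(1), null))   // (Customer(1),null)
    println(optionBased(Customer(1), null)) // (Customer(1),None)
    println(optionBased(Customer(2), Invoice(2, 5.0)))
  }
}
```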
Ah, thanks, I missed seeing the PR for
https://issues.apache.org/jira/browse/SPARK-15441. If the rows become null
objects, then I can implement methods that will map those back to results
that align more closely with the RDD interface.
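A sketch of that mapping, in plain Scala to stay self-contained (in Spark this would be the function passed to a Dataset's map; `restoreOption` is a hypothetical name, and the pair shape assumes a left-outer joinWith result whose right side is null when unmatched):

```scala
// Hypothetical shape of a left-outer joinWith result: the right side
// of each pair comes back null when there was no match.
case class User(id: Int, name: String)
case class Order(userId: Int, total: Double)

object OuterJoinShim {
  // Map a possibly-null right side back to Option, restoring the
  // RDD-style (User, Option[Order]) shape of leftOuterJoin.
  def restoreOption(pair: (User, Order)): (User, Option[Order]) =
    (pair._1, Option(pair._2))

  def main(args: Array[String]): Unit = {
    // Simulated joinWith output: user 2 had no matching order.
    val joined: Seq[(User, Order)] = Seq(
      (User(1, "ada"), Order(1, 9.99)),
      (User(2, "bob"), null)
    )
    joined.map(restoreOption).foreach(println)
  }
}
```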
As a follow-on, I'm curious about thoughts regarding enriching the
Thanks for the feedback. I think this will address at least some of the
problems you are describing: https://github.com/apache/spark/pull/13425
On Wed, Jun 1, 2016 at 9:58 AM, Richard Marscher
wrote:
Hi,
I've been working on transitioning from RDD to Datasets in our codebase in
anticipation of being able to leverage features of 2.0.
I'm having a lot of difficulty with the impedance mismatch between how
outer joins work on RDDs versus Datasets. The Dataset joins feel like a
big step