Re: Apache Arrow + Spark examples?

2017-06-29 Thread Nirav Patel
bump.

I have the same question as Petr. SPARK-13534 seems to solve only the
(de)serialization issue between RDDs and Python objects. However,
couldn't Arrow be a standard for in-memory columnar representation? It might be
an alternative to Spark's current in-memory store (k-v blocks or Tungsten).

Thanks
Nir

On Wed, Feb 24, 2016 at 3:56 AM, Petr Novak wrote:

>
> How does Arrow relate to Tungsten and its binary in-memory format? It will
> still have to convert between them. I assume they use similar
> concepts/layouts, hence the conversion is likely to be quite efficient.
> Or is there a chance that the current Tungsten in-memory format will be
> replaced by Arrow in the future? The same applies to Impala, Drill and
> all the others. Is the goal to unify the internal in-memory representation
> for all of them, or is the benefit going to be faster conversions, e.g. by
> an order of magnitude?
>
> Many thanks for any explanation,
> Petr
>

Re: Apache Arrow + Spark examples?

2016-02-24 Thread Petr Novak
How does Arrow relate to Tungsten and its binary in-memory format? It will
still have to convert between them. I assume they use similar
concepts/layouts, hence the conversion is likely to be quite efficient.
Or is there a chance that the current Tungsten in-memory format will be
replaced by Arrow in the future? The same applies to Impala, Drill and
all the others. Is the goal to unify the internal in-memory representation
for all of them, or is the benefit going to be faster conversions, e.g. by
an order of magnitude?

Many thanks for any explanation,
Petr


RE: Apache Arrow + Spark examples?

2016-02-24 Thread Sun, Rui
Spark does not support Arrow yet. There is a JIRA,
https://issues.apache.org/jira/browse/SPARK-13391, requesting that work.

From: Robert Towne [mailto:robert.to...@webtrends.com]
Sent: Wednesday, February 24, 2016 5:21 AM
To: user@spark.apache.org
Subject: Apache Arrow + Spark examples?

I have been reading some of the news this week about Apache Arrow as a new
top-level project. It appears to be a common data layer between Spark and other
systems (Cassandra, Drill, Impala, etc.).

Has anyone seen any sample Spark code that integrates with Arrow?

Thanks,
Robert