date:20141217

When will Spark SQL support building DB index natively?

2014-12-17 Thread Xuelin Cao

Hi, In the Spark SQL help document, it says: "Some of these (such as indexes) are less important due to Spark SQL’s in-memory computational model. Others are slotted for future releases of Spark SQL. - Block level bitmap indexes and virtual columns (used to build indexes)" For our

Re: RDD data flow

2014-12-17 Thread Madhu

Patrick Wendell wrote: "The Partition itself doesn't need to be an iterator - the iterator comes from the result of compute(partition). The Partition is just an identifier for that partition, not the data itself." OK, that makes sense. The docs for Partition are a bit vague on this point. Maybe
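
The distinction quoted above (a Partition is only an identifier, and the data iterator comes from compute(partition)) can be sketched with a toy model in plain Python. This is not Spark code: `ToyPartition`, `ToyRangeRDD`, and their methods are hypothetical names invented for illustration, and the sketch only assumes the identifier/iterator split described in the message.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class ToyPartition:
    """Hypothetical stand-in for a partition: an identifier only, no data."""
    index: int


class ToyRangeRDD:
    """Hypothetical toy dataset over the integers [start, stop)."""

    def __init__(self, start: int, stop: int, num_partitions: int) -> None:
        self.start = start
        self.stop = stop
        self.num_partitions = num_partitions

    def partitions(self) -> List[ToyPartition]:
        # Listing partitions creates identifiers; no data is touched here.
        return [ToyPartition(i) for i in range(self.num_partitions)]

    def compute(self, part: ToyPartition) -> Iterator[int]:
        # The iterator over the data is produced here, from the identifier.
        size = self.stop - self.start
        lo = self.start + part.index * size // self.num_partitions
        hi = self.start + (part.index + 1) * size // self.num_partitions
        return iter(range(lo, hi))


rdd = ToyRangeRDD(0, 10, 3)
# Walk every partition identifier and pull its data through compute().
collected = [x for p in rdd.partitions() for x in rdd.compute(p)]
```

In this sketch a `ToyPartition` carries nothing but its index, so it stays cheap to create and pass around, while the actual elements only appear when `compute` is called with it.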

Re: running the Terasort example

2014-12-17 Thread Tim Harsch

On 12/16/14, 11:42 PM, Ewan Higgs ewan.hi...@ugent.be wrote: Hi Tim, On 16 Dec 2014, at 19:27, Tim Harsch thar...@cray.com wrote: Hi Ewan, Thanks, I think I was just a bit confused at the time, I was looking at the spark-perf repo when there was the problem (uh.. ok)… The PR that I am

Re: Nabble mailing list mirror errors: This post has NOT been accepted by the mailing list yet

2014-12-17 Thread Josh Rosen

Yeah, it looks like messages that are successfully posted via Nabble end up on the Apache mailing list, but messages posted directly to Apache aren't mirrored to Nabble anymore because it's based off the incubator mailing list. We should fix this so that Nabble posts to / archives the

Fwd: [VOTE] Release Apache Spark 1.2.0 (RC2)

2014-12-17 Thread Krishna Sankar

Forgot Reply To All ;o( -- Forwarded message -- From: Krishna Sankar ksanka...@gmail.com Date: Wed, Dec 10, 2014 at 9:16 PM Subject: Re: [VOTE] Release Apache Spark 1.2.0 (RC2) To: Matei Zaharia matei.zaha...@gmail.com +1 Works same as RC1 1. Compiled OSX 10.10 (Yosemite) mvn

Re: Spark Shell slowness on Google Cloud

2014-12-17 Thread Alessandro Baretta

Here's another data point: the slow part of my code is the construction of an RDD as the union of the textFile RDDs representing data from several distinct google storage directories. So the question becomes the following: what computation happens when calling the union method on two RDDs? On

6 matches
