I raised this for checkpointing. Hopefully it gets some priority, as it's very
useful and relatively straightforward to implement:
https://issues.apache.org/jira/browse/SPARK-11879
On 18 November 2015 at 16:31, Cristian O wrote:
> Hi,
>
> While these OSS efforts
Hi,
While these OSS efforts are interesting, they are quite unproven for now.
Personally, I would be much more interested in seeing Spark move
incrementally towards supporting updatable DataFrames on various storage
substrates, starting with local storage, perhaps as an extension of cached
DataFrames.
FiloDB is also closely related: https://github.com/tuplejump/FiloDB
On Mon, Nov 16, 2015 at 12:24 AM, Nick Pentreath wrote:
> Cloudera's Kudu also looks interesting here (getkudu.io) - Hadoop
> input/output format support:
>
This (updates) is something we are going to think about in the next release
or two.
On Thu, Nov 12, 2015 at 8:57 AM, Cristian O wrote:
> Sorry, apparently only replied to Reynold, meant to copy the list as well,
> so I'm self replying and taking the opportunity
Relevant link:
http://spark.apache.org/docs/latest/sql-programming-guide.html#parquet-files
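For readers following the link, here is a minimal sketch of the Parquet round trip that the guide describes, using the same (k, v) DataFrame shape as the example elsewhere in this thread. The object name `ParquetRoundTrip`, the helper `roundTrip`, the local master setting and the temporary directory are all assumptions made for illustration; none of them come from the thread.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical names throughout; this is a sketch, not code from the thread.
object ParquetRoundTrip {

  // Builds a small (k, v) DataFrame, writes it to `path` as Parquet,
  // and reads it back as a new DataFrame.
  def roundTrip(sqlContext: SQLContext, path: String): DataFrame = {
    import sqlContext.implicits._ // brings toDF into scope

    val df = sqlContext.sparkContext
      .parallelize(1 to 100)
      .map(i => (i, 1))
      .toDF("k", "v")

    // Overwrite so the sketch can be re-run against the same path.
    df.write.mode("overwrite").parquet(path)
    sqlContext.read.parquet(path)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("parquet-round-trip").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      val dir = Files.createTempDirectory("parquet-sketch").resolve("kv").toString
      val restored = roundTrip(sqlContext, dir)
      restored.printSchema()
      println(restored.count())
    } finally {
      sc.stop()
    }
  }
}
```

Note that this writes a complete copy of the data on every call; it does not update anything in place, which is the gap the rest of the thread is about.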
On Wed, Nov 11, 2015 at 7:31 PM, Reynold Xin wrote:
> Thanks for the email. Can you explain what the difference is between this
> and existing formats such as Parquet/ORC?
>
>
> On
Sorry, apparently only replied to Reynold, meant to copy the list as well,
so I'm self replying and taking the opportunity to illustrate with an
example.
Basically I want to conceptually do this:
val bigDf = sqlContext.sparkContext.parallelize(1 to 100).map(i => (i, 1)).toDF("k", "v")
val
Hi,
I was wondering if there's any planned support for local disk columnar
storage.
This could be an extension of the in-memory columnar store, or possibly
something similar to the recently added local checkpointing for RDDs.
This could also have the added benefit of enabling iterative usage for
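For context, the local checkpointing for RDDs mentioned above can be used roughly as follows in an iterative job. This is a minimal sketch under stated assumptions: the object name `LocalCheckpointSketch`, the helper `iterate`, the per-round update and the local master setting are hypothetical. Only `RDD.localCheckpoint()` itself is the existing Spark API being illustrated, and it works at the RDD level, not on DataFrames.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical names throughout; this is a sketch, not code from the thread.
object LocalCheckpointSketch {

  // Applies a toy update `rounds` times, locally checkpointing after each
  // round so the lineage does not grow with the number of iterations.
  def iterate(sc: SparkContext, rounds: Int): RDD[(Int, Int)] = {
    var current: RDD[(Int, Int)] = sc.parallelize(1 to 100).map(i => (i, 1))
    for (_ <- 1 to rounds) {
      current = current.mapValues(_ + 1)
      // Mark the RDD for local checkpointing (executor-local storage,
      // so it trades fault tolerance for speed).
      current.localCheckpoint()
      // An action is needed to actually materialize the checkpoint.
      current.count()
    }
    current
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("local-checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val result = iterate(sc, rounds = 3)
      println(result.toDebugString) // shows the lineage after checkpointing
    } finally {
      sc.stop()
    }
  }
}
```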
Thanks for the email. Can you explain what the difference is between this
and existing formats such as Parquet/ORC?
On Wed, Nov 11, 2015 at 4:59 AM, Cristian O wrote:
> Hi,
>
> I was wondering if there's any planned support for local disk columnar
> storage.
>