You should look at https://github.com/amplab/spark-indexedrdd
On Tue, Feb 10, 2015 at 2:27 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi Michael,
>
> I want to cache an RDD and define get() and set() operators on it,
> basically like memcached. Is it possible to build a memcached-like
> distributed cache using Spark SQL? If not, what do you suggest we should
> use for such operations?
>
> Thanks.
> Deb
>
> On Fri, Jul 18, 2014 at 1:00 PM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> You can do INSERT INTO. As with other SQL-on-HDFS systems, there is no
>> updating of data.
>>
>> On Jul 17, 2014 1:26 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
>>
>>> Is this what you are looking for?
>>>
>>> https://spark.apache.org/docs/1.0.0/api/java/org/apache/spark/sql/parquet/InsertIntoParquetTable.html
>>>
>>> According to the doc: "Operator that acts as a sink for queries
>>> on RDDs and can be used to store the output inside a directory of Parquet
>>> files. This operator is similar to Hive's INSERT INTO TABLE operation in
>>> the sense that one can choose to either overwrite or append to a directory.
>>> Note that consecutive insertions to the same table must have compatible
>>> (source) schemas."
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Thu, Jul 17, 2014 at 11:42 AM, Hu, Leo <leo.h...@sap.com> wrote:
>>>
>>>> Hi
>>>>
>>>> As for Spark 1.0, can we insert into and update a table with Spark SQL,
>>>> and how?
>>>>
>>>> Thanks
>>>> Best Regards
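For the memcached-style get()/set() use case, the spark-indexedrdd project linked above exposes point lookups and functional updates over a partitioned key-value RDD. A minimal sketch based on that project's README (assuming an existing SparkContext `sc`; the keys, values, and counts are illustrative, and early versions of the library require Long keys):

```scala
import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD
import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD._

// Build an IndexedRDD from an ordinary pair RDD and cache it,
// so subsequent gets hit the in-memory index.
val pairs = sc.parallelize((1 to 1000000).map(x => (x.toLong, 0)))
val indexed = IndexedRDD(pairs).cache()

// Point lookup, analogous to memcached get().
val v = indexed.get(42L)           // Option[Int]

// "set" is a functional update: put() returns a new IndexedRDD
// sharing most of its structure with the old one.
val updated = indexed.put(1234L, 10893)
```

Note that put() does not mutate the cached RDD in place; each update yields a new IndexedRDD, which fits Spark's immutable-RDD model rather than memcached's in-place semantics.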
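The INSERT INTO behavior Michael describes can be sketched against the Spark 1.0-era SQL API (SQLContext and SchemaRDD). The table name, path, and case class here are illustrative, and the exact method names (registerAsTable, createParquetFile) reflect that era's API, which later releases renamed:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sc = new SparkContext("local", "insert-example")
val sqlContext = new SQLContext(sc)
import sqlContext._

case class Record(key: Int, value: String)

// Register an RDD of case classes as a SQL table.
val records = sc.parallelize(1 to 100).map(i => Record(i, s"val_$i"))
records.registerAsTable("records")

// Create a Parquet-backed table, then append to it with INSERT INTO.
// Repeated inserts append; there is no UPDATE, matching the
// append/overwrite-only semantics of SQL-on-HDFS systems.
val parquetTable = createParquetFile[Record]("/tmp/records.parquet")
parquetTable.registerAsTable("parquet_records")
sql("INSERT INTO parquet_records SELECT * FROM records")
```

A second run of the INSERT statement would append another copy of the rows, subject to the doc's caveat that consecutive insertions must have compatible source schemas.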