You should look at https://github.com/amplab/spark-indexedrdd
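
For context, here is roughly the get/put pattern from that repository's
README (a minimal sketch; the package path and method names are taken from
the README and may differ across versions):

    import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD
    import edu.berkeley.cs.amplab.spark.indexedrdd.IndexedRDD._

    // Index an RDD of key-value pairs for efficient point lookups and updates.
    val rdd = sc.parallelize((1L to 1000000L).map(x => (x, 0)))
    val indexed = IndexedRDD(rdd).cache()

    // put() returns a new IndexedRDD that reuses the unmodified partitions;
    // get() is a point lookup against the per-partition index.
    val updated = indexed.put(1234L, 10873).cache()
    updated.get(1234L)  // => Some(10873)
    indexed.get(1234L)  // => Some(0)

This gives the memcached-style get()/set() semantics asked about below,
while staying within Spark's immutable-RDD model.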

On Tue, Feb 10, 2015 at 2:27 PM, Debasish Das <debasish.da...@gmail.com>
wrote:

> Hi Michael,
>
> I want to cache an RDD and define get() and set() operators on it,
> basically like memcached. Is it possible to build a memcached-like
> distributed cache using Spark SQL? If not, what do you suggest we use
> for such operations?
>
> Thanks.
> Deb
>
> On Fri, Jul 18, 2014 at 1:00 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> You can do INSERT INTO. As with other SQL-on-HDFS systems, there is no
>> updating of data.
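>>
>> A minimal sketch of the append path (assuming Hive support is compiled in,
>> since INSERT INTO goes through the Hive integration in Spark 1.0; the table
>> names here are hypothetical):
>>
>>   import org.apache.spark.sql.hive.HiveContext
>>
>>   val hiveContext = new HiveContext(sc)
>>   // Appends rows to the target table; there is no row-level UPDATE
>>   // on HDFS-backed tables.
>>   hiveContext.hql("INSERT INTO TABLE events SELECT * FROM staging_events")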
>> On Jul 17, 2014 1:26 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
>>
>>> Is this what you are looking for?
>>>
>>>
>>> https://spark.apache.org/docs/1.0.0/api/java/org/apache/spark/sql/parquet/InsertIntoParquetTable.html
>>>
>>> According to the doc: "Operator that acts as a sink for queries
>>> on RDDs and can be used to store the output inside a directory of Parquet
>>> files. This operator is similar to Hive's INSERT INTO TABLE operation in
>>> the sense that one can choose to either overwrite or append to a directory.
>>> Note that consecutive insertions to the same table must have compatible
>>> (source) schemas."
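>>>
>>> For illustration, a minimal sketch of exercising that operator through the
>>> 1.0-era public API (paths and table names are hypothetical, and whether
>>> the plain SQLContext parser accepts INSERT INTO should be verified for
>>> your version):
>>>
>>>   case class Event(id: Long, name: String)
>>>   val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>>>   import sqlContext.createSchemaRDD
>>>
>>>   // Write an initial batch as Parquet, then register the directory
>>>   // as a queryable table.
>>>   sc.parallelize(Seq(Event(1L, "a"))).saveAsParquetFile("/tmp/events.pqt")
>>>   sqlContext.parquetFile("/tmp/events.pqt").registerAsTable("events")
>>>
>>>   // Register a second RDD and append it into the Parquet-backed table.
>>>   sc.parallelize(Seq(Event(2L, "b"))).registerAsTable("staging")
>>>   sqlContext.sql("INSERT INTO events SELECT * FROM staging")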
>>>
>>> Thanks
>>> Best Regards
>>>
>>>
>>> On Thu, Jul 17, 2014 at 11:42 AM, Hu, Leo <leo.h...@sap.com> wrote:
>>>
>>>>  Hi
>>>>
>>>>    As of Spark 1.0, can we insert into and update a table with Spark SQL,
>>>> and how?
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Best Regards
>>>>
>>>
>>>
>
