You'll probably have to install Cassandra separately.
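
For what it's worth, the lookup pattern suggested further down the thread (keep the big table in an external keyed store and fetch only the keys the small table needs) can be sketched without Spark or Cassandra at all. The snippet below is only an illustration under loud assumptions: Python's built-in sqlite3 stands in for the external store, and the table name `lookup` and the helpers `build_lookup` and `join_small_table` are hypothetical names made up for this example, not part of IndexedRDD or the Cassandra connector.

```python
import sqlite3

def build_lookup(conn, records):
    """Persist (key, value) pairs in a table indexed by its primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lookup (k INTEGER PRIMARY KEY, v TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO lookup (k, v) VALUES (?, ?)", records
    )
    conn.commit()

def join_small_table(conn, small_rows):
    """Inner-join (key, payload) rows against the store via point lookups."""
    joined = []
    for key, payload in small_rows:
        row = conn.execute(
            "SELECT v FROM lookup WHERE k = ?", (key,)
        ).fetchone()
        if row is not None:  # skip keys that are missing from the lookup
            joined.append((key, payload, row[0]))
    return joined

# Assumption: an in-memory database keeps the sketch self-contained;
# a real store would live on disk or on a separate cluster.
conn = sqlite3.connect(":memory:")
build_lookup(conn, ((i, f"value-{i}") for i in range(100_000)))

# The "small table": only these keys are ever fetched from the store.
result = join_small_table(conn, [(5, "a"), (42, "b"), (999_999, "c")])
print(result)
```

The point is that the cost of the join scales with the size of the small table rather than with the size of the lookup, since each row triggers one indexed read. Whether that beats a persisted IndexedRDD at a billion records is something you would have to measure.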

On Thu, Jul 16, 2015 at 2:29 PM Jem Tucker <jem.tuc...@gmail.com> wrote:

> Hi Vetle,
>
> As far as I am aware, IndexedRDD is persisted in the same way as any other
> RDD. Do you know whether Cassandra can be embedded in my application, or
> does it have to be a standalone database that is installed separately?
>
> Thanks,
>
> Jem
>
> On Thu, Jul 16, 2015 at 12:59 PM Vetle Leinonen-Roeim <ve...@roeim.net>
> wrote:
>
>> Hi,
>>
>> I'm not sure how IndexedRDD is persisted, but you may be better off using
>> a NoSQL database for lookups (for example Cassandra, with the Cassandra
>> connector). That should give you good lookup performance, although
>> persisting a billion records sounds like something that will take some
>> time in any case.
>>
>> Regards,
>> Vetle
>>
>>
>> On Thu, Jul 16, 2015 at 10:02 AM Jem Tucker <jem.tuc...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have been using IndexedRDD as a large lookup (1 billion records) to
>>> join with small tables (1 million rows). The performance of IndexedRDD is
>>> great until it has to be persisted to disk. Are there any alternatives to
>>> IndexedRDD, or any changes to how I use it, that would improve performance
>>> with big data volumes?
>>>
>>> Kindest Regards,
>>>
>>> Jem
>>>
>>
