Bare-metal servers with 2 dedicated clusters (Spark and Cassandra) versus
1 cluster with colocation. In both cases a 10 Gbps dedicated network.

On Sat, 14 Apr 2018 at 23:17, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Thanks Vincent. You mean a 20 times improvement with data being local as
> opposed to Spark running on compute nodes?
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 14 April 2018 at 21:06, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Not with Hadoop, but with Cassandra I have seen a 20x improvement from
>> data locality on partition-optimized Spark jobs.
>>
>> On Sat, 14 Apr 2018 at 21:17, Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> This is a sort of "your mileage may vary" type of question.
>>>
>>> In a classic Hadoop cluster, one has data locality when each node
>>> holds both the Spark libraries and the HDFS data. This helps certain
>>> queries, such as interactive BI.
>>>
>>> However, running Spark over remote storage, say Isilon scale-out NAS,
>>> instead of local HDFS becomes problematic. The full scan that Spark
>>> needs to do will take much longer when it is done over the network
>>> (accessing the remote Isilon storage) than with local I/O requests to
>>> HDFS.
>>>
>>> Has anyone done some comparative studies on this?
>>>
>>>
>>> Thanks
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>
>
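To put rough numbers on the local versus remote full-scan question raised in
this thread, here is a minimal back-of-the-envelope sketch in Python. Only the
10 Gbps network figure comes from the thread. The node count, dataset size,
storage throughput and local disk throughput are hypothetical placeholders,
and the function names are made up for illustration. It models a pure
sequential scan only and is not meant to reproduce the 20x figure reported
above.

```python
def scan_seconds(data_gb: float, throughput_gbps: float) -> float:
    """Seconds to read data_gb gigabytes at throughput_gbps gigabits per second."""
    if throughput_gbps <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 8.0 / throughput_gbps


def remote_scan_seconds(data_gb: float, nodes: int,
                        nic_gbps: float, storage_gbps: float) -> float:
    """Full scan when every byte crosses the network from shared storage.

    The effective rate is capped by the smaller of the combined NIC bandwidth
    of the compute nodes and the total throughput the storage can serve.
    """
    return scan_seconds(data_gb, min(nodes * nic_gbps, storage_gbps))


def local_scan_seconds(data_gb: float, nodes: int,
                       disk_gbps_per_node: float) -> float:
    """Full scan when each node reads only its own share from local disks."""
    return scan_seconds(data_gb / nodes, disk_gbps_per_node)


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes scanning 1000 GB.
    # nic_gbps=10 is the figure from the thread; the rest are assumptions.
    remote = remote_scan_seconds(1000, nodes=10, nic_gbps=10, storage_gbps=40)
    local = local_scan_seconds(1000, nodes=10, disk_gbps_per_node=8)
    print(f"remote scan: {remote:.0f} s, local scan: {local:.0f} s")
```

With these placeholder values the shared storage, not the 10 Gbps links, is
the ceiling for the remote scan. Plugging in measured throughput for the real
storage and disks is what makes the comparison meaningful.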
