Re: Apache Ignite vs alluxio

Jörn Franke Wed, 23 Nov 2016 00:45:02 -0800

As already said, this is not really a cache use case. Besides, performance tests 
on a single node simply do not make sense for a distributed system. 
Maybe you can describe your real use case in more detail so that we can help you. 
There are many areas you can tune, and caching is only one possibility.
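
To make the overhead argument in the quoted message below concrete, here is a 
minimal toy cost model in Python. It is only a sketch: the function name 
`total_read_time` and all cost figures are hypothetical illustrations, not 
measurements of Ignite or Alluxio. It assumes the cache is filled by one full 
read from the backend and that every pass of the job then reads from the cache.

```python
def total_read_time(passes, backend_cost, cache_cost, use_cache):
    """Toy model: time to read one dataset `passes` times.

    backend_cost and cache_cost are hypothetical per-pass read costs
    in arbitrary time units, not measured values.
    """
    if not use_cache:
        # Every pass goes straight to the backend storage.
        return passes * backend_cost
    # Assumption: the data is loaded from the backend once to fill the
    # cache, then each pass (including the first) reads from the cache.
    return backend_cost + passes * cache_cost


if __name__ == "__main__":
    # Single pass (e.g. a line count) versus an iterative job with 10 passes.
    for passes in (1, 10):
        direct = total_read_time(passes, 10.0, 1.0, use_cache=False)
        cached = total_read_time(passes, 10.0, 1.0, use_cache=True)
        print(f"passes={passes}: direct={direct}, cached={cached}")
```

Under these assumed costs a single pass is slower with the cache than without 
it, because the load overhead is never paid back, while a job that reads the 
same data many times comes out ahead.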


> On 23 Nov 2016, at 09:32, vincent gromakowski <[email protected]> 
> wrote:
> 
> Hi,
> As with any intermediate layer, Alluxio and Ignite add overhead: the time to 
> load backend data into the cache and then read it from the cache. So there are 
> use cases that are not suited to caching, and I think your very simple job 
> would not benefit from a cache, whichever one you use. In my view there are 
> three situations where a cache is useful:
> - sharing data between jobs without writing to backends
> - storing Spark checkpoints and cached data, which is only valuable for 
> iterative jobs (not a line count)
> - accelerating low-performance storage, for instance S3 or HDFS on 
> shared storage infrastructure (as we often see in clouds with poor 
> performance), or even multi-DC storage, to avoid slow synchronous I/O
> 
> You aren't in any of these situations, so you won't see any performance gain. 
> Additionally, a cache often comes with constraints. In Ignite's case your dataset 
> must fit in memory, which for me is a pretty heavy constraint when optimising 
> resources in multi-tenant infrastructures. Alluxio may have other constraints.
> 
> 
> 2016-11-23 8:06 GMT+01:00 Kaiming Wan <[email protected]>:
>> Hi,
>> 
>>     I have been using Alluxio for about two months. However, it doesn't
>> satisfy our requirements. We want to use Alluxio as a memory cache layer
>> that integrates easily with Hadoop or Spark to accelerate our Spark or
>> MapReduce jobs. Unfortunately, it doesn't work. I have run many tests on
>> Alluxio and found it hard to use it to accelerate our Spark or
>> MapReduce jobs. The tested job is only a line count job on a single node.
>> 
>>     I think the main cause is that Alluxio is not fully integrated with
>> Hadoop or Spark. I have talked many times with the Alluxio developers, but they
>> still can't give me a solution to my problem. I think Alluxio does not
>> fit our scenario.
>> 
>> 
>>     Can Ignite be used as a memory cache layer that integrates easily
>> with Hadoop or Spark to accelerate a single local Spark or MapReduce job?
>> 
>> 
>>     I will run some tests with Ignite. I have installed Hadoop and Spark on a
>> single node, and I will run a line count job twice, once with Ignite and once
>> without, to check whether performance is much better when using Ignite. I hope
>> to learn more about the differences between Ignite and Alluxio. Thank you~
>> 
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-ignite-users.70518.x6.nabble.com/Apache-Ignite-vs-alluxio-tp9140.html
>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>
