Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-12 Thread vkulichenko
Correct.

-Val



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Re-Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8169p8256.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-12 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks Val, this is really helpful for us.

This is what you are talking about, right?
https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/datagrid/CacheEntryProcessorExample.java

So in my case, I could set KEYS_SET to whatever I want, then call
getValue() for each entry in a particular KEYS_SET to get the row object. Then,
as long as the schema of the row is preserved, I could return only part of it,
right?





Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-12 Thread vkulichenko
Tracy,

You can use an entry processor [1] for this. It sends a closure to the node
where the entry is stored and executes it there atomically. The example in the
documentation does the update within the processor and returns null, but you
can do it the other way around: call getValue(), extract the necessary
field(s) and return them. The object you return from the processor will be
returned from the invoke() method on the client.

-Val
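
To make the idea in the message above concrete, here is a minimal plain-Java
sketch of the pattern. It does not use Ignite: EntryProcessorSketch, Row and
OwnerNode are hypothetical names, and the invoke() here only imitates the
behaviour Val describes (the function runs where the value is stored, and
only its result comes back).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of the entry-processor idea. Every type here is
// hypothetical and stands in for the cache; none of it is Ignite API.
final class EntryProcessorSketch {

    /** Hypothetical row with three columns. */
    static final class Row {
        final String name;
        final int age;
        final String city;

        Row(String name, int age, String city) {
            this.name = name;
            this.age = age;
            this.city = city;
        }
    }

    /** Hypothetical stand-in for the node that stores the entries. */
    static final class OwnerNode<K, V> {
        private final Map<K, V> entries = new HashMap<>();

        synchronized void put(K key, V value) {
            entries.put(key, value);
        }

        /**
         * Runs the processor against the stored value while holding the
         * node's lock and returns only what the processor returns.
         * Returns null when the key is absent.
         */
        synchronized <T> T invoke(K key, Function<V, T> processor) {
            V value = entries.get(key);
            return value == null ? null : processor.apply(value);
        }
    }

    public static void main(String[] args) {
        OwnerNode<Integer, Row> node = new OwnerNode<>();
        node.put(1, new Row("Ann", 41, "Oslo"));

        // The processor reads the value and extracts one field; only that
        // field is handed back to the caller.
        String name = node.invoke(1, row -> row.name);
        System.out.println(name); // prints "Ann"
    }
}
```

In a real cluster the point of the pattern is that the full row never leaves
the node that owns it; the sketch only shows the shape of the call.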



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Re-Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8169p8251.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-12 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks Val.

Another crucial question for us: we are now storing a native Spark DataFrame
in a format of . We can call get(key) to fetch certain records, but is it
possible to fetch only certain columns without creating a querySqlField?

Tracy





Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-12 Thread vkulichenko
Hi,

Quorum consistency is not supported. It's either PRIMARY_SYNC, which means
that only the primary node is updated synchronously and all backups are
updated asynchronously, or FULL_SYNC, where all nodes, including all backups,
are updated synchronously.

-Val
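
The difference between the two modes can be sketched as the number of copies
a write waits for. This is a plain-Java illustration, not Ignite code:
WriteSyncSketch, SyncMode and both methods are hypothetical names, and
quorumAcks() is included only to show what a majority rule would look like,
since the message above says quorum is not supported.

```java
// Plain-Java model of the two modes described above. The enum mirrors the
// names in the message; it is not Ignite's own type.
final class WriteSyncSketch {

    enum SyncMode { PRIMARY_SYNC, FULL_SYNC }

    /**
     * Number of copies a write waits for before returning to the caller.
     * backups is the number of backup copies configured for the cache.
     */
    static int synchronousAcks(SyncMode mode, int backups) {
        if (backups < 0) {
            throw new IllegalArgumentException("backups must be >= 0");
        }
        switch (mode) {
            case PRIMARY_SYNC:
                return 1;            // primary only; backups follow asynchronously
            case FULL_SYNC:
                return 1 + backups;  // primary plus every backup
            default:
                throw new AssertionError(mode);
        }
    }

    /** Hypothetical majority rule, shown for contrast only. */
    static int quorumAcks(int backups) {
        int copies = 1 + backups;
        return copies / 2 + 1;
    }

    public static void main(String[] args) {
        int backups = 2;
        System.out.println(synchronousAcks(SyncMode.PRIMARY_SYNC, backups)); // 1
        System.out.println(synchronousAcks(SyncMode.FULL_SYNC, backups));    // 3
        System.out.println(quorumAcks(backups));                             // 2
    }
}
```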



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Re-Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8169p8240.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-11 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks Alexey. Maybe in the future. Right now it's hard. Also, from a
consistency point of view, I saw there is FULL_SYNC for strong consistency, but
that might weaken write availability. Does Ignite support quorum consistency?





Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-11 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks Vlad. So I define a custom class for my use case (basically a logical
view of the table and columns), like this:
https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/model/Person.java

Then I create an Ignite cache with type  and populate it with the constructed
Person objects. After that I could directly use something like
"select * from person where age > 33" to get a subset of it, right?

The issue is that my original data format is a DataFrame. If I do it this way,
does that mean I have to parse the DataFrame manually and use the underlying
data to construct Person objects? Is this the only way to enable subset/filter
pushdown? Or could I use the Spark JDBC API to do that mapping for me directly:
JdbcUtils.saveTable(personDF, url, table, props)?

From: user@ignite.apache.org 
Subject: Re: Is it possible to enable both REPLICATED and PARTITIONED?

Hi Tracy,

Ignite supports SqlFieldsQuery for this purpose [1].
With the default (binary) marshaller, SQL will use only the needed fields
during evaluation.

[1]: https://apacheignite.readme.io/docs/sql-queries#fields-queries
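
As a rough illustration of what a fields query returns, here is a plain-Java
sketch that models a query such as "select name from Person where age > ?".
It does not call Ignite: FieldsQuerySketch, its Person class and
namesOlderThan() are hypothetical, and the only point is the result shape
(one list per row, holding just the selected column).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of a fields query: filter rows, then keep one column.
// Nothing here is Ignite API; all names are made up for the example.
final class FieldsQuerySketch {

    /** Hypothetical value class with three columns. */
    static final class Person {
        final String name;
        final int age;
        final String city;

        Person(String name, int age, String city) {
            this.name = name;
            this.age = age;
            this.city = city;
        }
    }

    /** Returns one single-column row (the name) per person older than minAge. */
    static List<List<Object>> namesOlderThan(List<Person> people, int minAge) {
        List<List<Object>> rows = new ArrayList<>();
        for (Person p : people) {
            if (p.age > minAge) {
                rows.add(Arrays.<Object>asList(p.name));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Ann", 41, "Oslo"),
            new Person("Bob", 25, "Rome"));

        // Only the name column of the matching row is returned.
        System.out.println(namesOlderThan(people, 33)); // prints [[Ann]]
    }
}
```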




Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-11 Thread Alexey Kuznetsov
Hi Tracy,

In addition to Vlad's answer about SQL fields queries, I would like to mention
that maybe you can implement your functionality directly in Ignite, without
Spark?
Ignite has a lot of features: in-memory key-value storage, SQL and scan
queries, and a compute engine for map-reduce (and much more).
See: https://ignite.apache.org/features.html


-- 
Alexey Kuznetsov


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-10 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks for this clear explanation, Alexey. Basically I want to use Ignite as a
shared in-memory layer among multiple Spark server instances. I also have
another question: does the Ignite cache support predicate pushdown or a logical
view of a cache? For example, I want only certain columns of the value instead
of returning the entire value. How do I do that?





Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-10 Thread Alexey Kuznetsov
Tracy,

First of all, the cache mode and the number of backups can be set only once,
at cache start.
So if you know the size of your cluster, you can set the number of backups
before the cache starts.
But I think it is not reasonable to set the number of backups equal to the
number of nodes.
If you need 100% high availability, just use a replicated cache. But I would
recommend thinking about how many nodes can be lost at once.
Maybe it is reasonable to set backups = 2? The more backups you choose, the
more memory will be consumed by backup partitions, and the more time the grid
will spend rebalancing data.
What is your use case?
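
The trade-off above can be put into numbers with a small plain-Java sketch.
BackupCostSketch and its methods are hypothetical and not part of Ignite; the
sketch assumes every copy of an entry lives on a different node, so the
number of simultaneous node losses that cannot lose data equals the number of
backups (capped by the cluster size).

```java
// Back-of-the-envelope model of the backups trade-off. All names are
// hypothetical; this is arithmetic, not Ignite API.
final class BackupCostSketch {

    /** Copies of each entry kept in the cluster: one primary plus the backups. */
    static int copiesPerEntry(int backups) {
        return 1 + backups;
    }

    /** Total memory used across the cluster for a data set of the given size. */
    static long totalBytes(long dataBytes, int backups) {
        return dataBytes * copiesPerEntry(backups);
    }

    /**
     * Nodes that can be lost at once without losing any entry, assuming
     * each copy sits on a different node.
     */
    static int tolerableNodeLosses(int backups, int nodes) {
        return Math.min(backups, nodes - 1);
    }

    public static void main(String[] args) {
        long dataBytes = 100L * 1024 * 1024 * 1024; // 100 GiB of primary data
        int nodes = 5;

        for (int backups = 0; backups <= 2; backups++) {
            System.out.println("backups=" + backups
                + " totalGiB=" + totalBytes(dataBytes, backups) / (1024L * 1024 * 1024)
                + " tolerableLosses=" + tolerableNodeLosses(backups, nodes));
        }
    }
}
```

With 100 GiB of data, going from 0 to 2 backups triples the memory footprint
while allowing two nodes to fail at once, which is the balance the message
asks you to weigh.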



-- 
Alexey Kuznetsov


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-10 Thread Tracy Liang (BLOOMBERG/ 731 LEX)
Thanks. And PARTITIONED mode can have any number of backups, right? I want
backups for high availability, and my dataset is also large. I guess I will use
PARTITIONED mode and configure the number of backups based on actual needs in
that case, right?





Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-10 Thread Sergey Kozlov
Hi, Tracyl.

Cache mode (REPLICATED or PARTITIONED) is a cache configuration property,
and you can have different caches with different cache modes at the same
time on a running grid.
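
A minimal plain-Java sketch of that idea: the mode is a per-cache setting, so
two caches with different modes can coexist in the same grid. CacheModeSketch,
Mode and CacheSettings are hypothetical stand-ins rather than Ignite classes,
and the cache names are made up. The copy counts follow the description
elsewhere in this thread that a REPLICATED cache keeps a copy on every node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of per-cache mode settings. Not Ignite API; all names
// are hypothetical.
final class CacheModeSketch {

    enum Mode { REPLICATED, PARTITIONED }

    /** Hypothetical per-cache settings: a mode plus a backup count. */
    static final class CacheSettings {
        final Mode mode;
        final int backups;

        CacheSettings(Mode mode, int backups) {
            this.mode = mode;
            this.backups = backups;
        }
    }

    /** Copies of each entry held in a cluster of the given size. */
    static int copiesPerEntry(CacheSettings settings, int nodes) {
        if (settings.mode == Mode.REPLICATED) {
            return nodes; // a copy on every node
        }
        return Math.min(1 + settings.backups, nodes); // primary plus backups
    }

    public static void main(String[] args) {
        int nodes = 4;

        // Two caches with different modes living side by side.
        Map<String, CacheSettings> caches = new LinkedHashMap<>();
        caches.put("lookupCache", new CacheSettings(Mode.REPLICATED, 0));
        caches.put("factCache", new CacheSettings(Mode.PARTITIONED, 1));

        for (Map.Entry<String, CacheSettings> e : caches.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().mode
                + ", copies per entry = " + copiesPerEntry(e.getValue(), nodes));
        }
    }
}
```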



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com


Re: Is it possible to enable both REPLICATED and PARTITIONED?

2016-10-10 Thread Alexey Kuznetsov
Hi, Tracyl.

Actually, a REPLICATED cache is a PARTITIONED cache with backups on all nodes.

But why do you need this?

On Mon, Oct 10, 2016 at 10:46 AM, Tracyl  wrote:

> As subject shows.
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8167.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Alexey Kuznetsov