Re: IgniteSet implementation: changes required

2018-09-25 Thread Pavel Pereslegin
Hello Igniters. As was discussed, IgniteSet implementation was based on on-heap data duplication (setDataMap), as a result, the data was not recovered after cluster restart and in the case of large data sets, this led to a significant heap growing and gc pressure. We changed the implementation

Re: IgniteSet implementation: changes required

2018-06-27 Thread Amir Akhmedov
Yes, you are right. Thanks, Amir On Wed, Jun 27, 2018 at 1:15 PM Denis Magda wrote: > Got you. If it's about redundant data duplication in onheap region then no > any concerns from my side. > > Anyway, considering that the data structure will be interacting with the > page memory directly

Re: IgniteSet implementation: changes required

2018-06-27 Thread Denis Magda
Got you. If it's about redundant data duplication in onheap region then no any concerns from my side. Anyway, considering that the data structure will be interacting with the page memory directly then its entries can be stored in Ignite persistence automatically (if the latter is on). Does it

Re: IgniteSet implementation: changes required

2018-06-26 Thread Amir Akhmedov
I also think it will better to remove setDataMap support cause 1. It's making extra pressure on GC by keeping entries on heap 2. It has difficult logic to support with lots of nuances 3. To maintain setDataMap today GridCacheMapEntry calls cctx.dataStructures().onEntryUpdated() on each entry

Re: IgniteSet implementation: changes required

2018-06-26 Thread Anton Vinogradov
Denis, I think that better case is to remove onheap optimisation/duplication. This brings no drop to frequently used operations (put/remove), but even will make it slightly faster. The only one question we have here is "is it possible to restore onheap map in easy way?". Seems that answer is no,

Re: IgniteSet implementation: changes required

2018-06-26 Thread Denis Magda
Anton, Will it be possible to reuse such a functionality for the rest of data structures? I would invest our time in this if all data structures would be able to work with Ignite persistence this way. -- Denis On Tue, Jun 26, 2018 at 1:53 AM Anton Vinogradov wrote: > >> Why don't we read data

Re: IgniteSet implementation: changes required

2018-06-26 Thread Anton Vinogradov
>> Why don't we read data straight from the persistence layer warming RAM up >> in the background? Because it's not a trivial task to finish such loading on unstable topology. That's possible, ofcourse, but solution and complexity will be almost equals to WAL enable/disable. пн, 25 июн. 2018 г. в

Re: IgniteSet implementation: changes required

2018-06-25 Thread Denis Magda
Folks, Why don't we read data straight from the persistence layer warming RAM up in the background? (like we do for SQL and other APIs). If it's a question of time, then I would suggest us not to hurry up and do it in a right way. -- Denis On Mon, Jun 25, 2018 at 6:20 AM Anton Vinogradov

Re: IgniteSet implementation: changes required

2018-06-25 Thread Anton Vinogradov
+1 to removal in case there is no easy, fast and consistent way to restore setDataMap on node restart. I see that we'll gain some performance drop on size() or keys(), but these methods are rarely used. пн, 25 июн. 2018 г. в 16:07, Pavel Pereslegin : > Hello, Igniters. > > I tried to implement

Re: IgniteSet implementation: changes required

2018-06-25 Thread Pavel Pereslegin
Hello, Igniters. I tried to implement IgniteSet data recovery when persistence enabled [1] using trivial cache scanning, however I cannot find optimal way to do that because of the following reasons: - Performing operations on IgniteSet requires completion of data loading (restoring of

Re: IgniteSet implementation: changes required

2018-03-16 Thread Andrey Kuznetsov
Thanks, Dmitry. I agree ultimately, DS API uniformity is a weighty reason. 2018-03-17 3:54 GMT+03:00 Dmitriy Setrakyan : > On Fri, Mar 16, 2018 at 7:39 AM, Andrey Kuznetsov > wrote: > > > Dmitry, your way allows to reuse existing {{Ignite.set()}} API to

Re: IgniteSet implementation: changes required

2018-03-16 Thread Dmitriy Setrakyan
On Fri, Mar 16, 2018 at 7:39 AM, Andrey Kuznetsov wrote: > Dmitry, your way allows to reuse existing {{Ignite.set()}} API to create > both set flavors. We can adopt it unless somebody in the community objects. > Personally, I like {{IgniteCache.asSet()}} approach proposed by

Re: IgniteSet implementation: changes required

2018-03-16 Thread Andrey Kuznetsov
Dmitry, your way allows to reuse existing {{Ignite.set()}} API to create both set flavors. We can adopt it unless somebody in the community objects. Personally, I like {{IgniteCache.asSet()}} approach proposed by Vladimir O. more, since it emphasizes the difference between sets being created, but

Re: IgniteSet implementation: changes required

2018-03-15 Thread Dmitriy Setrakyan
On Thu, Mar 15, 2018 at 12:24 AM, Andrey Kuznetsov wrote: > Dmitriy, > > It's technically possible to produce both kinds of sets with > {{Ignite.set()}} call, but this will require to one more argument ('small' > vs 'large'). Doesn't it look less inuitive than separate >

Re: IgniteSet implementation: changes required

2018-03-15 Thread Andrey Kuznetsov
Dmitriy, It's technically possible to produce both kinds of sets with {{Ignite.set()}} call, but this will require to one more argument ('small' vs 'large'). Doesn't it look less inuitive than separate {{IgniteCache.asSet()}} ? And of course, we don't want to leave existing implementation

Re: IgniteSet implementation: changes required

2018-03-14 Thread Dmitriy Setrakyan
I am not sure I like the "asSet()" method. We already have Ignite.set(...) method and now introducing yet another one. The new design should fix the existing implementation. We cannot keep the broken implementation around and introduce yet another one. To be consistent, we should also stick to the

Re: IgniteSet implementation: changes required

2018-03-14 Thread Andrey Kuznetsov
Hi, Dmitry. The primary goal of the ticket is to implement {{IgniteCache::asSet}} view. The rationale is introduced earlier in this talk by Vladimir O. I've also mentioned the need to document that new method properly: it should be used to create large sets. Existing {{IgniteSets}} are good to

Re: IgniteSet implementation: changes required

2018-03-14 Thread Pavel Pereslegin
Hello Igniters. I'm working on the implementation of the IgniteCache#asSet method [1] and I think it should return Set (not IgniteSet). Because IgniteSet was introduced mainly to add methods for the collocated version of IgniteSet. Any thoughts? [1]

Re: IgniteSet implementation: changes required

2018-02-27 Thread Andrey Kuznetsov
As far as I know, Pavel P. is working on fixing existing sets currently. As for {{asSet}} cache adapter, I filed the ticket [1]. [1] https://issues.apache.org/jira/browse/IGNITE-7823 2018-02-27 11:20 GMT+03:00 Vladimir Ozerov : > I think the root issue is that we are

Re: IgniteSet implementation: changes required

2018-02-27 Thread Vladimir Ozerov
I think the root issue is that we are trying to mix different cases in a single solution. What is the common usage patterns of sets? 1) Small mostly-read sets - current implementation is ideal for them - everything is available locally, on-heap and in deserialized form 2) Big data sets -

Re: IgniteSet implementation: changes required

2018-02-18 Thread Pavel Pereslegin
Hello Vladimir, > What we can do is to optionally disable on-heap caching for specific set at > the cost of lower performance if user wants so. I want to make sure that we are speaking about the same thing. By "on-heap caching" I mean custom datastructure and not standard on-heap cache, we

Re: IgniteSet implementation: changes required

2018-02-15 Thread Dmitriy Setrakyan
On Thu, Feb 15, 2018 at 6:08 AM, Vladimir Ozerov wrote: > I do not think indexes is the right approach - set do not have indexes, and > you will have to maintain additional counter for it in order to know when > to stop. > > From what I see there are two distinct problems:

Re: IgniteSet implementation: changes required

2018-02-15 Thread Vladimir Ozerov
I do not think indexes is the right approach - set do not have indexes, and you will have to maintain additional counter for it in order to know when to stop. >From what I see there are two distinct problems: 1) Broken recovery - this is just a bug which needs to be fixed. As soon as data is

Re: IgniteSet implementation: changes required

2018-02-14 Thread Pavel Pereslegin
Hello, Igniters! I agree that solution with separate caches is not acceptable for a large number of sets. So, I want to suggest one more way to implement IgniteSet that will introduce element indexes (similar to IgniteQueue). To implement this we can add head/tail indexes to IgniteSet header and

Re: IgniteSet implementation: changes required

2018-02-12 Thread Andrey Kuznetsov
Indeed, all sets, regardless of whether they collocated or not, share single cache, and also use onheap data structures irresistable to checkpointing/recovery. 2018-02-13 2:14 GMT+03:00 Dmitriy Setrakyan : > On Fri, Feb 9, 2018 at 6:26 PM, Andrey Kuznetsov

Re: IgniteSet implementation: changes required

2018-02-12 Thread Dmitriy Setrakyan
On Fri, Feb 9, 2018 at 6:26 PM, Andrey Kuznetsov wrote: > Hi all, > > Current set implementation has significant flaw: all set data are > duplicated in onheap maps on _every_ node in order to make iterator() and > size(). For me it looks like simple yet ineffective

Re: IgniteSet implementation: changes required

2018-02-09 Thread Andrey Kuznetsov
Hi all, Current set implementation has significant flaw: all set data are duplicated in onheap maps on _every_ node in order to make iterator() and size(). For me it looks like simple yet ineffective implementation. Currently, these maps are damaged by checkpointing/recovery, and we could patch

Re: IgniteSet implementation: changes required

2018-02-09 Thread Valentin Kulichenko
Pavel, I'm a bit confused. In my understanding, issue exists because we have local in-memory maps which are used as the main source of truth about which structures currently exist. During restart, we lose all this data even if data structures cache(s) are persisted. Once we fix this, issue goes

Re: IgniteSet implementation: changes required

2018-02-09 Thread Dmitriy Setrakyan
Hi Pavel, We have 2 types of data structures, collocated and non-collocated. The difference between them is that the collocated set is generally smaller and will always end up on the same node. Users generally will have many colllocated sets. On the other hand, a non-collocated set can span

Re: IgniteSet implementation: changes required

2018-02-09 Thread Pavel Pereslegin
Hello, Valentin. Thank you for the reply. As mentioned in this conversation, for now we have at least two issues with IgniteSet: 1. Incorrect behavior after recovery from PDS [1]. 2. The data in the cache is duplicated on-heap [2], which is not documented and lead to heap/GC overhead when using

Re: IgniteSet implementation: changes required

2018-02-08 Thread Pavel Pereslegin
Hello, Igniters! We have some issues with current IgniteSet implementation ([1], [2], [3], [4]). As was already described in this conversation, the main problem is that current IgniteSet implementation maintains plain Java sets on every node (see CacheDataStructuresManager.setDataMap). These

Re: IgniteSet implementation: changes required

2017-10-30 Thread Dmitriy Setrakyan
Hi Andrey, Thanks for a detailed email. I think your suggestions do make sense. Ignite cannot afford to have a distributed set that is not fail-safe. Can you please focus only on solutions that provide consistent behavior in case of topology changes and failures and document them in the ticket?

IgniteSet implementation: changes required

2017-10-30 Thread Andrey Kuznetsov
Hi, Igniters! Current implementation of IgniteSet is fragile with respect to cluster recovery from a checkpoint. We have an issue (IGNITE-5553) that addresses set's size() behavior, but the problem is slightly broader. The text below is my comment from Jira issue. I encourage you to discuss it.