As part of this discussion I want to talk about non-snapshotable indexes in general.
There are a few things we can do to improve the performance of queries and
indexing.

1. We can implement a SkipList based index. It will probably not be much
faster than the current SnapTree, but GC and Optimizer will feel better.
2. We can implement a ConcurrentMap based index, as I wrote earlier. For joins
on big caches it should give us a good boost.
3. Because these indexes are not snapshotable, we can stop taking the
writeLock that is needed for taking snapshots. This writeLock stops
concurrent updates of indexes, so on mixed query/update workloads it will be
a noticeable improvement.
4. Implement an off-heap SkipList. I already did that for Hadoop Accelerator
(though it was append only), so this code can be taken as a base.
5. Implement an off-heap ConcurrentMap. Probably we can use something from our
current GridOffheapProcessor.

The benefit of this is improved performance. The drawback is that all updates
(even partial ones) will be immediately visible, and an index search can occur
while the index is in an inconsistent state, for example when on update the
old value was already removed from the index but the new one was not added
yet.

I think items 1-2 can be done in a few days, and item 3 can take a couple of
days as well. Items 4-5 are probably the toughest ones; I think we should
postpone them until we are production ready with 1-3.

Sergi

2015-09-19 17:49 GMT+03:00 Dmitriy Setrakyan <[email protected]>:

> On Sat, Sep 19, 2015 at 2:04 PM, Sergi Vladykin <[email protected]>
> wrote:
>
> > Not necessary, you can configure to have either sorted index or hash
> > index or both.
> > In the last case as far as I understand optimizer just will pick up hash
> > index for equality conditions because it will have lower cost.
> >
> > The only thing I'm currently not sure of is how to add this to
> > configuration (our indexed types config already looks like piece of
> > crap, don't want to complicate it even more).
>
> I think index type should be specified at the annotation level.
> As far as configuring query metadata in XML, I agree with you, we should
> clean up the design.
>
> > Sergi
> >
> > 2015-09-18 19:05 GMT+03:00 Dmitriy Setrakyan <[email protected]>:
> >
> > > Sergi,
> > >
> > > Does it mean that field "a" will now have 2 indexes, hash and sorted?
> > >
> > > D.
> > >
> > > On Fri, Sep 18, 2015 at 4:01 PM, Sergi Vladykin <[email protected]>
> > > wrote:
> > >
> > > > Guys,
> > > >
> > > > It seems that for simple equality queries like
> > > >
> > > > SELECT * FROM x WHERE a = ?
> > > >
> > > > it is more effective to use hash indexes which do not even need to be
> > > > snapshotable.
> > > > I think it will be easy to implement one based on ConcurrentHashMap.
> > > >
> > > > Thoughts?
> > > >
> > > > Sergi
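
For illustration only, the ConcurrentHashMap based hash index discussed in this thread could look roughly like the sketch below. This is not Ignite code. The class name HashIndexSketch and its methods add, remove, update and find are hypothetical, and the sketch assumes Java 8 or later. It maps an indexed field value to the keys of the rows holding that value, takes no index-wide lock, and its update method shows the window in which a concurrent search sees the row under neither the old nor the new value.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch (not Ignite code): non-snapshotable hash index for equality lookups. */
class HashIndexSketch<V, K> {
    /** Indexed field value -> keys of the rows that currently hold this value. */
    private final ConcurrentHashMap<V, Set<K>> map = new ConcurrentHashMap<>();

    /** Adds a row. compute() is atomic per indexed value, no index-wide lock is taken. */
    void add(V val, K key) {
        map.compute(val, (v, keys) -> {
            Set<K> res = keys != null ? keys : ConcurrentHashMap.<K>newKeySet();
            res.add(key);
            return res;
        });
    }

    /** Removes a row and drops the bucket once it becomes empty. */
    void remove(V val, K key) {
        map.computeIfPresent(val, (v, keys) -> {
            keys.remove(key);
            return keys.isEmpty() ? null : keys;
        });
    }

    /**
     * Update is remove followed by add. Between the two calls a concurrent
     * find() sees the row under neither value: this is the inconsistent
     * state that a non-snapshotable index allows.
     */
    void update(V oldVal, V newVal, K key) {
        remove(oldVal, key);
        add(newVal, key);
    }

    /** Equality lookup, e.g. for SELECT * FROM x WHERE a = ? */
    Set<K> find(V val) {
        Set<K> keys = map.get(val);
        if (keys == null)
            return Collections.emptySet();
        return Collections.unmodifiableSet(keys);
    }

    public static void main(String[] args) {
        HashIndexSketch<String, Integer> idx = new HashIndexSketch<>();
        idx.add("x", 1);
        idx.add("x", 2);
        idx.update("x", "y", 2); // row 2 moves from value "x" to value "y"
        System.out.println(idx.find("x"));
        System.out.println(idx.find("y"));
    }
}
```

Doing the add inside compute() rather than after computeIfAbsent() is a deliberate choice in this sketch: otherwise a concurrent remove could drop an emptied bucket while another thread is still adding to it, and the new entry would be lost.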
