Re: kick off a discussion

Konstantin Boudnik Thu, 14 Jul 2016 11:08:32 -0700

On Wed, Jul 13, 2016 at 05:30AM, Dmitriy Setrakyan wrote:
> Hi Alex,
> 
> I believe most of your comments have to do with disk-based functionality,
> especially in regard to backups, snapshots, etc. However, Ignite is
> currently an in-memory system, at least for the nearest future. Let me know
> if I misunderstood something.


And the nearest future is defined by....? This is a collaborative project, as
you all learned during the incubation, and statements like "X only
does bar for now" should be consensual. If there's a will to work on new
functionality that users are asking for, and that functionality is
expected to expand the applicability of the technology, I don't really see
why or how it could be put on hold.

Fortunately, there are a number of ways this development could be put through,
and it doesn't really require many moving parts (in fact it is done all
the time in the same way right now): let's put the new development on a
branch, and start moving. There's JIRA and there's the CI to help validate
and coordinate the work. Sounds like an easy decision to me.

Cos

> On Tue, Jul 12, 2016 at 9:44 PM, Alexandre Boudnik <
> [email protected]> wrote:
> 
> > Dmitriy, thank you for your time and questions, which helped me to
> > realize what I forgot to mention!
> > See my answers inline; later I'll combine everything together to help
> > future readers :)
> >
> > I put together some implementation ideas in Apache Ignite JIRA, as
> > promised: https://issues.apache.org/jira/browse/IGNITE-3457. I see
> > this facility as another CacheStore implementation, so it wouldn't
> > interfere with the basic principles of the Ignite platform.
> >
> >
> > On Mon, Jul 11, 2016 at 1:15 AM, Dmitriy Setrakyan
> > <[email protected]> wrote:
> > > My answers are inline…
> > >
> > > On Sat, Jul 9, 2016 at 3:04 AM, Dmitriy Setrakyan <[email protected]
> > >
> > > wrote:
> > >
> > >> Thanks Sasha!
> > >>
> > >> Resending to the dev list.
> > >>
> > >> D.
> > >>
> > >> On Fri, Jul 8, 2016 at 2:02 PM, Alexandre Boudnik <
> > [email protected]>
> > >> wrote:
> > >>
> > >>> Apache Ignite is a great platform, but it lacks certain capabilities
> > >>> that are common in the RDBMS world, such as:
> > >>> - Consistent online backup of data across the entire cluster (or for
> > >>> a specified set of caches)
> > >>>
> > >>
> > > I think you mean data center replication here. It is not an easy feature
> > to
> > > implement, and so far has been handled by commercial vendors of Ignite,
> > > e.g. GridGain.
> > >
> > Actually not. Right here I meant exactly what I said: a full or
> > incremental backup of all or selected caches in a consistent state, so
> > that they can be restored in case of data loss or data corruption.
> > One important use case is OLAP systems (let's say for banking) that
> > have been built on the Apache Ignite platform.
> >
> > And you are right, data center replication can be easily implemented
> > based on log/snapshot shipment.
> >
> > >
> > >> - Hierarchical snapshots for a specified set of caches
> > >>>
> > >>
> > > What do you mean by hierarchical?
> > >
> > In this particular case the notion of hierarchical snapshots is very
> > similar to the same notion used in SAN appliances or by VirtualBox or
> > VMware. Using the concept of snapshots we can do all these amazing things:
> > - full and incremental backup
> > - restore
> > - rollback to checkpoint
> > - roll forward
> > much more easily, with minimal memory and I/O overhead.
> >
> > >
> > >> - Transaction log
> > >>>
> > >>
> > > Why does Ignite need it for in-memory transactions?
> > >
> > At least it is required to provide roll-forward functionality, when
> > you restore the state of the cache from a checkpoint (the cache state
> > before the snapshot was made) and then reapply transactions one by
> > one.
> >
> > >
> > >> - Restore cluster state as of certain point in time
> > >>>
> > >>
> > > Given that such restorability may introduce lots of memory overhead, does
> > > it really make sense for an in-memory cache?
> > >
> > Actually, it will not consume any memory. It will use external memory,
> > such as HDD/SSD space instead. And yes, I think that this
> > functionality makes complete sense for our users IRL, who will love
> > it.
> >
> > >
> > >> - Rolling forward from snapshot with ability to filter/modify
> > transactions
> > >>>
> > >>
> > > Same as above
> > >
> > The same as above: my customers in the trenches are begging for that feature.
> >
> > >
> > >> - Asynchronous replication based either on log shipment or snapshot
> > >>> shipment
> > >>> -- Between clusters
> > >>>
> > >>
> > > This is the same as data center replication, no?
> > Including but not limited to that: log shipment or snapshot shipment could
> > also be used to implement the so-called "better-than-lambda-architecture"
> > for BI and OLAP, where data is replicated to a queryable data source, let's
> > say Oracle, as soon as it is produced by the OLTP system. We can use
> > an RDBMS API such as Oracle Streams (going to be discontinued - sad) or
> > GoldenGate to filter changes from logs/snapshots and then apply them.
> > That approach allows us to preserve tons of legacy reports and BI
> > dashboards.
> >
> > >
> > >
> > >> -- Continuous data export to, let’s say, an RDBMS
> > >>>
> > >>
> > > Don’t we already support it with our write-through feature to a database?
> > >
> > When write-through is used for non-local caches, it may cause data
> > corruption in the RDBMS. I opened this issue a few weeks ago:
> > https://issues.apache.org/jira/browse/IGNITE-3321
> >
> > >
> > >> It is also a necessity to reduce cold start time for huge clusters
> > >>> with strict SLAs.
> > >>>
> > >>
> > > What part are you trying to speed up here? Are you talking about loading
> > > data from databases?
> > >
> > I'm talking about the initial load from a persistent store when the
> > cluster has been cold-started (like from GridGain's Local Recoverable Store).
> >
> > >
> > >>
> > >>> I'll put some implementation ideas in JIRA later on. I believe that
> > >>> this list is far from being complete, but I want the community to
> > >>> discuss the use cases mentioned above.
> > >>>
> > >>> --Sasha
> > >>>
> > >>
> > >>
> >
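
The snapshot and roll-forward ideas discussed in the thread above (incremental
snapshots that build on a parent, restoring from a checkpoint, then reapplying
logged transactions one by one, optionally filtered) can be illustrated with a
small sketch. This is not Ignite code and does not use any Ignite API: the
names `Tx`, `Snapshot`, `apply_writes` and `roll_forward` are hypothetical, the
"cache" is a plain dictionary, and Python is used only for brevity. It is a toy
model under those assumptions, not a proposal for the actual design in
IGNITE-3457.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Tx:
    """Hypothetical log record: a sequence number plus the writes it made."""
    seq: int
    writes: Dict[str, Optional[int]]  # a value of None means "remove the key"


@dataclass
class Snapshot:
    """Hypothetical incremental snapshot: only entries changed since parent."""
    last_seq: int                      # last transaction covered by this snapshot
    delta: Dict[str, Optional[int]]
    parent: Optional["Snapshot"] = None

    def materialize(self) -> Dict[str, int]:
        """Rebuild the full state by applying deltas from the root downwards."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        state: Dict[str, int] = {}
        for snap in reversed(chain):
            apply_writes(state, snap.delta)
        return state


def apply_writes(state: Dict[str, int], writes: Dict[str, Optional[int]]) -> None:
    """Apply a set of puts/removes to the in-memory state."""
    for key, value in writes.items():
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value


def roll_forward(
    snapshot: Snapshot,
    log: Iterable[Tx],
    until_seq: Optional[int] = None,
    tx_filter: Callable[[Tx], bool] = lambda tx: True,
) -> Dict[str, int]:
    """Restore from a snapshot, then reapply later transactions in order.

    Transactions already covered by the snapshot are skipped; replay stops
    after until_seq (if given); tx_filter can drop unwanted transactions.
    """
    state = snapshot.materialize()
    for tx in sorted(log, key=lambda t: t.seq):
        if tx.seq <= snapshot.last_seq:
            continue
        if until_seq is not None and tx.seq > until_seq:
            break
        if tx_filter(tx):
            apply_writes(state, tx.writes)
    return state


if __name__ == "__main__":
    base = Snapshot(last_seq=0, delta={"a": 1, "b": 2})
    incremental = Snapshot(last_seq=2, delta={"b": 20, "c": 3}, parent=base)
    log = [Tx(1, {"b": 20}), Tx(2, {"c": 3}), Tx(3, {"a": None}), Tx(4, {"c": 30})]
    print(roll_forward(incremental, log))               # replay everything
    print(roll_forward(incremental, log, until_seq=3))  # point-in-time restore
```

In this model a point-in-time restore is just a roll-forward with `until_seq`
set, and skipping a bad transaction is a roll-forward with a `tx_filter`. A
real implementation would also have to deal with consistency across nodes and
with on-disk formats, which the sketch ignores.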
