Hello!

Can we also bypass the WAL automatically in such a mode?

However, we will definitely need a 'normal' mode of DataStreamer operation,
for people who use DataStreamer with custom stream transformers on existing
data that is in use.
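
To make the distinction concrete, here is a minimal plain-Java sketch of the
two modes. It is not Ignite code: LoadModes, Table, bulkLoad and stream are
hypothetical names, and a HashMap plus a TreeMap merely stand in for the row
storage and a secondary index. It only illustrates the four steps from the
quoted message below next to a per-entry path where a transformer sees the
existing value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.BiFunction;

// Hypothetical model, not an Ignite API.
final class LoadModes {
    static final class Table {
        final Map<Integer, String> rows = new HashMap<>();               // primary storage
        final TreeMap<String, Set<Integer>> index = new TreeMap<>();     // value -> keys
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // "Direct load" mode: lock, write rows without index upkeep, rebuild, unlock.
        void bulkLoad(Map<Integer, String> batch) {
            lock.writeLock().lock();              // 1) obtain exclusive table lock
            try {
                rows.putAll(batch);               // 2) load data, skipping the index
                rebuildIndex();                   // 3) re-build indexes from scratch
            } finally {
                lock.writeLock().unlock();        // 4) release the lock
            }
        }

        // "Normal" mode: the transformer receives (existing value or null, incoming
        // value) and the index is maintained incrementally for that one entry.
        void stream(Integer key, String incoming,
                    BiFunction<String, String, String> transformer) {
            lock.writeLock().lock();              // held only for this single entry
            try {
                String old = rows.get(key);
                String updated = transformer.apply(old, incoming);
                if (old != null) {
                    unindex(key, old);
                }
                rows.put(key, updated);
                index.computeIfAbsent(updated, v -> new TreeSet<>()).add(key);
            } finally {
                lock.writeLock().unlock();
            }
        }

        private void rebuildIndex() {
            index.clear();
            for (Map.Entry<Integer, String> e : rows.entrySet()) {
                index.computeIfAbsent(e.getValue(), v -> new TreeSet<>()).add(e.getKey());
            }
        }

        private void unindex(Integer key, String value) {
            Set<Integer> keys = index.get(value);
            if (keys != null) {
                keys.remove(key);
                if (keys.isEmpty()) {
                    index.remove(value);
                }
            }
        }
    }
}
```

In this model, bulkLoad blocks every other user of the table for the whole
batch, while stream only touches one entry at a time and can combine the
incoming value with data that is already there.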

Regards,

-- 
Ilya Kasnacheev

2018-07-14 12:33 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

> Igniters,
>
> Denis is right - please pay attention to IEP-22, as this is how we are
> going to load data into the grid in the future. Note that the current data
> streamer internals are not efficient enough, primarily because the streamer
> has to interact with page memory, free lists and various B-trees in the
> regular manner. I think that when IEP-22 is implemented, it will be tightly
> integrated with the data streamer, and the default way to load data will be:
> 1) Obtain exclusive table lock
> 2) Load data bypassing almost all Ignite internals
> 3) Re-build indexes
> 4) Release the lock
>
> Normally all types of data load should obey transactional semantics if MVCC
> is enabled, and we should think separately about how to do that for the
> continuous-streaming case.
>
> For now let's focus on the immediate goal for the MVCC release - the data
> streamer should work, and no new abstractions or APIs should be introduced.
> The easiest way to do this is to agree that the streamer is not
> transactional and to use a special version, as Igor proposed. In future
> releases, when IEP-22 is implemented, it will become transactional with the
> help of an exclusive table lock. In more distant releases we will think
> about separate optimizations for continuous streaming and possibly other
> cases.
>
> Makes sense?
>
> Vladimir.
>
>
> On Fri, Jul 13, 2018 at 11:30 PM Denis Magda <dma...@apache.org> wrote:
>
> > Agree that initial loading and real-time streaming should be seen as
> > different use cases.
> >
> > For the loading part, I would borrow ideas from the direct data load IEP [1].
> > Ignite should assume that no app works with the cluster until it's
> > preloaded. So, no global locks or things like that. Just fasten your seat
> > belt and feed data to your nodes.
> >
> > For the streaming part, I would consider 2 or 3 proposed by Igor.
> >
> > --
> > Denis
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load
> >
> > On Fri, Jul 13, 2018 at 10:03 AM Seliverstov Igor <gvvinbl...@gmail.com>
> > wrote:
> >
> > > Ivan,
> > >
> > > Anyway, DataStreamer is the fastest way to deliver data to a data node;
> > > the question is how to apply it correctly.
> > >
> > > I don't think we need one more tool that is 90% the same as
> > > DataStreamer.
> > >
> > > All we need is just to implement a couple of new stream receivers.
> > >
> > > Regards,
> > > Igor
> > >
> > > > On 13 July 2018, at 9:56, Павлухин Иван <vololo...@gmail.com> wrote:
> > > >
> > > > Hi Igniters,
> > > >
> > > > I had a look into IgniteDataStreamer. As far as I understand, it
> > > > currently just works incorrectly for MVCC tables. This appears to be
> > > > a blocker for releasing MVCC. The simplest thing is to refuse to
> > > > create a streamer for MVCC tables.
> > > >
> > > > The next step could be to split up the related use cases. To me,
> > > > initial load and continuous streaming look like quite different cases,
> > > > and it is better to keep them separate, at least at the API level.
> > > > Perhaps it is better to separate the API based on user experience. For
> > > > example, DataStreamer could be considered a tool without surprises
> > > > (which means it always leaves data consistent and is transactional).
> > > > And, let's say, BulkLoader is a beast for the fastest data loading but
> > > > full of surprises. Such surprises could be locking tables, rolling
> > > > back user transactions and so on. So, it is of very limited use (like
> > > > initial load). Keeping API entities separate looks better to me than
> > > > introducing multiple modes, because separate entities are easier to
> > > > understand and so less prone to user mistakes.
> > > >
> > > > --
> > > > Best regards,
> > > > Ivan Pavlukhin
> > >
> > >
> >
>
