Re: IEP-22: Direct Data Load proposal
Dima,

By "out of question" I meant that 3rd party persistence should work out of the box when IEP-22 is ready. No changes should be required there.

As for persistence vs. memory, most probably yes, there might be some differences. Specifically, when data load starts and persistence is enabled, we will bypass free lists and write data to new blocks. As a result, the loaded data will need more pages than when loaded in normal mode. This is the kind of trade-off you face when loading speed is important (at the very least Oracle works this way, and most probably other vendors do the same). But this approach may not be applicable to in-memory mode, where the total number of pages is limited and we do not want to hit page eviction.

To summarize: some optimizations that are applicable to persistent mode will not be applicable to in-memory mode.

Vladimir.

On Thu, Aug 16, 2018 at 11:41 AM Dmitriy Setrakyan wrote:

> On Thu, Aug 16, 2018 at 1:24 AM, Vladimir Ozerov wrote:
>
> > Hi Denis,
> >
> > This IEP is mostly about how we work with our own indexes and pages. So 3rd
> > party DB is out of question.
>
> Why? I think 3rd party DB will be supported automatically with CacheStore.
> However, do we need to do something different for memory-only vs.
> memory+disk?
>
> D.
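The page-count trade-off described in the reply above (bypassing free lists means loaded data occupies more pages) can be illustrated with a toy model. This is a hypothetical sketch, not Ignite code: `PAGE_SIZE`, both function names, and the first-fit placement policy are assumptions made only for illustration, and rows are assumed to fit within a single page.

```python
PAGE_SIZE = 4096  # bytes per page; an illustrative value, not an Ignite setting


def pages_with_free_list(existing_free, row_sizes):
    """Normal mode (toy model): reuse free space in existing pages first-fit,
    allocating a new page only when no page has room.
    Returns the number of newly allocated pages."""
    free = list(existing_free)  # free bytes remaining in each known page
    new_pages = 0
    for size in row_sizes:
        for i, space in enumerate(free):
            if space >= size:
                free[i] = space - size
                break
        else:
            # No page can hold the row: allocate a fresh one and track it.
            free.append(PAGE_SIZE - size)
            new_pages += 1
    return new_pages


def pages_direct_load(row_sizes):
    """Direct-load mode (toy model): ignore existing free space and
    append rows sequentially to fresh pages.
    Returns the number of newly allocated pages."""
    new_pages = 0
    remaining = 0  # free bytes left in the page currently being filled
    for size in row_sizes:
        if size > remaining:
            new_pages += 1
            remaining = PAGE_SIZE
        remaining -= size
    return new_pages


if __name__ == "__main__":
    half_empty = [2048] * 4   # four existing pages, each half empty
    rows = [1024] * 16        # sixteen 1 KiB rows to load
    print(pages_with_free_list(half_empty, rows))  # new pages in normal mode
    print(pages_direct_load(rows))                 # new pages in direct mode
```

In this model the direct path never searches for free space, which is where the speed would come from, but it allocates more new pages. That is acceptable when pages can spill to disk and problematic when the page count is capped, as in in-memory mode.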
Re: IEP-22: Direct Data Load proposal
On Thu, Aug 16, 2018 at 1:24 AM, Vladimir Ozerov wrote:

> Hi Denis,
>
> This IEP is mostly about how we work with our own indexes and pages. So 3rd
> party DB is out of question.

Why? I think 3rd party DB will be supported automatically with CacheStore. However, do we need to do something different for memory-only vs. memory+disk?

D.
Re: IEP-22: Direct Data Load proposal
Hi Denis,

This IEP is mostly about how we work with our own indexes and pages. So 3rd party DB is out of question.

On Thu, Jun 21, 2018 at 10:38 PM Denis Magda wrote:

> Vladimir,
>
> As I see from the IEP, this data loading technique is supposed to be used
> for deployments with Ignite persistence enabled. Is it possible to
> generalize this solution and use it for pure in-memory and in-memory + 3rd
> party DB scenarios?
>
> --
> Denis
Re: IEP-22: Direct Data Load proposal
Vladimir,

As I see from the IEP, this data loading technique is supposed to be used for deployments with Ignite persistence enabled. Is it possible to generalize this solution and use it for pure in-memory and in-memory + 3rd party DB scenarios?

--
Denis

On Wed, Jun 20, 2018 at 8:08 AM Vladimir Ozerov wrote:

> Igniters,
>
> Initial data load is one of the most important use cases for our product.
> This is one of the first things users try to do with Ignite. And if it takes
> too much time, it is very likely that the user will look for other solutions.
>
> We made good progress in this area recently. Specifically: a set of
> internal improvements on our indexes, streaming mode for the JDBC driver, and
> the COPY command. But our internals are still not very efficient: every single
> update goes through the whole set of Ignite components, such as page cache,
> free-lists, BTrees, etc.
>
> I created IEP-22 [1]. Its goal is to implement a special direct data load
> mode which will bypass our page cache and use an alternative algorithm for
> index updates. Together with the COPY command and streaming, this improvement
> will allow Ignite to load data at very high speed.
>
> Please review the IEP and share your comments.
>
> Vladimir.
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load
Re: IEP-22: Direct Data Load proposal
Hi Nikolay,

I do not see any problems with TDE for now.

On Wed, Jun 20, 2018 at 6:16 PM, Nikolay Izhikov wrote:

> Hello, Vladimir.
>
> Does this IEP fit with IEP-18: TDE?
>
> Do we allow the user to load data into an encrypted cache?
Re: IEP-22: Direct Data Load proposal
Vladimir,

Great IEP, but I couldn't comprehend the beginning of the "Direct Data Load" paragraph. Maybe there are some typos?

On Wed, Jun 20, 2018 at 18:08, Vladimir Ozerov wrote:

> Igniters,
>
> Initial data load is one of the most important use cases for our product.
> This is one of the first things users try to do with Ignite. And if it takes
> too much time, it is very likely that the user will look for other solutions.

Best regards,
Andrey Kuznetsov.
Re: IEP-22: Direct Data Load proposal
Hello, Vladimir.

Does this IEP fit with IEP-18: TDE?

Do we allow the user to load data into an encrypted cache?

On Wed, 20/06/2018 at 18:08 +0300, Vladimir Ozerov wrote:

> Igniters,
>
> Initial data load is one of the most important use cases for our product.
> This is one of the first things users try to do with Ignite. And if it takes
> too much time, it is very likely that the user will look for other solutions.
>
> We made good progress in this area recently. Specifically: a set of
> internal improvements on our indexes, streaming mode for the JDBC driver, and
> the COPY command. But our internals are still not very efficient: every single
> update goes through the whole set of Ignite components, such as page cache,
> free-lists, BTrees, etc.
>
> I created IEP-22 [1]. Its goal is to implement a special direct data load
> mode which will bypass our page cache and use an alternative algorithm for
> index updates. Together with the COPY command and streaming, this improvement
> will allow Ignite to load data at very high speed.
>
> Please review the IEP and share your comments.
>
> Vladimir.
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load
IEP-22: Direct Data Load proposal
Igniters,

Initial data load is one of the most important use cases for our product. It is one of the first things users try to do with Ignite, and if it takes too much time, it is very likely that they will look for other solutions.

We have made good progress in this area recently, specifically: a set of internal improvements to our indexes, streaming mode for the JDBC driver, and the COPY command. But our internals are still not very efficient: every single update goes through the whole set of Ignite components, such as the page cache, free-lists, BTrees, etc.

I created IEP-22 [1]. Its goal is to implement a special direct data load mode which will bypass our page cache and use an alternative algorithm for index updates. Together with the COPY command and streaming, this improvement will allow Ignite to load data at very high speed.

Please review the IEP and share your comments.

Vladimir.

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load