[GitHub] ignite pull request #3701: IGNITE-8052: Clear error message when using a non...
GitHub user shroman opened a pull request: https://github.com/apache/ignite/pull/3701 IGNITE-8052: Clear error message when using a non-existing column name for CREATE TABLE primary key. You can merge this pull request into a Git repository by running: $ git pull https://github.com/shroman/ignite IGNITE-8052 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3701.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3701 commit b857c11a6643aa2162a9621d52cf5e67210ec1f9 Author: shroman Date: 2018-03-27T05:26:14Z IGNITE-8052: Clear error message when using a non-existing column name for CREATE TABLE primary key. ---
[jira] [Created] (IGNITE-8052) Clear error message when using a non-existing column name for CREATE TABLE primary key
Roman Shtykh created IGNITE-8052: Summary: Clear error message when using a non-existing column name for CREATE TABLE primary key Key: IGNITE-8052 URL: https://issues.apache.org/jira/browse/IGNITE-8052 Project: Ignite Issue Type: Improvement Components: sql Reporter: Roman Shtykh Assignee: Roman Shtykh On _CREATE TABLE_ with a misspelled column name for _PRIMARY KEY_ we have the following error with assertions enabled {code:java} java.lang.AssertionError at org.apache.ignite.internal.processors.query.h2.sql.GridSqlQueryParser.parseCreateTable(GridSqlQueryParser.java:1044) at org.apache.ignite.internal.processors.query.h2.sql.GridSqlQueryParser.parse(GridSqlQueryParser.java:1647) at org.apache.ignite.internal.processors.query.h2.ddl.DdlStatementsProcessor.runDdlStatement(DdlStatementsProcessor.java:245) ... {code} and when disabled {code:java} class org.apache.ignite.internal.processors.query.IgniteSQLException: null at org.apache.ignite.internal.processors.query.h2.ddl.DdlStatementsProcessor.runDdlStatement(DdlStatementsProcessor.java:492) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.doRunPrepared(IgniteH2Indexing.java:1643) at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.querySqlFields(IgniteH2Indexing.java:1577) ... {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
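The fix direction is straightforward: instead of failing on a bare `assert` in `GridSqlQueryParser.parseCreateTable`, the parser can check each declared PRIMARY KEY column against the table's column set and raise a descriptive exception. A minimal self-contained sketch of that check (class, method, and exception names here are illustrative, not the actual Ignite code, which works on H2 AST nodes rather than string sets):

```java
import java.util.Set;

public class PkColumnCheck {
    /**
     * Returns the first PRIMARY KEY column that is not declared in the table,
     * or null if all key columns exist.
     */
    static String firstMissingPkColumn(Set<String> tableCols, Set<String> pkCols) {
        for (String pk : pkCols) {
            if (!tableCols.contains(pk))
                return pk;
        }
        return null;
    }

    /** Throws a descriptive error instead of a bare AssertionError. */
    static void validate(Set<String> tableCols, Set<String> pkCols) {
        String missing = firstMissingPkColumn(tableCols, pkCols);
        if (missing != null)
            throw new IllegalArgumentException(
                "PRIMARY KEY column is not defined: " + missing);
    }

    public static void main(String[] args) {
        // A misspelled key column ("idd") now yields a clear message.
        try {
            validate(Set.of("id", "name"), Set.of("idd"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```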
Re: What's about releasing Ignite 2.5 a bit earlier?
Good, hope that everyone interested has enough time to share an opinion. As a summary of this discussion, the community decided to release Ignite 2.5 on April 30. Most of the changes and features are already in the master. Who is ready to become a release manager of 2.5? We need to prepare a respective wiki page and outline the milestones (code freeze, QA, vote, release). -- Denis On Sat, Mar 24, 2018 at 4:38 PM, Yury Babak wrote: > Hi, > > We already implemented LSQR for linear regression and SVM (support vector > machine) algorithms. Also we implemented new distributed Datasets. > > And we want to adapt all our algorithms to this new dataset API. So from my > point of view we have enough time for those tasks. > > So I think that we could release Apache Ignite 2.5 on Apr 30. > > Regards, > Yury > > > > -- > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ >
Re: Optimize GridLongList serialization
Thanks, Alex. GridGain automatically compresses all the internal types. Somehow it looks like the GridLongList may have been missed. Can you please file a ticket for 2.5 release? D. On Mon, Mar 26, 2018 at 4:55 AM, Александр Меньшиков wrote: > I investigated network loading and found that a big part of internal data > inside messages is `GridLongList`. > It is a part of `GridDhtTxFinishRequest`, > `GridDhtAtomicDeferredUpdateResponse`, `GridDhtAtomicUpdateRequest`, > `GridNearAtomicFullUpdateRequest` and `NearCacheUpdates`. > > So I think it makes sense to optimize `GridLongList` serialization. > > > Here we serialize all elements and don't take into account the `idx` value: > > ``` > > @Override public boolean writeTo(ByteBuffer buf, MessageWriter writer) { > > writer.setBuffer(buf); > > > > if (!writer.isHeaderWritten()) { > > if (!writer.writeHeader(directType(), fieldsCount())) > > return false; > > > > writer.onHeaderWritten(); > > } > > > > switch (writer.state()) { > > case 0: > > if (!writer.writeLongArray("arr", arr)) > > return false; > > > > writer.incrementState(); > > > > case 1: > > if (!writer.writeInt("idx", idx)) > > return false; > > > > writer.incrementState(); > > > > } > > > > return true; > > } > > ``` > > > > Which is not happening in another serialization method in the same class: > > > > ``` > > public static void writeTo(DataOutput out, @Nullable GridLongList list) > throws IOException { > > out.writeInt(list != null ? list.idx : -1); > > > > if (list != null) { > > for (int i = 0; i < list.idx; i++) > > out.writeLong(list.arr[i]); > > } > > } > > ``` > > > So, we can simply reduce message size by sending only the valuable part of > the array. > If you don't mind I will create an issue in Jira for this. > > > By the way, `long` is a huge type. As I see, in most cases `GridLongList` > is used for counters. 
> And I have checked the possibility of compressing `long` into smaller types such as > `int`, `short` or `byte` in the test > `GridCacheInterceptorAtomicRebalanceTest` (picked at random). > And found out that all `long` in `GridLongList` can be cast to `int` and 70% > of them to `short`. > Such conversion is quite fast, about 1.1 ns per element (I have checked it > with a JMH test). > > > > Of course, there are a lot of ways to compress data, > but I know the proprietary GridGain plug-in has a different `MessageWriter` > implementation. > So maybe it is unnecessary and some compression already exists in this > proprietary plug-in. > Does someone know something about it? >
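The proposed optimization can be sketched with plain `ByteBuffer` code: write the used element count (`idx`) first, then only the first `idx` elements of the backing array. This is a hypothetical simplification; the real Ignite code goes through the `MessageWriter` state machine rather than raw buffers:

```java
import java.nio.ByteBuffer;

public class LongListCodec {
    /** Writes only the first 'idx' used elements, prefixed by the count. */
    static ByteBuffer write(long[] arr, int idx) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 * idx);
        buf.putInt(idx);                 // element count, like the DataOutput-based writeTo
        for (int i = 0; i < idx; i++)
            buf.putLong(arr[i]);         // unused tail of 'arr' is never sent
        buf.flip();
        return buf;
    }

    static long[] read(ByteBuffer buf) {
        int idx = buf.getInt();
        long[] arr = new long[idx];
        for (int i = 0; i < idx; i++)
            arr[i] = buf.getLong();
        return arr;
    }
}
```

With this layout a list whose backing array has capacity 1024 but only 3 used elements takes 28 bytes on the wire instead of roughly 8 KB for the full array.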
Re: Data compression design proposal
AG, I would also ask about the compression itself. How and where do we store the compression meta information? We cannot be compressing every page separately, it will not be effective. However, if we try to store the compression metadata, how do we make other nodes aware of it? Has this been discussed? D. On Mon, Mar 26, 2018 at 8:53 AM, Alexey Goncharuk < alexey.goncha...@gmail.com> wrote: > Guys, > > How does this fit the PageMemory concept? Currently it assumes that the > size of the page in memory and the size of the page on disk is the same, so > only per-entry level compression within a page makes sense. > > If you compress a whole page, how do you calculate the page offset in the > target data file? > > --AG > > 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov: > > > Gents, > > > > If I understood the idea correctly, the proposal is to compress pages on > > eviction and decompress them on read from disk. Is it correct? > > > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov wrote: > > > > > + 1 to Taras's vision. > > > > > > Compression on eviction is a good case to store more. > > > Pages in memory are always hot in a real system, so compression in memory will > > > definitely slow down the system, I think. > > > > > > Anyway, we can split the issue into "on eviction compression" and > "in-memory > > > compression". > > > > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > > > > Hi, > > > > > > > > I guess page level compression makes sense on page loading / eviction. > > > > In this case we can decrease I/O operations and a performance boost can > be > > > > reached. > > > > What is the goal of in-memory compression? Holding about 2-5x data in > memory > > > > with a performance drop? > > > > > > > > Also please clarify the case with compression/decompression for hot > and > > > > cold pages. > > > > Is it right for your approach: > > > > 1. Hot pages are always decompressed in memory because many > read/write > > > > operations touch them. > > > > 2. 
So we can compress only cold pages. > > > > > > > > So the way is suitable when the hot data size << available RAM size. > > > > > > > > Thoughts? > > > > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > > > >> Hi Igniters! > > > >> > > > >> I’d like to do the next step in our data compression discussion [1]. > > > >> > > > >> Most Igniters vote for per-data-page compression. > > > >> > > > >> I’d like to accumulate main theses to start implementation: > > > >> - page will be compressed with the dictionary-based approach (e.g. LZV) > > > >> - page will be compressed in batch mode (not on every change) > > > >> - page compression should be initiated by an event, for example, a > > > >> page’s free space drops below 20% > > > >> - compression process will be under page write lock > > > >> > > > >> Vladimir Ozerov has written: > > > >> > > > >>> What we do not understand yet: > > > 1) Granularity of compression algorithm. > > > 1.1) It could be per-entry - i.e. we compress the whole entry > > content, > > > but > > > respect boundaries between entries. E.g.: before - > > [ENTRY_1][ENTRY_2], > > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > > 1.2) Or it could be per-field - i.e. we compress fields, but > > respect > > > binary > > > object layout. First approach is simple, straightforward, and will > > > give > > > acceptable compression rate, but we will have to compress the > whole > > > binary > > > object on every field access, which may ruin our SQL performance. > > > Second > > > approach is more complex, we are not sure about its compression > > rate, > > > but > > > as BinaryObject structure is preserved, we will still have fast > > > constant-time per-field access. > > > > > > >>> I think there are advantages in both approaches and we will be able > > to > > > >> compare different approaches and algorithms after prototype > > > >> implementation. 
> > > >> > > > >> Main approach in brief: > > > >> 1) When a page’s free space drops below 20%, a compression > > event will be triggered > > > >> 2) Page will be locked by write lock > > > >> 3) Page will be passed to page’s compressor implementation > > > >> 4) Page will be replaced by compressed page > > > >> > > > >> Whole object or a field reading: > > > >> 1) If a page is marked as compressed, then the page will be handled by > > > >> the page’s compressor implementation, otherwise, it will be handled as > > > >> usual. > > > >> > > > >> Thoughts? > > > >> > > > >> Should we create a new IEP and register tickets to start > implementation? > > > >> This will allow us to watch for the feature progress and related > > > >> tasks. > > > >> > > > >> > > > >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data- > > > >> compression-in-Ignite-tc20679.html > > > >> > > > >> > > >
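The "compress on an event, keep the result only if it pays off" scheme above can be sketched with `java.util.zip.Deflater` as a stand-in for the dictionary-based algorithm under discussion (page size, threshold, and layout here are illustrative assumptions, not the proposed design):

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PageCompressor {
    /**
     * Compresses a page; returns null if compression did not shrink it,
     * in which case the page is kept uncompressed.
     */
    static byte[] compress(byte[] page) {
        Deflater def = new Deflater(Deflater.BEST_SPEED);
        def.setInput(page);
        def.finish();
        byte[] out = new byte[page.length];
        int n = def.deflate(out);
        // finished() is false if the output did not fit into one page buffer,
        // i.e. the data is effectively incompressible.
        boolean worthIt = def.finished() && n < page.length;
        def.end();
        return worthIt ? Arrays.copyOf(out, n) : null;
    }

    static byte[] decompress(byte[] compressed, int pageSize) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(compressed);
        byte[] page = new byte[pageSize];
        int n = inf.inflate(page);
        inf.end();
        assert n == pageSize;
        return page;
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] page = new byte[4096]; // a mostly-empty page compresses very well
        byte[] c = compress(page);
        System.out.println(c == null ? "kept as-is" : "compressed to " + c.length + " bytes");
    }
}
```

Under this scheme the on-disk offset question raised above remains: either compressed pages are padded back to the fixed page size (saving only I/O bandwidth), or a page-offset indirection table is needed.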
Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration
Taras, Please document the authentication part on a 2.5 version of the binary protocol page on readme.io. We need to share it with the guys who are developing Node.JS client right now. -- Denis On Mon, Mar 26, 2018 at 2:55 AM, Taras Ledkov wrote: > I have implemented a user credential store in accordance with the previous > discussion about user authentication. > > Now JDBC thin and ODBC support user authentication. We haven't > implemented it for all thin clients because protocols are not similar. > > I see two ways to implement authentication for thin client protocol: > - You implement authentication on the server side and on the .NET side; > - When java thin client [1] is merged I will implement authentication for the thin > protocol & for java thin client. > > I'll add documentation for user authentication ASAP. Please feel free to > contact if you need more info till the documentation is added. > > [1]. https://issues.apache.org/jira/browse/IGNITE-7421 > > > On 26.03.2018 9:56, Pavel Tupitsyn wrote: > >> I've started this task, and the property name combined with lack of >> javadoc >> seems confusing and misleading: >> >> * Turns out this authentication is only for thin clients >> * Not clear how to configure and use it, even after digging through Jira >> and devlist >> >> How do I write a test to ensure it works? >> >> Thanks, >> Pavel >> >> On Fri, Mar 23, 2018 at 6:44 PM, Pavel Tupitsyn >> wrote: >> >> Thanks, got it, will do. >>> >>> On Fri, Mar 23, 2018 at 4:36 PM, Dmitry Pavlov >>> wrote: >>> >>> Hi Pavel, Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436 Sincerely, Dmitriy Pavlov Fri, Mar 23, 2018 at 16:24, Pavel Tupitsyn: Please provide description in IGNITE-8034 and link Java-side ticket > there. > On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn > wrote: > > Hi Vladimir, >> >> Can you provide more details? >> * What does it do? >> * Do we need to only propagate the flag to .NET or do anything else? >> * Related ticket? 
>> >> Thanks, >> Pavel >> >> On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov < >> > voze...@gridgain.com> > wrote: >> >> Pavel, >>> >>> We introduced new flag IgniteConfiguration.authenticationEnabled >>> >> recently. > >> Would you mind adding it to IgniteConfigutation.cs [1]? >>> >>> Vladimir. >>> >>> [1] https://issues.apache.org/jira/browse/IGNITE-8034 >>> >>> >> >>> > -- > Taras Ledkov > Mail-To: tled...@gridgain.com > >
Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Ivan, It's all good then :) Thanks! -Val On Mon, Mar 26, 2018 at 1:50 AM, Ivan Rakov wrote: > Val, > > There's no sense in using WalMode.NONE in a production environment, it's > kept for testing and debugging purposes (including possible user activities > like capacity planning). > We already print a warning at node start in case WalMode.NONE is set: > > U.quietAndWarn(log,"Started write-ahead log manager in NONE mode, >> persisted data may be lost in " + >> "a case of unexpected node failure. Make sure to deactivate the >> cluster before shutdown."); >> > > Best Regards, > Ivan Rakov > > > On 24.03.2018 1:40, Valentin Kulichenko wrote: > >> Dmitry, >> >> Thanks for clarification. So it sounds like if we fix all other modes as >> we >> discuss here, NONE would be the only one allowing corruption. I also don't >> see much sense in this and I think we should clearly state this in the >> doc, >> as well print out a warning if NONE mode is used. Eventually, if it's >> confirmed that there are no reasonable use cases for it, we can deprecate >> it. >> >> -Val >> >> On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov >> wrote: >> >> Hi Val, >>> >>> NONE means that the WAL log is disabled and not written at all. Use of >>> the >>> mode is at your own risk. It is possible that restoring state after a >>> crash >>> in the middle of a checkpoint will not succeed. I do not see much sense in >>> it, especially in production. >>> >>> BACKGROUND is a fully functional WAL mode, but allows some delay before >>> the flush >>> to disk. >>> >>> Sincerely, >>> Dmitriy Pavlov >>> >>> Sat, Mar 24, 2018 at 1:07, Valentin Kulichenko < >>> valentin.kuliche...@gmail.com>: >>> >>> I agree. In my view, any possibility to get a corrupted storage is a bug which needs to be fixed. BTW, can someone explain the semantics of NONE mode? What is the difference from BACKGROUND from the user's perspective? Is there any particular use case where it can be used? 
-Val On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov wrote: Hi Ivan, > > IMO we have to add extra FSYNCS for BACKGROUND WAL. Agree? > > Sincerely, > Dmitriy Pavlov > > Fri, Mar 23, 2018 at 12:23, Ivan Rakov: > > Igniters, there's another important question about this matter. >> Do we want to add extra FSYNCS for BACKGROUND WAL mode? I think that >> > we >>> have to do it: it will cause a similar performance drop, but if we >> consider LOG_ONLY broken without these fixes, BACKGROUND is broken as >> > well. > >> Best Regards, >> Ivan Rakov >> >> On 23.03.2018 10:27, Ivan Rakov wrote: >> >>> Fixes are quite simple. >>> I expect them to be merged in master in a week in the worst case. >>> >>> Best Regards, >>> Ivan Rakov >>> >>> On 22.03.2018 17:49, Denis Magda wrote: Ivan, How quickly are you going to merge the fix into the master? Many persistence related optimizations have already stacked up. Probably, we can >>> release > >> them sooner if the community agrees. -- Denis On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov < >>> ivan.glu...@gmail.com> >>> wrote: Thanks all! > We seem to have reached a consensus on this issue. I'll just add > necessary > fsyncs under IGNITE-7754. > > Best Regards, > Ivan Rakov > > > On 22.03.2018 15:13, Ilya Lantukh wrote: > > +1 for fixing LOG_ONLY. If the current implementation doesn't >> > protect >>> from > >> data >> corruption, it doesn't make sense. >> >> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda < >> > dma...@apache.org> >>> wrote: >> >> +1 for the fix of LOG_ONLY >> >>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk < >>> alexey.goncha...@gmail.com> wrote: >>> >>> +1 for fixing LOG_ONLY to enforce corruption safety given the >>> provided >>> performance results. 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov < >>> voze...@gridgain.com > : >> >>> +1 for accepting drop in LOG_ONLY. 7% is not that much and >>> not a >>> drop at all, provided that we are fixing a bug. I.e. 
should we implement >>> it >>> correctly in the first place we would never notice any "drop". > I do not understand why someone would like to use current > broken > mode. > > On Wed, Mar 21, 2018 at 6:11 PM,
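The difference between the modes under discussion comes down to when `fsync` is issued for a WAL record. A minimal file-based illustration (conceptual only, not Ignite's WAL code): LOG_ONLY hands the record to the OS page cache and returns, so a power failure can lose it; FSYNC forces the record to the physical device before the commit completes; BACKGROUND issues the same force periodically from a separate thread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WalModesSketch {
    static void append(FileChannel wal, byte[] record, boolean fsync) throws IOException {
        wal.write(ByteBuffer.wrap(record)); // LOG_ONLY: data may sit in the OS page cache
        if (fsync)
            wal.force(false);               // FSYNC: flushed to the device before returning
        // BACKGROUND would call force(false) from a timer thread instead of here.
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("wal", ".bin");
        try (FileChannel wal = FileChannel.open(path, StandardOpenOption.WRITE)) {
            append(wal, "commit-record".getBytes(), true);
        }
        System.out.println("WAL size: " + Files.size(path));
        Files.delete(path);
    }
}
```

This is also why adding the missing fsyncs to BACKGROUND causes a performance drop similar to the LOG_ONLY fix: the cost is the `force` call itself, regardless of which thread issues it.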
Re: Different behavior when saving data from Dataframe API and RDD API
Hello, Ray. > Please advise is this behavior expected? I think this behavior is expected. Because it's more efficient to query a specific affinity key value. Anyway, I'm not an expert in the SQL engine, so I am sending your question to the dev-list. Igniters, I think this user question is related to the SQL engine, not the Data Frame integration. So can one of the SQL engine experts take a look? In the first case the SQL table will be created as follows `CREATE TABLE table_name() PRIMARY KEY a,b,c,d WITH "template=partitioned,affinitykey=a"` On Fri, 23/03/2018 at 00:48 -0700, Ray wrote: > I was trying out one of Ignite 2.4's new features - saving data from > dataframe. > But I found some inconsistency between the Dataframe API and RDD API. > > This is the code from saving dataframe to Ignite. > DF.write > .format(FORMAT_IGNITE) > .mode(SaveMode.Append) > .option(OPTION_CONFIG_FILE, CONFIG) > .option(OPTION_TABLE, "table_name") > .option(OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS, "a,b,c,d") > .option(OPTION_CREATE_TABLE_PARAMETERS, > "template=partitioned,affinitykey=a") > .option(OPTION_STREAMER_ALLOW_OVERWRITE, "true") > .save() > After the data finished saving, I ran this command to create an index on field > a. > CREATE INDEX IF NOT EXISTS idx ON table_name (a); > Then I ran this query to see if the index is working. > > explain select a from table_name where a = '303'; > PLAN SELECT > __Z0.a AS __C0_0 > FROM PUBLIC.table_name __Z0 > /* PUBLIC.AFFINITY_KEY: a = '303' */ > WHERE __Z0.a = '303' > > But when I try to query the data I inserted in the old RDD way, the result is > explain select a from table_name where a = '303'; > PLAN SELECT > __Z0.a AS __C0_0 > FROM PUBLIC.table_name __Z0 > /* PUBLIC.table_name_IDX: a = '303' */WHERE __Z0.a = '303' > > The result shows that with an affinity key, the created index is not effective. > I tried creating an index on another non-affinity-key field, and the index is > working. > Please advise: is this behavior expected? 
> > Thanks > > > > -- > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Re: Transparent Data Encryption (TDE) in Apache Ignite
Hi! > As far as I remember to be PCI-DSS compliant it is sufficient to use > encryption at file system level. But it needs to be double-checked. It > requires encrypting transmission of cardholder data across open, public > networks. Could you point me to where it requires DB data to be encrypted? PCI DSS [1] has 12 requirements, but you are asking about "Requirement 3: Protect stored cardholder data". Its description doesn't say anything about the way of protection. But there is point 3.4.1 which says that we can use any way to protect data: "If disk encryption is used (rather than file- or column-level database encryption), logical access must be managed separately and independently of native operating system authentication and access control mechanisms (for example, by not using local user account databases or general network login credentials). Decryption keys must not be associated with user accounts. Note: This requirement applies in addition to all other PCI DSS encryption and key-management requirements." > Joined node should be activated (included to baseline) by activation > request contains node-password. Agree, this is safer. > Any reason to keep data encrypted in memory? Only if we want to protect data from something like Spectre and Meltdown, but I think they are a problem of hardware, not software. > PCI DSS require this? PCI DSS requires to "protect stored data". That means we *can* encrypt data to keep it safe in memory, but we don't need it. [1] https://www.pcisecuritystandards.org/document_library?category=pcidss=pci_dss 2018-03-26 19:57 GMT+03:00 Anton Vinogradov: > Folks, > > I've checked presentation. > > 1) It's a bad idea to allow automatic node join (sending decrypted cache's > keys on join). > > Each node join should be allowed by administrator. > We have to use two-step verification in that case. > - administrator sets a keystore password for each node > - another administrator uses this password on node join. 
> > My vision is: > - each node should keep the master key in a keystore > - each keystore should have its *own* keystore password > - on cluster activation we have to specify a list of node-password pairs. > This will provide us a guarantee that only allowed nodes are in the cluster. > - TDE should be available only in case baseline is used. > Joined node should be activated (included to baseline) by an activation > request containing the node-password. > > So, on initial activation or BLT join, each node will gain the keystore > password and encrypted cache passwords. > This will guarantee data safety even in case of SSL issues. > > 2) Any reason to keep data encrypted in memory? Does PCI DSS require this? > Data should be encrypted on eviction and rebalancing, I think. > In that case we can implement TDE and data compression in the same way. > > Thoughts? > > > 2018-03-12 17:58 GMT+03:00 Denis Magda : > > > Nikolay, please try one more time. > > > > -- > > Denis > > > > On Sun, Mar 11, 2018 at 11:20 PM, Nikolay Izhikov > > wrote: > > > > > Hello, Denis. > > > > > > Did you give me the permissions? > > > Seems, I still can't create an IEP on the IGNITE Wiki. > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/Active+Proposals > > > > > > On Tue, 06/03/2018 at 08:55 +0300, Nikolay Izhikov wrote: > > > > Thank you, it's - nizhikov > > > > > > > > On Mon, 05/03/2018 at 15:09 -0800, Denis Magda wrote: > > > > > Nikolay, what's your Wiki ID? I'll grant you required permissions. > > > > > > > > > > -- > > > > > Denis > > > > > > > > > > On Sun, Mar 4, 2018 at 11:00 PM, Nikolay Izhikov < > > nizhi...@apache.org> > > > wrote: > > > > > > Hello, Denis. > > > > > > > > > > > > > I would encourage you creating an IEP > > > > > > > > > > > > That is exactly what we want to do :) > > > > > > > > > > > > But it seems I don't have sufficient privileges to do it on the Ignite > wiki. 
> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/ > > Active+Proposals > > > > > > > > > > > > Can you or someone give me such rights? > > > > > > > > > > > > On Thu, 01/03/2018 at 22:23 -0800, Denis Magda wrote: > > > > > > > Dmitriy R., Nikolay, > > > > > > > > > > > > > > Thanks for the analysis and handout of the architectural > design. > > > No doubt, > > > > > > > it would be a valuable addition to Ignite. > > > > > > > > > > > > > > I would encourage you creating an IEP on the wiki and break the > > > work into > > > > > > > pieces discussing specific part with the community. > > > > > > > > > > > > > > -- > > > > > > > Denis > > > > > > > > > > > > > > > > > > > > > On Thu, Mar 1, 2018 at 9:29 PM, Nikolay Izhikov < > > > nizhi...@apache.org> wrote: > > > > > > > > > > > > > > > Hello, Dmitriy. > > > > > > > > > > > > > > > > Thank you for the feedback! > > > > > > > > > > > > > > > > > Will it be supported? > > > > > > > > > > > > > > > > Yes. > > > > > > > > > > > > > > > > TDE shouldn't break any of the existing Ignite features. > > > > > > > > It adds some encrypt/decrypt
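The keystore/master-key scheme discussed in this thread is classic envelope encryption: a per-cache key encrypts data pages, and the master key (held in the node keystore, unlocked by the per-node password) encrypts the cache keys. A self-contained JDK-crypto sketch; the algorithm choices (AES-GCM, 12-byte IV stored with the ciphertext) are illustrative assumptions, not the proposed design:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class TdeSketch {
    static final SecureRandom RND = new SecureRandom();

    /** Encrypts with a fresh IV; the IV is stored in front of the ciphertext. */
    static byte[] encrypt(SecretKey key, byte[] plain) throws Exception {
        byte[] iv = new byte[12];
        RND.nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plain);
        byte[] out = new byte[12 + ct.length];
        System.arraycopy(iv, 0, out, 0, 12);
        System.arraycopy(ct, 0, out, 12, ct.length);
        return out;
    }

    static byte[] decrypt(SecretKey key, byte[] blob) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, blob, 0, 12));
        return c.doFinal(blob, 12, blob.length - 12);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256);
        SecretKey master = kg.generateKey();   // lives in the node keystore
        SecretKey cacheKey = kg.generateKey(); // per-cache data encryption key

        // Envelope: the cache key is persisted/transferred only wrapped by the master key.
        byte[] wrappedCacheKey = encrypt(master, cacheKey.getEncoded());

        byte[] page = "page payload".getBytes();
        byte[] encryptedPage = encrypt(cacheKey, page); // write path to PDS

        // Read path: unwrap the cache key, then decrypt the page.
        SecretKey unwrapped = new SecretKeySpec(decrypt(master, wrappedCacheKey), "AES");
        System.out.println(new String(decrypt(unwrapped, encryptedPage)));
    }
}
```

A nice property of envelope encryption is cheap key rotation: re-encrypting the small wrapped cache keys under a new master key does not require rewriting the data pages themselves.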
Re: Rebalancing - how to make it faster
>> It is impossible to disable WAL only for certain partitions without >> completely overhauling design of Ignite storage mechanism. Right now we can >> afford only to change WAL mode per cache group. Cache group rebalancing is a one-cache rebalancing, and then this cache ("cache group") can be presented as a set of virtual caches. So, there are no issues for initial rebalancing. Let's disable WAL on initial rebalancing. 2018-03-26 16:46 GMT+03:00 Ilya Lantukh: > Dmitry, > It is impossible to disable WAL only for certain partitions without > completely overhauling design of Ignite storage mechanism. Right now we can > afford only to change WAL mode per cache group. > > The idea is to disable WAL when node doesn't have any partition in OWNING > state, which means it doesn't have any consistent data and won't be able to > restore from WAL anyway. I don't see any potential use for WAL on such > node, but we can keep a configurable parameter indicating can we > automatically disable WAL in such case or not. > > On Fri, Mar 23, 2018 at 10:40 PM, Dmitry Pavlov > wrote: > > > Denis, as I understood, there is an idea to exclude only rebalanced > > partition(s) data. All other data will go to the WAL. > > > > Ilya, please correct me if I'm wrong. > > > > Fri, Mar 23, 2018 at 22:15, Denis Magda: > > > Ilya, > > > > > > That's a decent boost (5-20%) even having WAL enabled. Not sure that we > > > should stake on the WAL "off" mode here because if the whole cluster goes > > > down, then the data consistency is questionable. As an architect, I > > > wouldn't disable WAL for the sake of rebalancing; it's too risky. > > > > > > If you agree, then let's create the IEP. This way it will be easier to > > > track this endeavor. BTW, are you already ready to release any > > > optimizations in 2.5 that is being discussed in a separate thread? 
> > > > > > -- > > > Denis > > > > > > > > > > > > On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh > > > wrote: > > > > > > > Denis, > > > > > > > > > - Don't you want to aggregate the tickets under an IEP? > > > > Yes, I think so. > > > > > > > > > - Does it mean we're going to update our B+Tree implementation? Any > > > ideas > > > > how risky it is? > > > > One of tickets that I created ( > > > > https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree > > > > modification, but I am not planning to do it in the nearest future. > It > > > > shouldn't affect existing tree operations, only introduce new ones > > > (putAll, > > > > invokeAll, removeAll). > > > > > > > > > - Any chance you had a prototype that shows performance > optimizations > > > the > > > > approach you are suggesting to take? > > > > I have a prototype for simplest improvements ( > > https://issues.apache.org/ > > > > jira/browse/IGNITE-8019 & https://issues.apache.org/ > > > > jira/browse/IGNITE-8018) > > > > - together they increase throughput by 5-20%, depending on > > configuration > > > > and environment. Also, I've tested different WAL modes - switching > from > > > > LOG_ONLY to NONE gives over 100% boost - this is what I expect from > > > > https://issues.apache.org/jira/browse/IGNITE-8017. > > > > > > > > On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda > > wrote: > > > > > > > > > Ilya, > > > > > > > > > > That's outstanding research and summary. Thanks for spending your > > time > > > on > > > > > this. > > > > > > > > > > Not sure I have enough expertise to challenge your approach, but it > > > > sounds > > > > > 100% reasonable to me. As side notes: > > > > > > > > > >- Don't you want to aggregate the tickets under an IEP? > > > > >- Does it mean we're going to update our B+Tree implementation? > > Any > > > > >ideas how risky it is? > > > > >- Any chance you had a prototype that shows performance > > > optimizations > > > > of > > > > >the approach you are suggesting to take? 
> > > > > > > > > > -- > > > > > Denis > > > > > > > > > > On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh < > ilant...@gridgain.com > > > > > > > > wrote: > > > > > > > > > > > Igniters, > > > > > > > > > > > > I've spent some time analyzing performance of rebalancing > process. > > > The > > > > > > initial goal was to understand, what limits it's throughput, > > because > > > it > > > > > is > > > > > > significantly slower than network and storage device can > > > theoretically > > > > > > handle. > > > > > > > > > > > > Turns out, our current implementation has a number of issues > caused > > > by > > > > a > > > > > > single fundamental problem. > > > > > > > > > > > > During rebalance data is sent in batches called > > > > > > GridDhtPartitionSupplyMessages. Batch size is configurable, > > default > > > > > value > > > > > > is 512KB, which could mean thousands of key-value pairs. However, > > we > > > > > don't > > > > > > take any advantage over this fact and process each
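The per-entry processing problem described above (a 512KB supply message holds thousands of pairs, yet each one is applied as an independent tree operation) is what the proposed batch operations address. A toy sketch of the idea, with `TreeMap` standing in for Ignite's B+Tree (illustrative only; a real B+Tree `putAll` can additionally share page lookups and locks between adjacent keys, which `TreeMap` does not expose):

```java
import java.util.Map;
import java.util.TreeMap;

public class BatchApplySketch {
    /**
     * Applies a rebalance batch in one sorted pass instead of issuing one
     * independent tree operation per entry. Sorting the batch first makes
     * the accesses sequential in key order, which is what lets a real
     * B+Tree putAll reuse the path to adjacent leaf pages.
     */
    static void putAll(TreeMap<Long, byte[]> tree, Map<Long, byte[]> batch) {
        new TreeMap<>(batch).forEach(tree::put);
    }
}
```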
Re: Transparent Data Encryption (TDE) in Apache Ignite
Folks, I've checked presentation. 1) It's a bad idea to allow automatic node join (sending decrypted cache's keys on join). Each node join should be allowed by administrator. We have to use two-step verification in that case. - administrator sets a keystore password for each node - another administrator uses this password on node join. My vision is: - each node should keep the master key in a keystore - each keystore should have its *own* keystore password - on cluster activation we have to specify a list of node-password pairs. This will provide us a guarantee that only allowed nodes are in the cluster. - TDE should be available only in case baseline is used. Joined node should be activated (included to baseline) by an activation request containing the node-password. So, on initial activation or BLT join, each node will gain the keystore password and encrypted cache passwords. This will guarantee data safety even in case of SSL issues. 2) Any reason to keep data encrypted in memory? Does PCI DSS require this? Data should be encrypted on eviction and rebalancing, I think. In that case we can implement TDE and data compression in the same way. Thoughts? 2018-03-12 17:58 GMT+03:00 Denis Magda: > Nikolay, please try one more time. > > -- > Denis > > On Sun, Mar 11, 2018 at 11:20 PM, Nikolay Izhikov > wrote: > > > Hello, Denis. > > > > Did you give me the permissions? > > Seems, I still can't create an IEP on the IGNITE Wiki. > > > > https://cwiki.apache.org/confluence/display/IGNITE/Active+Proposals > > > > On Tue, 06/03/2018 at 08:55 +0300, Nikolay Izhikov wrote: > > > Thank you, it's - nizhikov > > > > > > On Mon, 05/03/2018 at 15:09 -0800, Denis Magda wrote: > > > > Nikolay, what's your Wiki ID? I'll grant you required permissions. > > > > > > > > -- > > > > Denis > > > > > > > > On Sun, Mar 4, 2018 at 11:00 PM, Nikolay Izhikov < > nizhi...@apache.org> > > wrote: > > > > > Hello, Denis. 
> > > > > > > > > > > I would encourage you creating an IEP > > > > > > > > > > That is exactly what we want to do :) > > > > > > > > > > But it seems I don't have sufficient privileges to do it on the Ignite wiki. > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/ Active+Proposals > > > > > > > > > > Can you or someone give me such rights? > > > > > > > > > > On Thu, 01/03/2018 at 22:23 -0800, Denis Magda wrote: > > > > > > Dmitriy R., Nikolay, > > > > > > > > > > > > Thanks for the analysis and handout of the architectural design. > > No doubt, > > > > > > it would be a valuable addition to Ignite. > > > > > > > > > > > > I would encourage you creating an IEP on the wiki and break the > > work into > > > > > > pieces discussing specific part with the community. > > > > > > > > > > > > -- > > > > > > Denis > > > > > > > > > > > > > > > > > > On Thu, Mar 1, 2018 at 9:29 PM, Nikolay Izhikov < > > nizhi...@apache.org> wrote: > > > > > > > > > > > > > Hello, Dmitriy. > > > > > > > > > > > > > > Thank you for the feedback! > > > > > > > > > > > > > > > Will it be supported? > > > > > > > > > > > > > > Yes. > > > > > > > > > > > > > > TDE shouldn't break any of the existing Ignite features. > > > > > > > It adds some encrypt/decrypt level when writing and reading > pages > > > > > > > to/from PDS. > > > > > > > > > > > > > > On Fri, 02/03/2018 at 07:29 +0300, Dmitriy Setrakyan wrote: > > > > > > > > I have looked at the design, but could not find anything > about > > running > > > > > > > > > > > > > > SQL > > > > > > > > queries against the encrypted data. Will it be supported? > > > > > > > > > > > > > > > > D. > > > > > > > > > > > > > > > > On Thu, Mar 1, 2018 at 8:05 PM, Nikolay Izhikov < > > nizhi...@apache.org> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hello, Dima! > > > > > > > > > > > > > > > > > > Thank you for the document! > > > > > > > > > > > > > > > > > > I'm ready to implement this feature with you. 
> > > > > > > > > > > > > > > > > > Igniters, please, share you thoughts about proposed design > > > > > > > > > > > > > > > > > > [1] https://1drv.ms/w/s!AqZdfua4UpmuhneoVhOCiXSUBGIf > > > > > > > > > > > > > > > > > > В Чт, 01/03/2018 в 15:46 +0300, Дмитрий Рябов пишет: > > > > > > > > > > Hello, Igniters! > > > > > > > > > > > > > > > > > > > > I investigated the issue and wrote some details in a > draft > > document > > > > > > > > > > [1]. I think we should made IEP for TDE because it is a > > big change > > > > > > > > > > > > > > and > > > > > > > > > > should be described in a single place, but not in a > message > > > > > > > > > > conversation. > > > > > > > > > > Please, look it and write your thoughts. What is not > > understandable, > > > > > > > > > > what should be detailed or described? > > > > > > > > > > > > > > > > > > > > > Where are we going to store keys (MEK) physically? > Would > > it be > > > > > > > > > > > > > > PKCS#11 > > > > > > > > > > > storage? Where we will store passwords to unlock > storage > > or it > > > > > > > > > > > > > > will be > > > > > > > > > > >
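The per-node keystore idea above (each node keeps its master key in a keystore protected by its *own* password, which an administrator supplies on join) can be sketched with the standard `java.security.KeyStore` API. This is a minimal illustration, not Ignite code: the class name, the `master-key` alias, and the choice of the JCEKS keystore type are all assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class NodeKeystoreSketch {
    /** Wraps a node's master key in a keystore protected by the node's own password. */
    static byte[] createKeystore(SecretKey masterKey, char[] nodePwd) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, null); // initialize an empty keystore
        ks.setEntry("master-key", new KeyStore.SecretKeyEntry(masterKey),
            new KeyStore.PasswordProtection(nodePwd));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, nodePwd);
        return out.toByteArray();
    }

    /** Unlocks the keystore with the password supplied by the administrator on node join. */
    static SecretKey loadMasterKey(byte[] keystoreBytes, char[] nodePwd) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(new ByteArrayInputStream(keystoreBytes), nodePwd);
        return (SecretKey) ks.getKey("master-key", nodePwd);
    }

    public static void main(String[] args) throws Exception {
        SecretKey master = KeyGenerator.getInstance("AES").generateKey();
        char[] nodePwd = "node-1-secret".toCharArray(); // per-node password held by the admins

        byte[] stored = createKeystore(master, nodePwd);
        SecretKey reloaded = loadMasterKey(stored, nodePwd);

        System.out.println(java.util.Arrays.equals(
            master.getEncoded(), reloaded.getEncoded())); // prints "true"
    }
}
```

A wrong password fails `KeyStore.load` outright, which is exactly the property the two-step verification relies on: without the per-node password handed over by the administrator, the joining node cannot recover its master key.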
Re: Data compression design proposal
Vova, thanks for the comments.

Anyway, page compression at rebalancing is a good idea even if we have problems with storing on disk.

2018-03-26 19:51 GMT+03:00 Vyacheslav Daradur: > Since PDS is strongly depending on memory page's size I'd like to > compress serialized data inside page exclude page header. > > On Mon, Mar 26, 2018 at 7:49 PM, Vladimir Ozerov > wrote: > > Alex, > > > > In fact there are many approaches to this. Some vendors decided stick to > > page - page is filled with data and then compressed when certain > threshold > > is reached (e.g. page is full or filled up to X%). Another approach is to > > store data in memory in *larger blocks* than on the disk, and when it > comes > > to flush, one may try to compress it. If final size is lower than disk > > block size then compression is considered successfull and data is saved > in > > compressed form. Otherwise data is saved as is. > > > > Both approaches may work, but IMO compression within a single block is > > better and simpler to implement. > > > > On Mon, Mar 26, 2018 at 6:53 PM, Alexey Goncharuk < > > alexey.goncha...@gmail.com> wrote: > > > >> Guys, > >> > >> How does this fit the PageMemory concept? Currently it assumes that the > >> size of the page in memory and the size of the page on disk is the > same, so > >> only per-entry level compression within a page makes sense. > >> > >> If you compress a whole page, how do you calculate the page offset in > the > >> target data file? > >> > >> --AG > >> > >> 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov : > >> > >> > Gents, > >> > > >> > If I understood the idea correctly, the proposal is to compress pages > on > >> > eviction and decompress them on read from disk. Is it correct? > >> > > >> > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov > wrote: > >> > > >> > > + 1 to Taras's vision. > >> > > > >> > > Compression on eviction is a good case to store more. 
> >> > > Pages at memory always hot a real system, so complession in memory > will > >> > > definetely slowdown the system, I think. > >> > > > >> > > Anyway, we can split issue to "on eviction compression" and to > >> "in-memory > >> > > compression". > >> > > > >> > > > >> > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > >> > > > >> > > > Hi, > >> > > > > >> > > > I guess page level compression make sense on page loading / > eviction. > >> > > > In this case we can decrease I/O operation and performance boost > can > >> be > >> > > > reached. > >> > > > What is goal for in-memory compression? Holds about 2-5x data in > >> memory > >> > > > with performance drop? > >> > > > > >> > > > Also please clarify the case with compression/decompression for > hot > >> and > >> > > > cold pages. > >> > > > Is it right for your approach: > >> > > > 1. Hot pages are always decompressed in memory because many > >> read/write > >> > > > operations touch ones. > >> > > > 2. So we can compress only cold pages. > >> > > > > >> > > > So the way is suitable when the hot data size << available RAM > size. > >> > > > > >> > > > Thoughts? > >> > > > > >> > > > > >> > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > >> > > > > >> > > >> Hi Igniters! > >> > > >> > >> > > >> I’d like to do next step in our data compression discussion [1]. > >> > > >> > >> > > >> Most Igniters vote for per-data-page compression. 
> >> > > >> > >> > > >> I’d like to accumulate main theses to start implementation: > >> > > >> - page will be compressed with the dictionary-based approach > >> (e.g.LZV) > >> > > >> - page will be compressed in batch mode (not on every change) > >> > > >> - page compression should been initiated by an event, for > example, a > >> > > >> page’s free space drops below 20% > >> > > >> - compression process will be under page write lock > >> > > >> > >> > > >> Vladimir Ozerov has written: > >> > > >> > >> > > >>> What we do not understand yet: > >> > > 1) Granularity of compression algorithm. > >> > > 1.1) It could be per-entry - i.e. we compress the whole entry > >> > content, > >> > > but > >> > > respect boundaries between entries. E.g.: before - > >> > [ENTRY_1][ENTRY_2], > >> > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed > to > >> > > [COMPRESSED ENTRY_1 and ENTRY_2]). > >> > > v1.2) Or it could be per-field - i.e. we compress fields, but > >> > respect > >> > > binary > >> > > object layout. First approach is simple, straightforward, and > will > >> > > give > >> > > acceptable compression rate, but we will have to compress the > >> whole > >> > > binary > >> > > object on every field access, what may ruin our SQL > performance. > >> > > Second > >> > > approach is more complex, we are not sure about it's > compression > >> > rate, > >> > > but > >> > > as BinaryObject structure is preserved, we will still have fast > >> > > constant-time
Re: Data compression design proposal
Since PDS strongly depends on the memory page size, I'd like to compress the serialized data inside a page, excluding the page header.

On Mon, Mar 26, 2018 at 7:49 PM, Vladimir Ozerov wrote: > Alex, > > In fact there are many approaches to this. Some vendors decided stick to > page - page is filled with data and then compressed when certain threshold > is reached (e.g. page is full or filled up to X%). Another approach is to > store data in memory in *larger blocks* than on the disk, and when it comes > to flush, one may try to compress it. If final size is lower than disk > block size then compression is considered successfull and data is saved in > compressed form. Otherwise data is saved as is. > > Both approaches may work, but IMO compression within a single block is > better and simpler to implement. > > On Mon, Mar 26, 2018 at 6:53 PM, Alexey Goncharuk < > alexey.goncha...@gmail.com> wrote: > >> Guys, >> >> How does this fit the PageMemory concept? Currently it assumes that the >> size of the page in memory and the size of the page on disk is the same, so >> only per-entry level compression within a page makes sense. >> >> If you compress a whole page, how do you calculate the page offset in the >> target data file? >> >> --AG >> >> 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov : >> >> > Gents, >> > >> > If I understood the idea correctly, the proposal is to compress pages on >> > eviction and decompress them on read from disk. Is it correct? >> > >> > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov wrote: >> > >> > > + 1 to Taras's vision. >> > > >> > > Compression on eviction is a good case to store more. >> > > Pages at memory always hot a real system, so complession in memory will >> > > definetely slowdown the system, I think. >> > > >> > > Anyway, we can split issue to "on eviction compression" and to >> "in-memory >> > > compression". 
>> > > >> > > >> > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : >> > > >> > > > Hi, >> > > > >> > > > I guess page level compression make sense on page loading / eviction. >> > > > In this case we can decrease I/O operation and performance boost can >> be >> > > > reached. >> > > > What is goal for in-memory compression? Holds about 2-5x data in >> memory >> > > > with performance drop? >> > > > >> > > > Also please clarify the case with compression/decompression for hot >> and >> > > > cold pages. >> > > > Is it right for your approach: >> > > > 1. Hot pages are always decompressed in memory because many >> read/write >> > > > operations touch ones. >> > > > 2. So we can compress only cold pages. >> > > > >> > > > So the way is suitable when the hot data size << available RAM size. >> > > > >> > > > Thoughts? >> > > > >> > > > >> > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: >> > > > >> > > >> Hi Igniters! >> > > >> >> > > >> I’d like to do next step in our data compression discussion [1]. >> > > >> >> > > >> Most Igniters vote for per-data-page compression. >> > > >> >> > > >> I’d like to accumulate main theses to start implementation: >> > > >> - page will be compressed with the dictionary-based approach >> (e.g.LZV) >> > > >> - page will be compressed in batch mode (not on every change) >> > > >> - page compression should been initiated by an event, for example, a >> > > >> page’s free space drops below 20% >> > > >> - compression process will be under page write lock >> > > >> >> > > >> Vladimir Ozerov has written: >> > > >> >> > > >>> What we do not understand yet: >> > > 1) Granularity of compression algorithm. >> > > 1.1) It could be per-entry - i.e. we compress the whole entry >> > content, >> > > but >> > > respect boundaries between entries. E.g.: before - >> > [ENTRY_1][ENTRY_2], >> > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to >> > > [COMPRESSED ENTRY_1 and ENTRY_2]). >> > > v1.2) Or it could be per-field - i.e. 
we compress fields, but >> > respect >> > > binary >> > > object layout. First approach is simple, straightforward, and will >> > > give >> > > acceptable compression rate, but we will have to compress the >> whole >> > > binary >> > > object on every field access, what may ruin our SQL performance. >> > > Second >> > > approach is more complex, we are not sure about it's compression >> > rate, >> > > but >> > > as BinaryObject structure is preserved, we will still have fast >> > > constant-time per-field access. >> > > >> > > >>> I think there are advantages in both approaches and we will be able >> > to >> > > >> compare different approaches and algorithms after prototype >> > > >> implementation. >> > > >> >> > > >> Main approach in brief: >> > > >> 1) When page’s free space drops below 20% will be triggered >> > compression >> > > >> event >> > > >> 2) Page will be locked by write lock >> > > >> 3) Page will be passed to page’s compressor implementation
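The quoted trigger conditions (compress when a page's free space drops below 20%, in batch mode, under the page write lock) can be sketched roughly as below. This is only an illustration of the proposed flow, not Ignite code: `Deflater` stands in for the dictionary-based codec the thread mentions (LZV), and the class and field names are invented.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.zip.Deflater;

public class PageCompressionSketch {
    static final int PAGE_SIZE = 4096;
    static final double FREE_SPACE_TRIGGER = 0.20; // compress when free space drops below 20%

    final byte[] page = new byte[PAGE_SIZE];
    int used; // bytes occupied by entries
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Returns the compressed payload, or null when the trigger has not fired yet. */
    byte[] maybeCompress() {
        if ((PAGE_SIZE - used) / (double) PAGE_SIZE >= FREE_SPACE_TRIGGER)
            return null; // enough free space left - no compression event

        lock.writeLock().lock(); // step 2: compression runs under the page write lock
        try {
            Deflater deflater = new Deflater();
            deflater.setInput(page, 0, used);
            deflater.finish();

            byte[] buf = new byte[PAGE_SIZE];
            int len = deflater.deflate(buf);
            deflater.end();

            return Arrays.copyOf(buf, len); // step 4: replaces the raw page
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        PageCompressionSketch p = new PageCompressionSketch();
        Arrays.fill(p.page, 0, 3700, (byte) 'x'); // ~90% full -> free space below 20%
        p.used = 3700;

        byte[] compressed = p.maybeCompress();
        System.out.println(compressed != null && compressed.length < p.used); // prints "true"
    }
}
```

The point of the batch trigger is visible here: compression cost is paid once when the page is nearly full, instead of on every change, and readers only need a "compressed" flag to know which code path to take.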
Re: Data compression design proposal
Alex,

In fact there are many approaches to this. Some vendors decided to stick to the page: a page is filled with data and then compressed when a certain threshold is reached (e.g. the page is full or filled up to X%). Another approach is to store data in memory in *larger blocks* than on the disk, and when it comes to flush, one may try to compress it. If the final size is lower than the disk block size, compression is considered successful and the data is saved in compressed form. Otherwise the data is saved as is.

Both approaches may work, but IMO compression within a single block is better and simpler to implement.

On Mon, Mar 26, 2018 at 6:53 PM, Alexey Goncharuk < alexey.goncha...@gmail.com> wrote: > Guys, > > How does this fit the PageMemory concept? Currently it assumes that the > size of the page in memory and the size of the page on disk is the same, so > only per-entry level compression within a page makes sense. > > If you compress a whole page, how do you calculate the page offset in the > target data file? > > --AG > > 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov: > > > Gents, > > > > If I understood the idea correctly, the proposal is to compress pages on > > eviction and decompress them on read from disk. Is it correct? > > > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov wrote: > > > > > + 1 to Taras's vision. > > > > > > Compression on eviction is a good case to store more. > > > Pages at memory always hot a real system, so complession in memory will > > > definetely slowdown the system, I think. > > > > > > Anyway, we can split issue to "on eviction compression" and to > "in-memory > > > compression". > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > Hi, > > > > > > > > I guess page level compression make sense on page loading / eviction. > > > > In this case we can decrease I/O operation and performance boost can > be > > > > reached. > > > > What is goal for in-memory compression? 
Holds about 2-5x data in > memory > > > > with performance drop? > > > > > > > > Also please clarify the case with compression/decompression for hot > and > > > > cold pages. > > > > Is it right for your approach: > > > > 1. Hot pages are always decompressed in memory because many > read/write > > > > operations touch ones. > > > > 2. So we can compress only cold pages. > > > > > > > > So the way is suitable when the hot data size << available RAM size. > > > > > > > > Thoughts? > > > > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > > > >> Hi Igniters! > > > >> > > > >> I’d like to do next step in our data compression discussion [1]. > > > >> > > > >> Most Igniters vote for per-data-page compression. > > > >> > > > >> I’d like to accumulate main theses to start implementation: > > > >> - page will be compressed with the dictionary-based approach > (e.g.LZV) > > > >> - page will be compressed in batch mode (not on every change) > > > >> - page compression should been initiated by an event, for example, a > > > >> page’s free space drops below 20% > > > >> - compression process will be under page write lock > > > >> > > > >> Vladimir Ozerov has written: > > > >> > > > >>> What we do not understand yet: > > > 1) Granularity of compression algorithm. > > > 1.1) It could be per-entry - i.e. we compress the whole entry > > content, > > > but > > > respect boundaries between entries. E.g.: before - > > [ENTRY_1][ENTRY_2], > > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > > v1.2) Or it could be per-field - i.e. we compress fields, but > > respect > > > binary > > > object layout. First approach is simple, straightforward, and will > > > give > > > acceptable compression rate, but we will have to compress the > whole > > > binary > > > object on every field access, what may ruin our SQL performance. 
> > > Second > > > approach is more complex, we are not sure about it's compression > > rate, > > > but > > > as BinaryObject structure is preserved, we will still have fast > > > constant-time per-field access. > > > > > > >>> I think there are advantages in both approaches and we will be able > > to > > > >> compare different approaches and algorithms after prototype > > > >> implementation. > > > >> > > > >> Main approach in brief: > > > >> 1) When page’s free space drops below 20% will be triggered > > compression > > > >> event > > > >> 2) Page will be locked by write lock > > > >> 3) Page will be passed to page’s compressor implementation > > > >> 4) Page will be replaced by compressed page > > > >> > > > >> Whole object or a field reading: > > > >> 1) If page marked as compressed then the page will be handled by > > > >> page’s compressor implementation, otherwise, it will be handled as > > > >> usual. > > > >> > > > >> Thoughts? > > > >> > > > >> Should we create new IEP
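The flush-time variant described above (keep larger blocks in memory, try to compress on flush, and persist the compressed form only if it fits into one disk block) might look like the following sketch. All names and sizes are illustrative assumptions, and `Deflater` is just a stand-in codec.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.zip.Deflater;

public class FlushCompressionSketch {
    static final int DISK_BLOCK_SIZE = 4096; // on-disk block, smaller than the in-memory block

    /**
     * Tries to compress an in-memory block before flushing.
     * The compressed form is kept only if it fits into a single disk block;
     * otherwise the data is written to disk as is.
     */
    static byte[] prepareForFlush(byte[] inMemoryBlock) {
        Deflater deflater = new Deflater();
        deflater.setInput(inMemoryBlock);
        deflater.finish();

        byte[] buf = new byte[inMemoryBlock.length];
        int len = deflater.deflate(buf);
        boolean consumedAll = deflater.finished(); // false -> output didn't fit the buffer
        deflater.end();

        return (consumedAll && len < DISK_BLOCK_SIZE) ? Arrays.copyOf(buf, len) : inMemoryBlock;
    }

    public static void main(String[] args) {
        byte[] repetitive = new byte[8192];  // zero-filled, compresses far below 4 KB
        byte[] random = new byte[8192];
        new Random(42).nextBytes(random);    // effectively incompressible

        System.out.println(prepareForFlush(repetitive).length < DISK_BLOCK_SIZE); // prints "true"
        System.out.println(prepareForFlush(random).length == 8192);               // prints "true"
    }
}
```

Note the fallback path: because incompressible data is written unchanged, the scheme never makes the on-disk footprint worse, which is what makes the "compress opportunistically on flush" trade-off attractive.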
Re: Data compression design proposal
Hi Anton, Do you have suggestions for this approach? Sincerely, Dmitriy Pavlov пн, 26 мар. 2018 г. в 19:46, Anton Vinogradov: > Can we use another approach to store compressed pages? > > 2018-03-26 19:06 GMT+03:00 Dmitry Pavlov : > > > +1 to Alexey's concern. No reason to compress if we use previous offset > as > > pageIdx*pageSize. > > > > пн, 26 мар. 2018 г. в 18:59, Alexey Goncharuk < > alexey.goncha...@gmail.com > > >: > > > > > Guys, > > > > > > How does this fit the PageMemory concept? Currently it assumes that the > > > size of the page in memory and the size of the page on disk is the > same, > > so > > > only per-entry level compression within a page makes sense. > > > > > > If you compress a whole page, how do you calculate the page offset in > the > > > target data file? > > > > > > --AG > > > > > > 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov : > > > > > > > Gents, > > > > > > > > If I understood the idea correctly, the proposal is to compress pages > > on > > > > eviction and decompress them on read from disk. Is it correct? > > > > > > > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov > > wrote: > > > > > > > > > + 1 to Taras's vision. > > > > > > > > > > Compression on eviction is a good case to store more. > > > > > Pages at memory always hot a real system, so complession in memory > > will > > > > > definetely slowdown the system, I think. > > > > > > > > > > Anyway, we can split issue to "on eviction compression" and to > > > "in-memory > > > > > compression". > > > > > > > > > > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > > > > > > > > Hi, > > > > > > > > > > > > I guess page level compression make sense on page loading / > > eviction. > > > > > > In this case we can decrease I/O operation and performance boost > > can > > > be > > > > > > reached. > > > > > > What is goal for in-memory compression? Holds about 2-5x data in > > > memory > > > > > > with performance drop? 
> > > > > > > > > > > > Also please clarify the case with compression/decompression for > hot > > > and > > > > > > cold pages. > > > > > > Is it right for your approach: > > > > > > 1. Hot pages are always decompressed in memory because many > > > read/write > > > > > > operations touch ones. > > > > > > 2. So we can compress only cold pages. > > > > > > > > > > > > So the way is suitable when the hot data size << available RAM > > size. > > > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > > > > > > > >> Hi Igniters! > > > > > >> > > > > > >> I’d like to do next step in our data compression discussion [1]. > > > > > >> > > > > > >> Most Igniters vote for per-data-page compression. > > > > > >> > > > > > >> I’d like to accumulate main theses to start implementation: > > > > > >> - page will be compressed with the dictionary-based approach > > > (e.g.LZV) > > > > > >> - page will be compressed in batch mode (not on every change) > > > > > >> - page compression should been initiated by an event, for > > example, a > > > > > >> page’s free space drops below 20% > > > > > >> - compression process will be under page write lock > > > > > >> > > > > > >> Vladimir Ozerov has written: > > > > > >> > > > > > >>> What we do not understand yet: > > > > > 1) Granularity of compression algorithm. > > > > > 1.1) It could be per-entry - i.e. we compress the whole entry > > > > content, > > > > > but > > > > > respect boundaries between entries. E.g.: before - > > > > [ENTRY_1][ENTRY_2], > > > > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed > to > > > > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > > > > v1.2) Or it could be per-field - i.e. we compress fields, but > > > > respect > > > > > binary > > > > > object layout. 
First approach is simple, straightforward, and > > will > > > > > give > > > > > acceptable compression rate, but we will have to compress the > > > whole > > > > > binary > > > > > object on every field access, what may ruin our SQL > performance. > > > > > Second > > > > > approach is more complex, we are not sure about it's > compression > > > > rate, > > > > > but > > > > > as BinaryObject structure is preserved, we will still have > fast > > > > > constant-time per-field access. > > > > > > > > > > >>> I think there are advantages in both approaches and we will be > > able > > > > to > > > > > >> compare different approaches and algorithms after prototype > > > > > >> implementation. > > > > > >> > > > > > >> Main approach in brief: > > > > > >> 1) When page’s free space drops below 20% will be triggered > > > > compression > > > > > >> event > > > > > >> 2) Page will be locked by write lock > > > > > >> 3) Page will be passed to page’s compressor implementation > > > > > >> 4) Page will be replaced by compressed page > >
Re: Data compression design proposal
Can we use another approach to store compressed pages? 2018-03-26 19:06 GMT+03:00 Dmitry Pavlov: > +1 to Alexey's concern. No reason to compress if we use previous offset as > pageIdx*pageSize. > > пн, 26 мар. 2018 г. в 18:59, Alexey Goncharuk >: > > > Guys, > > > > How does this fit the PageMemory concept? Currently it assumes that the > > size of the page in memory and the size of the page on disk is the same, > so > > only per-entry level compression within a page makes sense. > > > > If you compress a whole page, how do you calculate the page offset in the > > target data file? > > > > --AG > > > > 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov : > > > > > Gents, > > > > > > If I understood the idea correctly, the proposal is to compress pages > on > > > eviction and decompress them on read from disk. Is it correct? > > > > > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov > wrote: > > > > > > > + 1 to Taras's vision. > > > > > > > > Compression on eviction is a good case to store more. > > > > Pages at memory always hot a real system, so complession in memory > will > > > > definetely slowdown the system, I think. > > > > > > > > Anyway, we can split issue to "on eviction compression" and to > > "in-memory > > > > compression". > > > > > > > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > > > > > > Hi, > > > > > > > > > > I guess page level compression make sense on page loading / > eviction. > > > > > In this case we can decrease I/O operation and performance boost > can > > be > > > > > reached. > > > > > What is goal for in-memory compression? Holds about 2-5x data in > > memory > > > > > with performance drop? > > > > > > > > > > Also please clarify the case with compression/decompression for hot > > and > > > > > cold pages. > > > > > Is it right for your approach: > > > > > 1. Hot pages are always decompressed in memory because many > > read/write > > > > > operations touch ones. > > > > > 2. So we can compress only cold pages. 
> > > > > > > > > > So the way is suitable when the hot data size << available RAM > size. > > > > > > > > > > Thoughts? > > > > > > > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > > > > > >> Hi Igniters! > > > > >> > > > > >> I’d like to do next step in our data compression discussion [1]. > > > > >> > > > > >> Most Igniters vote for per-data-page compression. > > > > >> > > > > >> I’d like to accumulate main theses to start implementation: > > > > >> - page will be compressed with the dictionary-based approach > > (e.g.LZV) > > > > >> - page will be compressed in batch mode (not on every change) > > > > >> - page compression should been initiated by an event, for > example, a > > > > >> page’s free space drops below 20% > > > > >> - compression process will be under page write lock > > > > >> > > > > >> Vladimir Ozerov has written: > > > > >> > > > > >>> What we do not understand yet: > > > > 1) Granularity of compression algorithm. > > > > 1.1) It could be per-entry - i.e. we compress the whole entry > > > content, > > > > but > > > > respect boundaries between entries. E.g.: before - > > > [ENTRY_1][ENTRY_2], > > > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > > > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > > > v1.2) Or it could be per-field - i.e. we compress fields, but > > > respect > > > > binary > > > > object layout. First approach is simple, straightforward, and > will > > > > give > > > > acceptable compression rate, but we will have to compress the > > whole > > > > binary > > > > object on every field access, what may ruin our SQL performance. > > > > Second > > > > approach is more complex, we are not sure about it's compression > > > rate, > > > > but > > > > as BinaryObject structure is preserved, we will still have fast > > > > constant-time per-field access. 
> > > > > > > > >>> I think there are advantages in both approaches and we will be > able > > > to > > > > >> compare different approaches and algorithms after prototype > > > > >> implementation. > > > > >> > > > > >> Main approach in brief: > > > > >> 1) When page’s free space drops below 20% will be triggered > > > compression > > > > >> event > > > > >> 2) Page will be locked by write lock > > > > >> 3) Page will be passed to page’s compressor implementation > > > > >> 4) Page will be replaced by compressed page > > > > >> > > > > >> Whole object or a field reading: > > > > >> 1) If page marked as compressed then the page will be handled by > > > > >> page’s compressor implementation, otherwise, it will be handled as > > > > >> usual. > > > > >> > > > > >> Thoughts? > > > > >> > > > > >> Should we create new IEP and register tickets to start > > implementation? > > > > >> This will allow us to watch for the feature progress and related >
[jira] [Created] (IGNITE-8051) Put operation may hang on LOCAL TRANSACTIONAL cache if it was stopped asynchronously.
Pavel Pereslegin created IGNITE-8051:

Summary: Put operation may hang on LOCAL TRANSACTIONAL cache if it was stopped asynchronously.
Key: IGNITE-8051
URL: https://issues.apache.org/jira/browse/IGNITE-8051
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.4
Reporter: Pavel Pereslegin

Put operation may hang if a local cache was destroyed asynchronously. It seems that this happens because IgniteTxImplicitSingleStateImpl#topologyReadLock does not check the cache status for a local cache.

Simple reproducer:
{code:java}
public class DestroyLocalCacheTest extends GridCommonAbstractTest {
    /** */
    public void testDestroyAsync() throws Exception {
        try (Ignite node = startGrid()) {
            IgniteCache<Integer, Boolean> locCache = node.createCache(
                new CacheConfiguration<Integer, Boolean>(DEFAULT_CACHE_NAME)
                    .setCacheMode(CacheMode.LOCAL)
                    .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL));

            AtomicInteger cntr = new AtomicInteger();

            GridTestUtils.runMultiThreadedAsync(() -> {
                try {
                    int key;

                    while ((key = cntr.getAndIncrement()) < 10_000) {
                        if (key == 1000)
                            locCache.destroy();

                        locCache.put(key, true);
                    }
                }
                catch (Exception ignore) {
                    // No-op.
                }

                return null;
            }, 5, "put-thread").get();
        }
    }
}
{code}

Log output:
{noformat}
[2018-03-26 19:10:11,350][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridCacheProcessor] Started cache [name=default, id=1544803905, memoryPolicyName=default, mode=LOCAL, atomicity=TRANSACTIONAL, backups=0]
[2018-03-26 19:10:11,352][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], waitTime=0ms, futInfo=NA]
[2018-03-26 19:10:11,353][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=1]]
[2018-03-26 19:10:11,354][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], err=null]
[2018-03-26 19:10:11,354][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], crd=true]
[2018-03-26 19:10:11,355][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=fe867235-2fff-465a-b767-206abd291058]
[2018-03-26 19:10:11,590][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][time] Started exchange init [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=2], crd=true, evt=DISCOVERY_CUSTOM_EVT, evtNode=fe867235-2fff-465a-b767-206abd291058, customEvt=DynamicCacheChangeBatch [id=67313136261-d40acb94-363f-4c5e-b62b-1a95fbe73164, reqs=[DynamicCacheChangeRequest [cacheName=default, hasCfg=false, nodeId=fe867235-2fff-465a-b767-206abd291058, clientStartOnly=false, stop=true, destroy=false, disabledAfterStart=false]], exchangeActions=ExchangeActions [startCaches=null, stopCaches=[default], startGrps=[], stopGrps=[default, destroy=true], resetParts=null, stateChangeRequest=null], startCaches=false], allowMerge=false]
[2018-03-26 19:10:11,590][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=2], waitTime=0ms, futInfo=NA]
[2018-03-26 19:10:11,591][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=2], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=2]]
[2018-03-26 19:10:11,592][INFO ][exchange-worker-#38%local.DestroyLocalCacheTest%][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=2], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=2], err=null]
[2018-03-26 19:15:09,191][ERROR][main][root] Test has been timed out and will be interrupted (threads dump will be taken before interruption) [test=testDestroyAsync, timeout=30]
[2018-03-26 19:15:09,191][WARN ][main][diagnostic] Dumping debug info for node
{noformat}
[jira] [Created] (IGNITE-8050) Throw a meaningful exception when user issues TX SQL keyword with MVCC turned off
Alexander Paschenko created IGNITE-8050:

Summary: Throw a meaningful exception when user issues TX SQL keyword with MVCC turned off
Key: IGNITE-8050
URL: https://issues.apache.org/jira/browse/IGNITE-8050
Project: Ignite
Issue Type: Task
Reporter: Alexander Paschenko
Fix For: 2.5

An exception must be thrown when the user issues a TX SQL command (BEGIN, COMMIT, ROLLBACK) in the absence of MVCC. Ignoring these commands may be confusing and can lead to the SQL engine behaving quite differently from what the user expects, especially in terms of data consistency.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
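A rough illustration of the requested check: reject transaction-control statements with a clear message when MVCC is off instead of silently ignoring them. The helper and class names are hypothetical, and `IllegalStateException` stands in for Ignite's real `IgniteSQLException` (constructing the latter needs Ignite internals).

```java
import java.util.Locale;
import java.util.Set;

public class TxSqlGuardSketch {
    static final Set<String> TX_KEYWORDS = Set.of("BEGIN", "COMMIT", "ROLLBACK");

    /** Rejects transaction-control SQL with a meaningful message when MVCC is disabled. */
    static void checkStatement(String sql, boolean mvccEnabled) {
        String keyword = sql.trim().split("\\s+")[0].toUpperCase(Locale.ROOT);

        if (TX_KEYWORDS.contains(keyword) && !mvccEnabled)
            throw new IllegalStateException("SQL statement '" + keyword
                + "' is supported only when MVCC is enabled for the cluster");
    }

    public static void main(String[] args) {
        checkStatement("SELECT * FROM person", false); // plain queries always pass

        try {
            checkStatement("BEGIN", false);
        }
        catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast at parse time like this is what the ticket asks for: the user learns immediately that BEGIN/COMMIT/ROLLBACK had no transactional effect, rather than discovering it later through inconsistent data.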
Re: Data compression design proposal
+1 to Alexey's concern. No reason to compress if we use previous offset as pageIdx*pageSize. пн, 26 мар. 2018 г. в 18:59, Alexey Goncharuk: > Guys, > > How does this fit the PageMemory concept? Currently it assumes that the > size of the page in memory and the size of the page on disk is the same, so > only per-entry level compression within a page makes sense. > > If you compress a whole page, how do you calculate the page offset in the > target data file? > > --AG > > 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov : > > > Gents, > > > > If I understood the idea correctly, the proposal is to compress pages on > > eviction and decompress them on read from disk. Is it correct? > > > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov wrote: > > > > > + 1 to Taras's vision. > > > > > > Compression on eviction is a good case to store more. > > > Pages at memory always hot a real system, so complession in memory will > > > definetely slowdown the system, I think. > > > > > > Anyway, we can split issue to "on eviction compression" and to > "in-memory > > > compression". > > > > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > > > > Hi, > > > > > > > > I guess page level compression make sense on page loading / eviction. > > > > In this case we can decrease I/O operation and performance boost can > be > > > > reached. > > > > What is goal for in-memory compression? Holds about 2-5x data in > memory > > > > with performance drop? > > > > > > > > Also please clarify the case with compression/decompression for hot > and > > > > cold pages. > > > > Is it right for your approach: > > > > 1. Hot pages are always decompressed in memory because many > read/write > > > > operations touch ones. > > > > 2. So we can compress only cold pages. > > > > > > > > So the way is suitable when the hot data size << available RAM size. > > > > > > > > Thoughts? > > > > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > > > >> Hi Igniters! 
> > > >> > > > >> I’d like to do next step in our data compression discussion [1]. > > > >> > > > >> Most Igniters vote for per-data-page compression. > > > >> > > > >> I’d like to accumulate main theses to start implementation: > > > >> - page will be compressed with the dictionary-based approach > (e.g.LZV) > > > >> - page will be compressed in batch mode (not on every change) > > > >> - page compression should been initiated by an event, for example, a > > > >> page’s free space drops below 20% > > > >> - compression process will be under page write lock > > > >> > > > >> Vladimir Ozerov has written: > > > >> > > > >>> What we do not understand yet: > > > 1) Granularity of compression algorithm. > > > 1.1) It could be per-entry - i.e. we compress the whole entry > > content, > > > but > > > respect boundaries between entries. E.g.: before - > > [ENTRY_1][ENTRY_2], > > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > > v1.2) Or it could be per-field - i.e. we compress fields, but > > respect > > > binary > > > object layout. First approach is simple, straightforward, and will > > > give > > > acceptable compression rate, but we will have to compress the > whole > > > binary > > > object on every field access, what may ruin our SQL performance. > > > Second > > > approach is more complex, we are not sure about it's compression > > rate, > > > but > > > as BinaryObject structure is preserved, we will still have fast > > > constant-time per-field access. > > > > > > >>> I think there are advantages in both approaches and we will be able > > to > > > >> compare different approaches and algorithms after prototype > > > >> implementation. 
> > > >> > > > >> Main approach in brief: > > > >> 1) When page’s free space drops below 20% will be triggered > > compression > > > >> event > > > >> 2) Page will be locked by write lock > > > >> 3) Page will be passed to page’s compressor implementation > > > >> 4) Page will be replaced by compressed page > > > >> > > > >> Whole object or a field reading: > > > >> 1) If page marked as compressed then the page will be handled by > > > >> page’s compressor implementation, otherwise, it will be handled as > > > >> usual. > > > >> > > > >> Thoughts? > > > >> > > > >> Should we create new IEP and register tickets to start > implementation? > > > >> This will allow us to watch for the feature progress and related > > > >> tasks. > > > >> > > > >> > > > >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data- > > > >> compression-in-Ignite-tc20679.html > > > >> > > > >> > > > >> > > > > -- > > > > Taras Ledkov > > > > Mail-To: tled...@gridgain.com > > > > > > > > > > > > > >
Re: Data compression design proposal
Guys, How does this fit the PageMemory concept? Currently it assumes that the size of the page in memory and the size of the page on disk is the same, so only per-entry level compression within a page makes sense. If you compress a whole page, how do you calculate the page offset in the target data file? --AG 2018-03-26 17:39 GMT+03:00 Vladimir Ozerov: > Gents, > > If I understood the idea correctly, the proposal is to compress pages on > eviction and decompress them on read from disk. Is it correct? > > On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradov wrote: > > > + 1 to Taras's vision. > > > > Compression on eviction is a good case to store more. > > Pages at memory always hot a real system, so complession in memory will > > definetely slowdown the system, I think. > > > > Anyway, we can split issue to "on eviction compression" and to "in-memory > > compression". > > > > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > > > Hi, > > > > > > I guess page level compression make sense on page loading / eviction. > > > In this case we can decrease I/O operation and performance boost can be > > > reached. > > > What is goal for in-memory compression? Holds about 2-5x data in memory > > > with performance drop? > > > > > > Also please clarify the case with compression/decompression for hot and > > > cold pages. > > > Is it right for your approach: > > > 1. Hot pages are always decompressed in memory because many read/write > > > operations touch ones. > > > 2. So we can compress only cold pages. > > > > > > So the way is suitable when the hot data size << available RAM size. > > > > > > Thoughts? > > > > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > > > >> Hi Igniters! > > >> > > >> I’d like to do next step in our data compression discussion [1]. > > >> > > >> Most Igniters vote for per-data-page compression. 
> > >> > > >> I’d like to accumulate main theses to start implementation: > > >> - page will be compressed with the dictionary-based approach (e.g.LZV) > > >> - page will be compressed in batch mode (not on every change) > > >> - page compression should been initiated by an event, for example, a > > >> page’s free space drops below 20% > > >> - compression process will be under page write lock > > >> > > >> Vladimir Ozerov has written: > > >> > > >>> What we do not understand yet: > > 1) Granularity of compression algorithm. > > 1.1) It could be per-entry - i.e. we compress the whole entry > content, > > but > > respect boundaries between entries. E.g.: before - > [ENTRY_1][ENTRY_2], > > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > > [COMPRESSED ENTRY_1 and ENTRY_2]). > > v1.2) Or it could be per-field - i.e. we compress fields, but > respect > > binary > > object layout. First approach is simple, straightforward, and will > > give > > acceptable compression rate, but we will have to compress the whole > > binary > > object on every field access, what may ruin our SQL performance. > > Second > > approach is more complex, we are not sure about it's compression > rate, > > but > > as BinaryObject structure is preserved, we will still have fast > > constant-time per-field access. > > > > >>> I think there are advantages in both approaches and we will be able > to > > >> compare different approaches and algorithms after prototype > > >> implementation. > > >> > > >> Main approach in brief: > > >> 1) When page’s free space drops below 20% will be triggered > compression > > >> event > > >> 2) Page will be locked by write lock > > >> 3) Page will be passed to page’s compressor implementation > > >> 4) Page will be replaced by compressed page > > >> > > >> Whole object or a field reading: > > >> 1) If page marked as compressed then the page will be handled by > > >> page’s compressor implementation, otherwise, it will be handled as > > >> usual. 
> > >> > > >> Thoughts? > > >> > > >> Should we create new IEP and register tickets to start implementation? > > >> This will allow us to watch for the feature progress and related > > >> tasks. > > >> > > >> > > >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data- > > >> compression-in-Ignite-tc20679.html > > >> > > >> > > >> > > > -- > > > Taras Ledkov > > > Mail-To: tled...@gridgain.com > > > > > > > > >
[jira] [Created] (IGNITE-8049) Limit the number of operation cycles in B+Tree
Alexey Goncharuk created IGNITE-8049: Summary: Limit the number of operation cycles in B+Tree Key: IGNITE-8049 URL: https://issues.apache.org/jira/browse/IGNITE-8049 Project: Ignite Issue Type: Bug Affects Versions: 2.4 Reporter: Alexey Goncharuk Fix For: 2.5 When a tree is corrupted, a B+Tree operation may result in an infinite loop. We should limit the number of retries and fail if this limit is exceeded. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
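The fix requested in IGNITE-8049 — bounding the number of retries instead of looping forever on a corrupted tree — can be sketched as below. All names (`Op`, `run`, `MAX_RETRIES`) are illustrative stand-ins, not the actual B+Tree internals:

```java
// Illustrative sketch of a bounded retry loop for a B+Tree operation.
// On a corrupted tree the old code could retry forever; here we fail
// with a clear error once the retry budget is exhausted.
public class BoundedRetry {
    /** Upper bound on retries before we assume the tree is corrupted (hypothetical value). */
    static final int MAX_RETRIES = 1000;

    /** One attempt of a tree operation: true when done, false to retry. */
    interface Op { boolean attempt(); }

    static void run(Op op) {
        for (int i = 0; i < MAX_RETRIES; i++) {
            if (op.attempt())
                return; // operation converged
        }
        // Instead of an infinite loop, surface the corruption explicitly.
        throw new IllegalStateException("B+Tree operation did not converge after "
            + MAX_RETRIES + " retries; the tree is likely corrupted.");
    }

    public static void main(String[] args) {
        final int[] tries = {0};
        run(() -> ++tries[0] >= 3); // succeeds on the 3rd attempt
        System.out.println("attempts=" + tries[0]);
    }
}
```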
[jira] [Created] (IGNITE-8048) Dynamic indexes are not stored to cache data on node join
Alexey Goncharuk created IGNITE-8048: Summary: Dynamic indexes are not stored to cache data on node join Key: IGNITE-8048 URL: https://issues.apache.org/jira/browse/IGNITE-8048 Project: Ignite Issue Type: Bug Components: persistence Affects Versions: 2.4 Reporter: Alexey Goncharuk Fix For: 2.5 Consider the following scenario: 1) Start nodes, add some data 2) Shutdown a node, create a dynamic index 3) Shutdown the whole cluster, startup with the absent node, activate from the absent node 4) Since the absent node did not 'see' the create index, index will not be active after cluster activation 5) Update some data in the cluster 6) Restart the cluster, but activate from the node which did 'see' the create index 7) Attempt to update data. Depending on the updates in (5), this will either hang or result in an exception
Re: Data compression design proposal
Gents, If I understood the idea correctly, the proposal is to compress pages on eviction and decompress them on read from disk. Is it correct? On Mon, Mar 26, 2018 at 5:13 PM, Anton Vinogradovwrote: > + 1 to Taras's vision. > > Compression on eviction is a good case to store more. > Pages at memory always hot a real system, so complession in memory will > definetely slowdown the system, I think. > > Anyway, we can split issue to "on eviction compression" and to "in-memory > compression". > > > 2018-03-06 12:14 GMT+03:00 Taras Ledkov : > > > Hi, > > > > I guess page level compression make sense on page loading / eviction. > > In this case we can decrease I/O operation and performance boost can be > > reached. > > What is goal for in-memory compression? Holds about 2-5x data in memory > > with performance drop? > > > > Also please clarify the case with compression/decompression for hot and > > cold pages. > > Is it right for your approach: > > 1. Hot pages are always decompressed in memory because many read/write > > operations touch ones. > > 2. So we can compress only cold pages. > > > > So the way is suitable when the hot data size << available RAM size. > > > > Thoughts? > > > > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > > > >> Hi Igniters! > >> > >> I’d like to do next step in our data compression discussion [1]. > >> > >> Most Igniters vote for per-data-page compression. > >> > >> I’d like to accumulate main theses to start implementation: > >> - page will be compressed with the dictionary-based approach (e.g.LZV) > >> - page will be compressed in batch mode (not on every change) > >> - page compression should been initiated by an event, for example, a > >> page’s free space drops below 20% > >> - compression process will be under page write lock > >> > >> Vladimir Ozerov has written: > >> > >>> What we do not understand yet: > 1) Granularity of compression algorithm. > 1.1) It could be per-entry - i.e. 
we compress the whole entry content, > but > respect boundaries between entries. E.g.: before - [ENTRY_1][ENTRY_2], > after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to > [COMPRESSED ENTRY_1 and ENTRY_2]). > v1.2) Or it could be per-field - i.e. we compress fields, but respect > binary > object layout. First approach is simple, straightforward, and will > give > acceptable compression rate, but we will have to compress the whole > binary > object on every field access, what may ruin our SQL performance. > Second > approach is more complex, we are not sure about it's compression rate, > but > as BinaryObject structure is preserved, we will still have fast > constant-time per-field access. > > >>> I think there are advantages in both approaches and we will be able to > >> compare different approaches and algorithms after prototype > >> implementation. > >> > >> Main approach in brief: > >> 1) When page’s free space drops below 20% will be triggered compression > >> event > >> 2) Page will be locked by write lock > >> 3) Page will be passed to page’s compressor implementation > >> 4) Page will be replaced by compressed page > >> > >> Whole object or a field reading: > >> 1) If page marked as compressed then the page will be handled by > >> page’s compressor implementation, otherwise, it will be handled as > >> usual. > >> > >> Thoughts? > >> > >> Should we create new IEP and register tickets to start implementation? > >> This will allow us to watch for the feature progress and related > >> tasks. > >> > >> > >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data- > >> compression-in-Ignite-tc20679.html > >> > >> > >> > > -- > > Taras Ledkov > > Mail-To: tled...@gridgain.com > > > > >
[jira] [Created] (IGNITE-8047) SQL: optimize simple COUNT with DISTINCT query.
Andrew Mashenkov created IGNITE-8047: Summary: SQL: optimize simple COUNT with DISTINCT query. Key: IGNITE-8047 URL: https://issues.apache.org/jira/browse/IGNITE-8047 Project: Ignite Issue Type: Bug Components: sql Reporter: Andrew Mashenkov
[jira] [Created] (IGNITE-8046) New flaky tests added in JettyRestProcessorAuthenticationSelfTest
Dmitriy Pavlov created IGNITE-8046: -- Summary: New flaky tests added in JettyRestProcessorAuthenticationSelfTest Key: IGNITE-8046 URL: https://issues.apache.org/jira/browse/IGNITE-8046 Project: Ignite Issue Type: Test Reporter: Dmitriy Pavlov Assignee: Alexey Kuznetsov Latest example of failure: https://ci.ignite.apache.org/viewLog.html?buildId=1159028 Test run count is low (< 10); probably a new test was introduced. IgniteClientTestSuite: - JettyRestProcessorAuthenticationSelfTest.testAddWithExpiration (fail rate 100,0%) - JettyRestProcessorAuthenticationSelfTest.testPutWithExpiration (fail rate 100,0%) - JettyRestProcessorAuthenticationSelfTest.testReplaceWithExpiration (fail rate 100,0%) JettyRestProcessorAuthenticationSelfTest https://github.com/apache/ignite/commit/921f0cf9b3ab6454339fa57b929093f56372c61e
Re: Rebalancing - how to make it faster
Dmitry, It is impossible to disable WAL only for certain partitions without completely overhauling design of Ignite storage mechanism. Right now we can afford only to change WAL mode per cache group. The idea is to disable WAL when node doesn't have any partition in OWNING state, which means it doesn't have any consistent data and won't be able to restore from WAL anyway. I don't see any potential use for WAL on such node, but we can keep a configurable parameter indicating can we automatically disable WAL in such case or not. On Fri, Mar 23, 2018 at 10:40 PM, Dmitry Pavlovwrote: > Denis, as I understood, there is and idea to exclude only rebalanced > partition(s) data. All other data will go to the WAL. > > Ilya, please correct me if I'm wrong. > > пт, 23 мар. 2018 г. в 22:15, Denis Magda : > > > Ilya, > > > > That's a decent boost (5-20%) even having WAL enabled. Not sure that we > > should stake on the WAL "off" mode here because if the whole cluster goes > > down, it's then the data consistency is questionable. As an architect, I > > wouldn't disable WAL for the sake of rebalancing; it's too risky. > > > > If you agree, then let's create the IEP. This way it will be easier to > > track this endeavor. BTW, are you already ready to release any > > optimizations in 2.5 that is being discussed in a separate thread? > > > > -- > > Denis > > > > > > > > On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh > > wrote: > > > > > Denis, > > > > > > > - Don't you want to aggregate the tickets under an IEP? > > > Yes, I think so. > > > > > > > - Does it mean we're going to update our B+Tree implementation? Any > > ideas > > > how risky it is? > > > One of tickets that I created ( > > > https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree > > > modification, but I am not planning to do it in the nearest future. It > > > shouldn't affect existing tree operations, only introduce new ones > > (putAll, > > > invokeAll, removeAll). 
> > > > > > > - Any chance you had a prototype that shows performance optimizations > > the > > > approach you are suggesting to take? > > > I have a prototype for simplest improvements ( > https://issues.apache.org/ > > > jira/browse/IGNITE-8019 & https://issues.apache.org/ > > > jira/browse/IGNITE-8018) > > > - together they increase throughput by 5-20%, depending on > configuration > > > and environment. Also, I've tested different WAL modes - switching from > > > LOG_ONLY to NONE gives over 100% boost - this is what I expect from > > > https://issues.apache.org/jira/browse/IGNITE-8017. > > > > > > On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda > wrote: > > > > > > > Ilya, > > > > > > > > That's outstanding research and summary. Thanks for spending your > time > > on > > > > this. > > > > > > > > Not sure I have enough expertise to challenge your approach, but it > > > sounds > > > > 100% reasonable to me. As side notes: > > > > > > > >- Don't you want to aggregate the tickets under an IEP? > > > >- Does it mean we're going to update our B+Tree implementation? > Any > > > >ideas how risky it is? > > > >- Any chance you had a prototype that shows performance > > optimizations > > > of > > > >the approach you are suggesting to take? > > > > > > > > -- > > > > Denis > > > > > > > > On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh > > > > > wrote: > > > > > > > > > Igniters, > > > > > > > > > > I've spent some time analyzing performance of rebalancing process. > > The > > > > > initial goal was to understand, what limits it's throughput, > because > > it > > > > is > > > > > significantly slower than network and storage device can > > theoretically > > > > > handle. > > > > > > > > > > Turns out, our current implementation has a number of issues caused > > by > > > a > > > > > single fundamental problem. > > > > > > > > > > During rebalance data is sent in batches called > > > > > GridDhtPartitionSupplyMessages. 
Batch size is configurable, > default > > > > value > > > > > is 512KB, which could mean thousands of key-value pairs. However, > we > > > > don't > > > > > take any advantage over this fact and process each entry > > independently: > > > > > - checkpointReadLock is acquired multiple times for every entry, > > > leading > > > > to > > > > > unnecessary contention - this is clearly a bug; > > > > > - for each entry we write (and fsync, if configuration assumes it) > a > > > > > separate WAL record - so, if batch contains N entries, we might end > > up > > > > > doing N fsyncs; > > > > > - adding every entry into CacheDataStore also happens completely > > > > > independently. It means, we will traverse and modify each index > tree > > N > > > > > times, we will allocate space in FreeList N times and we will have > to > > > > > additionally store in WAL O(N*log(N)) page delta records. > > > > > > > > > > I've created a few tickets in JIRA with very
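The per-entry overheads listed above (checkpoint lock per entry, one WAL record and potential fsync per entry, one index traversal per entry) suggest batching at the supply-message level. A minimal sketch of that idea follows; all types are simplified stand-ins for the real Ignite internals, not the actual API:

```java
// Sketch of batched rebalance apply: take the checkpoint read lock once
// per supply-message batch and log a single WAL record for the whole
// batch, instead of doing both per entry.
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class BatchedApply {
    static final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();

    record Entry(long key, byte[] value) {}

    /** Hypothetical WAL that logs one record (and at most one fsync) per batch. */
    interface Wal { void log(List<Entry> batch); }

    static void applyBatch(List<Entry> batch, Wal wal) {
        checkpointLock.readLock().lock(); // acquired once, not per entry
        try {
            wal.log(batch); // one WAL record for N entries
            for (Entry e : batch) {
                // insert into the data store here; with invokeAll/putAll-style
                // bulk tree operations each index tree is traversed once per batch
            }
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        int[] walRecords = {0};
        applyBatch(List.of(new Entry(1L, new byte[0]), new Entry(2L, new byte[0])),
            b -> walRecords[0]++);
        System.out.println("walRecords=" + walRecords[0]); // 1 record for 2 entries
    }
}
```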
[jira] [Created] (IGNITE-8045) Direct-IO component is not published into maven repository.
Vyacheslav Koptilin created IGNITE-8045: --- Summary: Direct-IO component is not published into maven repository. Key: IGNITE-8045 URL: https://issues.apache.org/jira/browse/IGNITE-8045 Project: Ignite Issue Type: Bug Components: build Affects Versions: 2.4 Reporter: Vyacheslav Koptilin Assignee: Vyacheslav Koptilin Direct-IO component is not published into maven repository.
Re: Rebalancing - how to make it faster
+1 for IEP creation 2018-03-23 22:40 GMT+03:00 Dmitry Pavlov: > Denis, as I understood, there is and idea to exclude only rebalanced > partition(s) data. All other data will go to the WAL. > > Ilya, please correct me if I'm wrong. > > пт, 23 мар. 2018 г. в 22:15, Denis Magda : > > > Ilya, > > > > That's a decent boost (5-20%) even having WAL enabled. Not sure that we > > should stake on the WAL "off" mode here because if the whole cluster goes > > down, it's then the data consistency is questionable. As an architect, I > > wouldn't disable WAL for the sake of rebalancing; it's too risky. > > > > If you agree, then let's create the IEP. This way it will be easier to > > track this endeavor. BTW, are you already ready to release any > > optimizations in 2.5 that is being discussed in a separate thread? > > > > -- > > Denis > > > > > > > > On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh > > wrote: > > > > > Denis, > > > > > > > - Don't you want to aggregate the tickets under an IEP? > > > Yes, I think so. > > > > > > > - Does it mean we're going to update our B+Tree implementation? Any > > ideas > > > how risky it is? > > > One of tickets that I created ( > > > https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree > > > modification, but I am not planning to do it in the nearest future. It > > > shouldn't affect existing tree operations, only introduce new ones > > (putAll, > > > invokeAll, removeAll). > > > > > > > - Any chance you had a prototype that shows performance optimizations > > the > > > approach you are suggesting to take? > > > I have a prototype for simplest improvements ( > https://issues.apache.org/ > > > jira/browse/IGNITE-8019 & https://issues.apache.org/ > > > jira/browse/IGNITE-8018) > > > - together they increase throughput by 5-20%, depending on > configuration > > > and environment. 
Also, I've tested different WAL modes - switching from > > > LOG_ONLY to NONE gives over 100% boost - this is what I expect from > > > https://issues.apache.org/jira/browse/IGNITE-8017. > > > > > > On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda > wrote: > > > > > > > Ilya, > > > > > > > > That's outstanding research and summary. Thanks for spending your > time > > on > > > > this. > > > > > > > > Not sure I have enough expertise to challenge your approach, but it > > > sounds > > > > 100% reasonable to me. As side notes: > > > > > > > >- Don't you want to aggregate the tickets under an IEP? > > > >- Does it mean we're going to update our B+Tree implementation? > Any > > > >ideas how risky it is? > > > >- Any chance you had a prototype that shows performance > > optimizations > > > of > > > >the approach you are suggesting to take? > > > > > > > > -- > > > > Denis > > > > > > > > On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh > > > > > wrote: > > > > > > > > > Igniters, > > > > > > > > > > I've spent some time analyzing performance of rebalancing process. > > The > > > > > initial goal was to understand, what limits it's throughput, > because > > it > > > > is > > > > > significantly slower than network and storage device can > > theoretically > > > > > handle. > > > > > > > > > > Turns out, our current implementation has a number of issues caused > > by > > > a > > > > > single fundamental problem. > > > > > > > > > > During rebalance data is sent in batches called > > > > > GridDhtPartitionSupplyMessages. Batch size is configurable, > default > > > > value > > > > > is 512KB, which could mean thousands of key-value pairs. 
However, > we > > > > don't > > > > > take any advantage over this fact and process each entry > > independently: > > > > > - checkpointReadLock is acquired multiple times for every entry, > > > leading > > > > to > > > > > unnecessary contention - this is clearly a bug; > > > > > - for each entry we write (and fsync, if configuration assumes it) > a > > > > > separate WAL record - so, if batch contains N entries, we might end > > up > > > > > doing N fsyncs; > > > > > - adding every entry into CacheDataStore also happens completely > > > > > independently. It means, we will traverse and modify each index > tree > > N > > > > > times, we will allocate space in FreeList N times and we will have > to > > > > > additionally store in WAL O(N*log(N)) page delta records. > > > > > > > > > > I've created a few tickets in JIRA with very different levels of > > scale > > > > and > > > > > complexity. > > > > > > > > > > Ways to reduce impact of independent processing: > > > > > - https://issues.apache.org/jira/browse/IGNITE-8019 - > aforementioned > > > > bug, > > > > > causing contention on checkpointReadLock; > > > > > - https://issues.apache.org/jira/browse/IGNITE-8018 - inefficiency > > in > > > > > GridCacheMapEntry implementation; > > > > > - https://issues.apache.org/jira/browse/IGNITE-8017 - > automatically > > > > > disable > > > > > WAL during
[GitHub] ignite pull request #3699: IGNITE-6842: add default behavior by stopping all...
Github user Mmuzaf closed the pull request at: https://github.com/apache/ignite/pull/3699 ---
Re: Data compression design proposal
+1 to Taras's vision. Compression on eviction is a good way to store more data. Pages in memory are always hot in a real system, so in-memory compression will definitely slow the system down, I think. Anyway, we can split the issue into "on-eviction compression" and "in-memory compression". 2018-03-06 12:14 GMT+03:00 Taras Ledkov: > Hi, > > I guess page level compression make sense on page loading / eviction. > In this case we can decrease I/O operation and performance boost can be > reached. > What is goal for in-memory compression? Holds about 2-5x data in memory > with performance drop? > > Also please clarify the case with compression/decompression for hot and > cold pages. > Is it right for your approach: > 1. Hot pages are always decompressed in memory because many read/write > operations touch ones. > 2. So we can compress only cold pages. > > So the way is suitable when the hot data size << available RAM size. > > Thoughts? > > > On 05.03.2018 20:18, Vyacheslav Daradur wrote: > >> Hi Igniters! >> >> I’d like to do next step in our data compression discussion [1]. >> >> Most Igniters vote for per-data-page compression. >> >> I’d like to accumulate main theses to start implementation: >> - page will be compressed with the dictionary-based approach (e.g.LZV) >> - page will be compressed in batch mode (not on every change) >> - page compression should been initiated by an event, for example, a >> page’s free space drops below 20% >> - compression process will be under page write lock >> >> Vladimir Ozerov has written: >> >>> What we do not understand yet: 1) Granularity of compression algorithm. 1.1) It could be per-entry - i.e. we compress the whole entry content, but respect boundaries between entries. E.g.: before - [ENTRY_1][ENTRY_2], after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to [COMPRESSED ENTRY_1 and ENTRY_2]). v1.2) Or it could be per-field - i.e. we compress fields, but respect binary object layout.
First approach is simple, straightforward, and will give acceptable compression rate, but we will have to compress the whole binary object on every field access, what may ruin our SQL performance. Second approach is more complex, we are not sure about it's compression rate, but as BinaryObject structure is preserved, we will still have fast constant-time per-field access. >>> I think there are advantages in both approaches and we will be able to >> compare different approaches and algorithms after prototype >> implementation. >> >> Main approach in brief: >> 1) When page’s free space drops below 20% will be triggered compression >> event >> 2) Page will be locked by write lock >> 3) Page will be passed to page’s compressor implementation >> 4) Page will be replaced by compressed page >> >> Whole object or a field reading: >> 1) If page marked as compressed then the page will be handled by >> page’s compressor implementation, otherwise, it will be handled as >> usual. >> >> Thoughts? >> >> Should we create new IEP and register tickets to start implementation? >> This will allow us to watch for the feature progress and related >> tasks. >> >> >> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data- >> compression-in-Ignite-tc20679.html >> >> >> > -- > Taras Ledkov > Mail-To: tled...@gridgain.com > >
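The four steps of the "main approach in brief" above can be sketched as follows. This is a toy illustration only: the `Page` type and its lock are hypothetical, and `java.util.zip.Deflater` stands in for the LZV-style dictionary compressor mentioned in the proposal — none of this matches real Ignite page internals:

```java
// Sketch of the proposed flow: (1) trigger when free space drops below
// 20%, (2) take the page write lock, (3) run the compressor,
// (4) replace the page content and mark the page compressed so readers
// know to route it through the decompressor.
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.zip.Deflater;

public class PageCompressionSketch {
    static final double FREE_SPACE_THRESHOLD = 0.20; // from the proposal

    static class Page {
        byte[] data;
        int used; // bytes actually occupied
        volatile boolean compressed;
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        Page(byte[] data, int used) { this.data = data; this.used = used; }

        double freeRatio() { return 1.0 - (double) used / data.length; }
    }

    static void maybeCompress(Page p) {
        // Step 1: compression event fires only below the free-space threshold.
        if (p.compressed || p.freeRatio() >= FREE_SPACE_THRESHOLD)
            return;
        p.lock.writeLock().lock(); // step 2: page write lock
        try {
            byte[] out = new byte[p.data.length];
            Deflater def = new Deflater(); // step 3: stand-in compressor
            def.setInput(p.data, 0, p.used);
            def.finish();
            int len = def.deflate(out);
            def.end();
            p.data = Arrays.copyOf(out, len); // step 4: replace the page
            p.compressed = true; // readers check this flag on access
        }
        finally {
            p.lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        Page p = new Page(new byte[4096], 4000); // ~2% free, below 20% threshold
        maybeCompress(p);
        System.out.println("compressed=" + p.compressed + " size=" + p.data.length);
    }
}
```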
[GitHub] ignite pull request #3690: IGNITE-7928 .NET: Exception is not propagated to ...
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3690 ---
[GitHub] ignite pull request #3693: IGNITE-8036 MulticastIpFinder is replaced with Vm...
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3693 ---
[GitHub] ignite pull request #3700: IGNITE-6839 Ignite Compatibility: flaky test test...
GitHub user daradurvs opened a pull request: https://github.com/apache/ignite/pull/3700 IGNITE-6839 Ignite Compatibility: flaky test testFoldersReuseCompatibility_2_1 & 2_2 & 2_3 ….util.IgniteUtils.classLoaderUrls(Ljava/lang/ClassLoader;)[Ljava/net/URL;" was fixed; code cleanup You can merge this pull request into a Git repository by running: $ git pull https://github.com/daradurvs/ignite ignite-6839 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3700.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3700 commit 4a63069e026537fbecc4d33aca80a3ade25dbfd9 Author: Vyacheslav Daradur Date: 2018-03-26T13:31:40Z ignite-6839: "java.lang.NoSuchMethodError: org.apache.ignite.internal.util.IgniteUtils.classLoaderUrls(Ljava/lang/ClassLoader;)[Ljava/net/URL;" was fixed; code cleanup ---
PR#3697 IGNITE-8021: Delete cache config files when destroyed
Hi Team, Could you review this PR https://github.com/apache/ignite/pull/3697 please? Tests result: https://ci.ignite.apache.org/viewLog.html?buildId=1159660 Jira ticket: https://issues.apache.org/jira/browse/IGNITE-8021 -- Sincerely yours, Ivan Daschinskiy
Re: IEP-14: Ignite failures handling (Discussion)
Yakov, I agree with Andrey that a separate abstraction for failure handling makes sense. First, using event listeners for this kind of response allows users to install multiple listeners, which may be invoked in an unpredictable order; this looks error-prone to me. Second, we may add additional methods to failure handlers in future releases (say, Ignite 3.0), so it is better to have a separate interface right away. I do not mind adding a separate event for this, though, but the event should be used for notifications, not to run any reaction code. --AG 2018-03-23 22:27 GMT+03:00 Yakov Zhdanov: > Andrey, I understand your point but you are trying to build one more > mechanism and introduce abstractions that are already here. Again, please > take a look at segmentation policy and event types we already have. > > Thanks! > > Yakov >
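The single-handler argument above can be illustrated with a minimal sketch. The interface and enum below are hypothetical placeholders, not the eventual IEP-14 API: the point is that exactly one handler with a defined return contract reacts to a failure, while events remain notification-only:

```java
// Minimal sketch of a dedicated failure-handling abstraction: one handler
// with a well-defined contract, instead of an unordered set of event
// listeners running reaction code. All names are illustrative only.
public class FailureHandlingSketch {
    enum FailureType { SEGMENTATION, CRITICAL_ERROR, SYSTEM_WORKER_TERMINATION }

    interface FailureHandler {
        /** @return true if the node should be stopped in response to the failure. */
        boolean onFailure(FailureType type, Throwable cause);
    }

    /** One possible reaction: stop the node on any failure. */
    static class StopNodeHandler implements FailureHandler {
        @Override public boolean onFailure(FailureType type, Throwable cause) {
            System.out.println("Stopping node due to " + type);
            return true;
        }
    }

    public static void main(String[] args) {
        // Exactly one handler is configured; there is no listener ordering problem.
        FailureHandler hnd = new StopNodeHandler();
        boolean stop = hnd.onFailure(FailureType.CRITICAL_ERROR, new AssertionError());
        System.out.println("stop=" + stop);
    }
}
```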
[jira] [Created] (IGNITE-8044) IgniteQueryGenerator.getOptions() method should properly handle empty list of parameters.
Vyacheslav Koptilin created IGNITE-8044: --- Summary: IgniteQueryGenerator.getOptions() method should properly handle empty list of parameters. Key: IGNITE-8044 URL: https://issues.apache.org/jira/browse/IGNITE-8044 Project: Ignite Issue Type: Bug Components: general Affects Versions: 2.4 Reporter: Vyacheslav Koptilin The {code}IgniteQueryGenerator.getOptions(){code} method does not check that the parameters list may be {code}null{code}. Initial discussion of the issue is available on the user list: http://apache-ignite-users.70518.x6.nabble.com/ArrayIndexOutOfBoundsException-1-in-IgniteQueryGenerator-getOptions-tt20801.html
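The defensive check the ticket asks for could look roughly like this. The method shape is a simplified stand-in — the real `IgniteQueryGenerator.getOptions()` signature differs — but it shows the guard that prevents the `ArrayIndexOutOfBoundsException: -1` from the linked user-list thread:

```java
// Sketch of the null/empty guard requested in IGNITE-8044: a null or
// empty parameter array means "no options" rather than an exception.
public class GetOptionsSketch {
    static Object getOptions(Object[] params) {
        // Previously the last element was accessed unconditionally,
        // which throws ArrayIndexOutOfBoundsException(-1) for empty input.
        if (params == null || params.length == 0)
            return null; // no options supplied

        Object last = params[params.length - 1];
        // ...inspect 'last' for paging/sorting options here...
        return last;
    }

    public static void main(String[] args) {
        System.out.println(getOptions(null));            // no exception
        System.out.println(getOptions(new Object[0]));   // no exception
        System.out.println(getOptions(new Object[]{1})); // last parameter
    }
}
```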
Optimize GridLongList serialization
I investigated network load and found that a big part of the internal data inside messages is `GridLongList`. It is a part of `GridDhtTxFinishRequest`, `GridDhtAtomicDeferredUpdateResponse`, `GridDhtAtomicUpdateRequest`, `GridNearAtomicFullUpdateRequest` and `NearCacheUpdates`. So I think it makes sense to optimize `GridLongList` serialization. Here we serialize all elements and don't take the `idx` value into account:

```java
@Override public boolean writeTo(ByteBuffer buf, MessageWriter writer) {
    writer.setBuffer(buf);

    if (!writer.isHeaderWritten()) {
        if (!writer.writeHeader(directType(), fieldsCount()))
            return false;

        writer.onHeaderWritten();
    }

    switch (writer.state()) {
        case 0:
            if (!writer.writeLongArray("arr", arr))
                return false;

            writer.incrementState();

        case 1:
            if (!writer.writeInt("idx", idx))
                return false;

            writer.incrementState();
    }

    return true;
}
```

This is not the case in another serialization method in the same class, which writes only the used prefix:

```java
public static void writeTo(DataOutput out, @Nullable GridLongList list) throws IOException {
    out.writeInt(list != null ? list.idx : -1);

    if (list != null) {
        for (int i = 0; i < list.idx; i++)
            out.writeLong(list.arr[i]);
    }
}
```

So we can simply reduce message size by sending only the valuable part of the array. If you don't mind, I will create an issue in Jira for this. By the way, `long` is a wide type. As far as I can see, in most cases `GridLongList` is used for counters. I checked the possibility of compressing `long` into smaller types such as `int`, `short` or `byte` in the test `GridCacheInterceptorAtomicRebalanceTest` (picked at random) and found out that all `long` values in `GridLongList` can be cast to `int`, and 70% of them to `short`. Such a conversion is quite fast, about 1.1 ns per element (I checked it with a JMH test). Of course, there are a lot of ways to compress data, but I know the proprietary GridGain plug-in has a different `MessageWriter` implementation, so maybe this is unnecessary and some compression already exists in that proprietary plug-in.
Does anyone know anything about this?
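The proposal above — send only the first `idx` elements, and downcast each value to the narrowest integer type that holds it — can be sketched as follows. This is a hedged illustration only: `writeCompact`/`readCompact` are hypothetical names, not Ignite's `MessageWriter` API, and the one-byte width tag is one possible encoding, not the one Ignite would necessarily use.

```java
import java.nio.ByteBuffer;

public class LongListCompact {
    /** Writes the element count, then only the first idx elements,
     *  each prefixed with a 1-byte width tag (1, 2, 4 or 8 bytes). */
    static ByteBuffer writeCompact(long[] arr, int idx) {
        // Worst case: 4-byte count + (tag + full long) per element.
        ByteBuffer buf = ByteBuffer.allocate(4 + idx * 9);
        buf.putInt(idx);

        for (int i = 0; i < idx; i++) {
            long v = arr[i];

            if (v == (byte)v)       { buf.put((byte)1); buf.put((byte)v); }
            else if (v == (short)v) { buf.put((byte)2); buf.putShort((short)v); }
            else if (v == (int)v)   { buf.put((byte)4); buf.putInt((int)v); }
            else                    { buf.put((byte)8); buf.putLong(v); }
        }

        buf.flip();
        return buf;
    }

    /** Restores the long values by dispatching on the width tag. */
    static long[] readCompact(ByteBuffer buf) {
        int size = buf.getInt();
        long[] res = new long[size];

        for (int i = 0; i < size; i++) {
            switch (buf.get()) {
                case 1:  res[i] = buf.get();      break;
                case 2:  res[i] = buf.getShort(); break;
                case 4:  res[i] = buf.getInt();   break;
                default: res[i] = buf.getLong();  break;
            }
        }

        return res;
    }
}
```

For the counter-like values described above (all fitting in `int`, 70% in `short`), this encodes most elements in 3–5 bytes instead of 8, on top of skipping the unused tail of the backing array.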
[jira] [Created] (IGNITE-8043) Thin client fails while connecting to Ignite instance in docker container
Jiri Tobisek created IGNITE-8043: Summary: Thin client fails while connecting to Ignite instance in docker container Key: IGNITE-8043 URL: https://issues.apache.org/jira/browse/IGNITE-8043 Project: Ignite Issue Type: Bug Components: jdbc Affects Versions: 2.4, 2.3, 2.5 Environment: JDK8. Happens both in local dev environment (Mac OSX) and remote Ubuntu 16 server. Docker 17.12.0-ce Reporter: Jiri Tobisek I am running Ignite inside Docker: {{docker run -it -p 11211:11211 apacheignite/ignite:2.4.0}} While trying to connect via the thin client (in Scala): {{val connection: Connection = DriverManager.getConnection(s"""jdbc:ignite:thin://localhost:11211/""")}} I am getting:
{code:java}
java.sql.SQLException: Failed to connect to Ignite cluster [host=localhost, port=11211]
	at org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.<init>(JdbcThinConnection.java:151)
	at org.apache.ignite.IgniteJdbcThinDriver.connect(IgniteJdbcThinDriver.java:170)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:270)
...
Caused by: java.io.IOException: Failed to read incoming message (not enough data).
	at org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.read(JdbcThinTcpIo.java:406)
	at org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.read(JdbcThinTcpIo.java:384)
	at org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.handshake(JdbcThinTcpIo.java:223)
	at org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.start(JdbcThinTcpIo.java:191)
	at org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.<init>(JdbcThinConnection.java:146)
	... 52 more
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration
I have implemented the user credential store in accordance with the previous discussion about user authentication. Now the JDBC thin driver and ODBC support user authentication. We haven't implemented it for all thin clients because the protocols are not similar. I see two ways to implement authentication for the thin client protocol: - You implement authentication on the server side and on the .NET side; - When the Java thin client [1] is merged, I implement authentication for the thin protocol and for the Java thin client. I'll add documentation for user authentication ASAP. Please feel free to contact me if you need more info until the documentation is added. [1] https://issues.apache.org/jira/browse/IGNITE-7421 On 26.03.2018 9:56, Pavel Tupitsyn wrote: I've started this task, and the property name combined with lack of javadoc seems confusing and misleading: * Turns out this authentication is only for thin clients * Not clear how to configure and use it, even after digging through Jira and devlist How do I write test to ensure it works? Thanks, Pavel On Fri, Mar 23, 2018 at 6:44 PM, Pavel Tupitsyn wrote: Thanks, got it, will do. On Fri, Mar 23, 2018 at 4:36 PM, Dmitry Pavlov wrote: Hi Pavel, Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436 Sincerely, Dmitriy Pavlov Fri, 23 Mar 2018 at 16:24, Pavel Tupitsyn: Please provide description in IGNITE-8034 and link Java-side ticket there. On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn wrote: Hi Vladimir, Can you provide more details? * What does it do? * Do we need to only propagate the flag to .NET or do anything else? * Related ticket? Thanks, Pavel On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov < voze...@gridgain.com> wrote: Pavel, We introduced new flag IgniteConfiguration.authenticationEnabled recently. Would you mind adding it to IgniteConfiguration.cs [1]? Vladimir. [1] https://issues.apache.org/jira/browse/IGNITE-8034 -- Taras Ledkov Mail-To: tled...@gridgain.com
[GitHub] ignite pull request #3699: IGNITE-6842: add default behavior by stopping all...
GitHub user Mmuzaf opened a pull request: https://github.com/apache/ignite/pull/3699 IGNITE-6842: add default behavior by stopping all instances for tests 1. Stop all grids by default in the afterTestsStop method; 2. Throw an exception in case the stopping process fails; You can merge this pull request into a Git repository by running: $ git pull https://github.com/Mmuzaf/ignite ignite-6842_2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3699.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3699 commit abeab5dafd90f385e8eaec57e561eb52570b9136 Author: Maxim Muzafarov Date: 2018-03-26T12:10:24Z IGNITE-6842: add default behavior by stopping all instances for tests ---
[GitHub] ignite pull request #3665: ignite-8004 set compatibilityMode to true in coll...
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3665 ---
[GitHub] ignite pull request #3698: Ignite 2.4.4 .net
Github user devozerov closed the pull request at: https://github.com/apache/ignite/pull/3698 ---
[GitHub] ignite pull request #3671: IGNITE-7852: Supported username/password authenti...
Github user asfgit closed the pull request at: https://github.com/apache/ignite/pull/3671 ---
[GitHub] ignite pull request #3698: Ignite 2.4.4 .net
GitHub user devozerov opened a pull request: https://github.com/apache/ignite/pull/3698 Ignite 2.4.4 .net You can merge this pull request into a Git repository by running: $ git pull https://github.com/gridgain/apache-ignite ignite-2.4.4-.net Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3698.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3698 ---
[GitHub] ignite pull request #3697: IGNITE-8021 Delete cache config files when destro...
GitHub user ivandasch opened a pull request: https://github.com/apache/ignite/pull/3697 IGNITE-8021 Delete cache config files when destroyed You can merge this pull request into a Git repository by running: $ git pull https://github.com/ivandasch/ignite ignite-8021 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3697.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3697 commit dd4b0d825c6d801084a3cb2e1ff02a24a1a75718 Author: Ivan Daschinskiy Date: 2018-03-26T09:03:00Z IGNITE-8021 Delete stored file with cache configuration data when cache or group is destroyed. ---
Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Val, There is no sense in using WalMode.NONE in a production environment; it's kept for testing and debugging purposes (including possible user activities like capacity planning). We already print a warning at node start in case WalMode.NONE is set: U.quietAndWarn(log, "Started write-ahead log manager in NONE mode, persisted data may be lost in " + "a case of unexpected node failure. Make sure to deactivate the cluster before shutdown."); Best Regards, Ivan Rakov On 24.03.2018 1:40, Valentin Kulichenko wrote: Dmitry, Thanks for the clarification. So it sounds like if we fix all the other modes as we discuss here, NONE would be the only one allowing corruption. I also don't see much sense in this, and I think we should clearly state this in the doc, as well as print out a warning if NONE mode is used. Eventually, if it's confirmed that there are no reasonable use cases for it, we can deprecate it. -Val On Fri, Mar 23, 2018 at 3:26 PM, Dmitry Pavlov wrote: Hi Val, NONE means that the WAL is disabled and not written at all. Use of this mode is at your own risk. It is possible that restoring state after a crash in the middle of a checkpoint will not succeed. I do not see much sense in it, especially in production. BACKGROUND is a fully functional WAL mode, but it allows some delay before flushing to disk. Sincerely, Dmitriy Pavlov Sat, 24 Mar 2018 at 1:07, Valentin Kulichenko < valentin.kuliche...@gmail.com>: I agree. In my view, any possibility of getting corrupted storage is a bug which needs to be fixed. BTW, can someone explain the semantics of NONE mode? What is the difference from BACKGROUND from the user's perspective? Is there any particular use case where it can be used? -Val On Fri, Mar 23, 2018 at 2:49 AM, Dmitry Pavlov wrote: Hi Ivan, IMO we have to add extra FSYNCs for BACKGROUND WAL. Agree? Sincerely, Dmitriy Pavlov Fri, 23 Mar 2018 at 12:23, Ivan Rakov: Igniters, there's another important question about this matter. Do we want to add extra FSYNCs for BACKGROUND WAL mode?
I think that we have to do it: it will cause a similar performance drop, but if we consider LOG_ONLY broken without these fixes, BACKGROUND is broken as well. Best Regards, Ivan Rakov On 23.03.2018 10:27, Ivan Rakov wrote: Fixes are quite simple. I expect them to be merged into master in a week in the worst case. Best Regards, Ivan Rakov On 22.03.2018 17:49, Denis Magda wrote: Ivan, How quickly are you going to merge the fix into master? Many persistence-related optimizations have already stacked up. Probably, we can release them sooner if the community agrees. -- Denis On Thu, Mar 22, 2018 at 5:22 AM, Ivan Rakov < ivan.glu...@gmail.com> wrote: Thanks all! We seem to have reached a consensus on this issue. I'll just add the necessary fsyncs under IGNITE-7754. Best Regards, Ivan Rakov On 22.03.2018 15:13, Ilya Lantukh wrote: +1 for fixing LOG_ONLY. If the current implementation doesn't protect from data corruption, it doesn't make sense. On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda < dma...@apache.org> wrote: +1 for the fix of LOG_ONLY On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk < alexey.goncha...@gmail.com> wrote: +1 for fixing LOG_ONLY to enforce corruption safety given the provided performance results. 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov < voze...@gridgain.com : +1 for accepting the drop in LOG_ONLY. 7% is not that much, and not a drop at all, provided that we are fixing a bug. I.e., had we implemented it correctly in the first place, we would never have noticed any "drop". I do not understand why someone would want to use the current broken mode. On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov wrote: Hi, I think option 1 is better. As Val said, any mode that allows corruption does not make much sense. What Ivan mentioned here as a drop, in relation to the old DEFAULT mode (FSYNC now), is still a significant performance boost. Sincerely, Dmitriy Pavlov Wed, 21 Mar 2018 at 17:56, Ivan Rakov < ivan.glu...@gmail.com : I've attached benchmark results to the JIRA ticket.
We observe a ~7% drop in "fair" LOG_ONLY_SAFE mode, independent of the WAL compaction flag. It's a pretty significant drop: WAL compaction itself gives only a ~3% drop. I see two options here: 1) Change LOG_ONLY behavior. That implies that we'll be ready to release AI 2.5 with a 7% drop. 2) Introduce LOG_ONLY_SAFE, make it the default, and add a release note to AI 2.5 that we added power-loss durability in the default mode, but the user may fall back to the previous LOG_ONLY in order to retain performance. Thoughts? Best Regards, Ivan Rakov On 20.03.2018 16:00, Ivan Rakov wrote: Val, If a storage is in a corrupted state, does it mean that it needs to be completely removed and the cluster needs to be restarted without data? Yes, there's a chance that in LOG_ONLY all local data will be lost, but only in *power loss / OS crash*
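For context on the thread above: the WAL mode under discussion is selected via `DataStorageConfiguration`. A minimal configuration sketch using standard public Ignite 2.4 APIs (illustrative only, not part of the original thread, and requiring a running Ignite deployment):

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.WALMode;

public class WalModeExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Enable persistence for the default data region; without it the WAL mode is irrelevant.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // FSYNC guarantees durability on power loss / OS crash; LOG_ONLY trades that
        // guarantee for throughput -- exactly the trade-off debated in this thread.
        storageCfg.setWalMode(WALMode.FSYNC);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}
```

The thread is about what guarantee the default value of `setWalMode` should carry: whether LOG_ONLY should be hardened with extra fsyncs (option 1) or a separate LOG_ONLY_SAFE mode should be introduced (option 2).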
[jira] [Created] (IGNITE-8042) .NET thin client: Add username/password to handshake
Vladimir Ozerov created IGNITE-8042: --- Summary: .NET thin client: Add username/password to handshake Key: IGNITE-8042 URL: https://issues.apache.org/jira/browse/IGNITE-8042 Project: Ignite Issue Type: Task Components: platforms, thin client Reporter: Vladimir Ozerov Fix For: 2.5 We added username/password authentication recently (IGNITE-7860). Need to optionally pass username and password in .NET thin client handshake. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration
Taras, Could you please chime in? On Mon, Mar 26, 2018 at 9:56 AM, Pavel Tupitsyn wrote: > I've started this task, and the property name combined with lack of > javadoc seems confusing and misleading: > > * Turns out this authentication is only for thin clients > * Not clear how to configure and use it, even after digging through Jira > and devlist > > How do I write test to ensure it works? > > Thanks, > Pavel > > On Fri, Mar 23, 2018 at 6:44 PM, Pavel Tupitsyn > wrote: > >> Thanks, got it, will do. >> >> On Fri, Mar 23, 2018 at 4:36 PM, Dmitry Pavlov >> wrote: >> >>> Hi Pavel, >>> >>> Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436 >>> >>> Sincerely, >>> Dmitriy Pavlov >>> >>> Fri, 23 Mar 2018 at 16:24, Pavel Tupitsyn: >>> >>> > Please provide description in IGNITE-8034 and link Java-side ticket >>> there. >>> > >>> > On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn >>> > wrote: >>> > >>> > > Hi Vladimir, >>> > > >>> > > Can you provide more details? >>> > > * What does it do? >>> > > * Do we need to only propagate the flag to .NET or do anything else? >>> > > * Related ticket? >>> > > >>> > > Thanks, >>> > > Pavel >>> > > >>> > > On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov < >>> voze...@gridgain.com> >>> > > wrote: >>> > > >>> > >> Pavel, >>> > >> >>> > >> We introduced new flag IgniteConfiguration.authenticationEnabled >>> > recently. >>> > >> Would you mind adding it to IgniteConfiguration.cs [1]? >>> > >> >>> > >> Vladimir. >>> > >> >>> > >> [1] https://issues.apache.org/jira/browse/IGNITE-8034 >>> > >> >>> > > >>> > > >>> > >>> >> >> >
[GitHub] ignite pull request #3696: IGNITE-8034 .NET: Add IgniteConfiguration.Authent...
GitHub user ptupitsyn opened a pull request: https://github.com/apache/ignite/pull/3696 IGNITE-8034 .NET: Add IgniteConfiguration.AuthenticationEnabled You can merge this pull request into a Git repository by running: $ git pull https://github.com/ptupitsyn/ignite ignite-8034 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/ignite/pull/3696.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3696 commit cab781b0559cce224e9ee33a43079ec2b42ed606 Author: Pavel Tupitsyn Date: 2018-03-26T06:40:26Z Add property commit 8959865697496c5bc26731faa456ac89cf46f647 Author: Pavel Tupitsyn Date: 2018-03-26T06:45:23Z read/write commit 886ba12d003ec21d981aeb38e961d04781358341 Author: Pavel Tupitsyn Date: 2018-03-26T06:46:43Z Rename for consistency commit e985e9f2ccf33950007c61503ae0c4ca162bd2fc Author: Pavel Tupitsyn Date: 2018-03-26T06:47:32Z update schema commit dbac34e2f1ca10eb657965350e81863978e701ec Author: Pavel Tupitsyn Date: 2018-03-26T06:48:19Z update schema commit 02bf36c9927dd50b95480ea58886d94b1526aa96 Author: Pavel Tupitsyn Date: 2018-03-26T06:48:56Z updating tests commit 8d8c8935fec50a1dd73717e9a1e7bb542676dcda Author: Pavel Tupitsyn Date: 2018-03-26T06:53:07Z updating tests commit 9ed071ce93ea6b2a88896f3b29cf94253aee7433 Author: Pavel Tupitsyn Date: 2018-03-26T06:58:06Z read/write in Java commit 94f83a1787bfbfd61fdd0f1274a7be5387d80261 Author: Pavel Tupitsyn Date: 2018-03-26T07:12:21Z tests work ---
Re: .NET: Add "authenticationEnabled" flag to IgniteConfiguration
I've started this task, and the property name combined with lack of javadoc seems confusing and misleading: * Turns out this authentication is only for thin clients * Not clear how to configure and use it, even after digging through Jira and devlist How do I write a test to ensure it works? Thanks, Pavel On Fri, Mar 23, 2018 at 6:44 PM, Pavel Tupitsyn wrote: > Thanks, got it, will do. > > On Fri, Mar 23, 2018 at 4:36 PM, Dmitry Pavlov > wrote: > >> Hi Pavel, >> >> Related ticket is https://issues.apache.org/jira/browse/IGNITE-7436 >> >> Sincerely, >> Dmitriy Pavlov >> >> Fri, 23 Mar 2018 at 16:24, Pavel Tupitsyn: >> >> > Please provide description in IGNITE-8034 and link Java-side ticket >> there. >> > >> > On Fri, Mar 23, 2018 at 4:23 PM, Pavel Tupitsyn >> > wrote: >> > >> > > Hi Vladimir, >> > > >> > > Can you provide more details? >> > > * What does it do? >> > > * Do we need to only propagate the flag to .NET or do anything else? >> > > * Related ticket? >> > > >> > > Thanks, >> > > Pavel >> > > >> > > On Fri, Mar 23, 2018 at 2:25 PM, Vladimir Ozerov < >> voze...@gridgain.com> >> > > wrote: >> > > >> > >> Pavel, >> > >> >> > >> We introduced new flag IgniteConfiguration.authenticationEnabled >> > recently. >> > >> Would you mind adding it to IgniteConfiguration.cs [1]? >> > >> >> > >> Vladimir. >> > >> >> > >> [1] https://issues.apache.org/jira/browse/IGNITE-8034 >> > >> >> > > >> > > >> > >> > >