Re: [Maria-developers] Understanding binlog group commit (MDEV-232, MDEV-532, MDEV-25611, MDEV-18959)

2023-05-20 Thread Kristian Nielsen
Marko Mäkelä writes: >> recent binlog files. We want to know when scanning the old binlog file is no >> longer necessary; then we will log a CHECKPOINT_EVENT recording this fact. >> But we don't want to stall waiting for everything to be flushed to disk >> immediately; we can just log the

Re: [Maria-developers] Understanding binlog group commit (MDEV-232, MDEV-532, MDEV-25611, MDEV-18959)

2023-05-12 Thread Kristian Nielsen
Marko Mäkelä writes: > Do you know how MyRocks implements ACID? There is a function Sorry, I do not. Maybe Sergey Petrunya does. > That function does not currently take any parameter (such as THD) to > identify the transaction of interest, and it cannot indicate that the > most recent state

Re: [Maria-developers] Understanding binlog group commit (MDEV-232, MDEV-532, MDEV-25611, MDEV-18959)

2023-05-12 Thread Kristian Nielsen
Marko Mäkelä writes: > later. I understand that you were an early contributor to MariaDB and > implemented some durability I/O optimization between the "distributed > transaction" of binlog and the InnoDB write-ahead log (redo log, > ib_logfile0). Indeed, that was the group commit work:

Re: [Maria-developers] START ALTER replication test cases

2023-05-10 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Howdy Kristian, >> --connection master >> set global binlog_alter_two_phase=true; >> # ... >> ALTER TABLE t1 DROP PRIMARY KEY; >> ALTER TABLE t1 ADD UNIQUE KEY ui (i); >> ALTER TABLE t1 ADD PRIMARY KEY (i); >> >> Note the set *global*

[Maria-developers] START ALTER replication test cases

2023-05-06 Thread Kristian Nielsen
Hi Sachin, Andrei, I'm looking at test cases for START ALTER / Lag-free Alter on Slave. And I happened to notice something in test case rpl.rpl_start_alter_bugs: --connection master set global binlog_alter_two_phase=true; # ... ALTER TABLE t1 DROP PRIMARY KEY; ALTER TABLE t1 ADD

Re: [Maria-developers] d25439f1cbb: MDEV-31140: FLUSH BINARY LOGS DELETE_DOMAIN_ID=(D) can errorneously delete active domains

2023-05-02 Thread Kristian Nielsen
ay unnoticed. Yes, there is a good test coverage, so a bit unlucky, agree. - Kristian. > On the occasion of the International workers' day, let me wish you more > contribution, and inspiration for how to make not just this server > better alone :-)!!! Thanks, and all the best to you too :-) - Kri

Re: [Maria-developers] d25439f1cbb: MDEV-31140: FLUSH BINARY LOGS DELETE_DOMAIN_ID=(D) can errorneously delete active domains

2023-04-27 Thread Kristian Nielsen
the gtid_binlog_state with the GTID_LIST event was flawed, making it possible to delete a domain_id from the binlog state when this should not be allowed. - Kristian. Kristian Nielsen writes: > revision-id: d25439f1cbb79e5467b4249792130bfe524e3f10 > (mariadb-10.11.2-21-gd25439f1cbb) >

Re: [Maria-developers] 489a7fba324: MDEV-29322 ASAN heap-use-after-free in Query_log_event::do_apply_event

2022-09-04 Thread Kristian Nielsen
Kristian Nielsen writes: > BTW, the testcase only fails sporadically (without the fix), because it > depends on whether the SQL thread has had time to read ahead to a new FD > when the CREATE TABLE t1 runs in the worker thread. There's a wait in the > testcase which seems to be intend

Re: [Maria-developers] 489a7fba324: MDEV-29322 ASAN heap-use-after-free in Query_log_event::do_apply_event

2022-09-04 Thread Kristian Nielsen
Hi Sergei, Andrei, The commit message had me confused at first - because there are mechanisms in parallel replication to ensure that things stay alive as long as needed; and because if description_event_for_exec points to invalid (freed) memory, then it would seem to indicate a deeper problem

Re: [Maria-developers] Move on with Gtid strict mode

2022-07-13 Thread Kristian Nielsen
Andrei Elkin writes: > Kristian, salve! Hi Andrei, thanks for interesting comments and discussion! >> 3. To solve (2), we recover the old master with >> --tc-heuristic-recover=rollback. This makes it rollback (=discard) any >> transaction that was not fully committed to disk (does it also roll

Re: [Maria-developers] Move on with Gtid strict mode

2022-07-11 Thread Kristian Nielsen
Hi Andrei, Good to hear from you, and to see that things are still going on with MariaDB GTID replication! I tried looking at the MDEVs that you referred to. Here is how I understand the motivation for this, correct me if I'm wrong: 1. We're considering a slave with "lossless semi-sync slave",

Re: [Maria-developers] 9ea85a70a75: MDEV-24654 GTID event falsely marked transactional

2022-01-11 Thread Kristian Nielsen
Sergei Golubchik writes: >> To the question of the usage of trX cache by non-trX let me answer >> broadly to mention @@binlog_direct_non_transactional_update = false >> leads to aggregation of mixed, say innodb + myisam, events in trx >> cache. > > That's different. The bug summary is "GTID

Re: [Maria-developers] [Commits] 9b999e79a35: MDEV-23108: Point in time recovery of binary log fails when sql_mode=ORACLE

2020-07-23 Thread Kristian Nielsen
sujatha writes: > DBA's enable sql_mode='ORACLE' when they would like to use ORACLE's > PL/SQL language. But this is a property of the individual query executed, not a global property of the binlog. The binlog will in general consist of a mix of queries that require sql_mode=oracle, and queries

Re: [Maria-developers] [Commits] 9b999e79a35: MDEV-23108: Point in time recovery of binary log fails when sql_mode=ORACLE

2020-07-19 Thread Kristian Nielsen
sujatha writes: > In MariaDB 10.3 and later, setting the sql_mode system variable to Oracle > allows the server to understand a subset of Oracle's PL/SQL language. When > sql_mode=ORACLE is set, it switches the parser from the MariaDB parser to > Oracle compatible parser. With this change

Re: [Maria-developers] [MariaDB/server] Proper locking for mysql.gtid_slave_pos truncation (84b437d)

2019-12-18 Thread Kristian Nielsen
Sergey Vojtovich writes: >> when all slave threads are stopped and nothing else is accessing the >> gtid_pos table. > It is a table, so any client connection can be accessing it any time? Yes. What I meant is that - mysql.gtid_slave_pos is a system table, it is not supposed to be modified from

Re: [Maria-developers] [MariaDB/server] Proper locking for mysql.gtid_slave_pos truncation (84b437d)

2019-12-17 Thread Kristian Nielsen
Sergey Vojtovich writes: > ATTN @dr-m, @andrelkin, @SachinSetiya, @knielsen So IIUC, this is about incorrect usage of ha_truncate() in rpl_slave_state::truncate_state_table(). This is used only for SET GLOBAL gtid_slave_pos = "..." when all slave threads are stopped and nothing else is

Re: [Maria-developers] f72427f463d: MDEV-20923:UBSAN: member access within address … which does not point to an object of type 'xid_count_per_binlog'

2019-12-17 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Sujatha, Kristian, howdy. Hi Andrei! > Kristian, the question was about `xid_count_per_binlog' struct > comment -- .. > > > struct xid_count_per_binlog : public ilink { > char *binlog_name; > uint binlog_name_len; > ulong binlog_id; > /* Total

Re: [Maria-developers] Missing memory barrier in parallel replication error handler in wait_for_prior_commit()?

2019-10-11 Thread Kristian Nielsen
sujatha writes: > I have a doubt. A simple fix as per the earlier mail discussion would > be to swap the > > order of assignments as shown below. > > wakeup_error= true > waitee= NULL > Why cannot we use the simpler approach. Please provide your inputs. This is because of the need for memory
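
The ordering question raised here can be sketched in portable C++. This is only an illustration under assumptions, not the server code: `Waiter` and its methods are hypothetical stand-ins for `wait_for_commit`, and `std::atomic` release/acquire ordering is swapped in for whatever barrier primitives the server itself uses. The point it shows is that swapping two plain stores in the source does not by itself guarantee the order in which another thread observes them; a release store paired with an acquire load does.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for wait_for_commit; NOT the MariaDB definition.
struct Waiter {
  int wakeup_error = 0;                      // plain field, published through `waitee`
  std::atomic<const void*> waitee{nullptr};  // non-null while we wait for someone

  // Record that we are waiting for `w` (single-threaded setup in this sketch).
  void register_wait(const void* w) {
    waitee.store(w, std::memory_order_relaxed);
  }

  // Waker side: store the error first, then clear `waitee` with release
  // semantics so the error write cannot be observed after the clear.
  void wakeup(int error) {
    wakeup_error = error;
    waitee.store(nullptr, std::memory_order_release);
  }

  // Waiter fast path without taking a mutex: if `waitee` is seen as null
  // (acquire load), the earlier write to wakeup_error is visible too.
  bool try_fast_path(int* error) const {
    assert(error != nullptr);
    if (waitee.load(std::memory_order_acquire) != nullptr)
      return false;  // not woken yet; caller falls back to the locked slow path
    *error = wakeup_error;
    return true;
  }
};
```

With both stores as plain writes, a reader that sees `waitee == nullptr` could still read a stale `wakeup_error`, whichever order the two assignments are written in.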

Re: [Maria-developers] [Commits] 673e2537249: Fix missing memory barrier in wait_for_commit

2019-09-26 Thread Kristian Nielsen
69e). - Kristian. Kristian Nielsen writes: > revision-id: 673e253724979fd9fe43a4a22bd7e1b2c3a5269e > (mariadb-10.4.4-333-g673e2537249) > parent(s): 8887effe13ad87ba0460d4d3068fb5696f089bb0 > author: Kristian Nielsen > committer: Kristian Nielsen > timestamp: 2019-09-26 17:43:26

Re: [Maria-developers] [Commits] e07caf401c2: MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group.

2019-09-23 Thread Kristian Nielsen
sujatha writes: > revision-id: e07caf401c26cf8144899336d103e4c7aafd3d7a > (mariadb-10.1.41-45-ge07caf401c2) > MDEV-20645: Replication consistency is broken as workers miss the error > notification from an earlier failed group. Great that you could come up with a testcase like this to trigger

[Maria-developers] Missing memory barrier in parallel replication error handler in wait_for_prior_commit()?

2019-09-22 Thread Kristian Nielsen
Hi Andrei (Cc: Sujatha), I noticed another thing with the wait_for_commit error handling while looking at the MDEV-18648 patch. We have code like this: // Wakeup code in wait_for_commit::wakeup(): mysql_mutex_lock(_wait_commit); waitee= NULL; this->wakeup_error= wakeup_error; // Wait

Re: [Maria-developers] [Commits] cde9170709c: MDEV-18648: slave_parallel_mode= optimistic default in 10.5

2019-09-22 Thread Kristian Nielsen
sujatha writes: > Thank you for the review comments. You are right. Setting > rgi->worker_error=1 > for the 2 is the right way to handle. With this, upon reaching > 'finish_event_group' > 2nd will notify 3rd transaction that something went wrong during prior > commit > execution. Ok, great if

Re: [Maria-developers] [Commits] cde9170709c: MDEV-18648: slave_parallel_mode= optimistic default in 10.5

2019-09-20 Thread Kristian Nielsen
sujatha writes: Hi Sujutha, > @sql/sql_class.h > Moved 'wait_for_prior_commit(THD *thd)' method inside sql_class.cc > > @sql/sql_class.cc > Added code to check for 'stop_on_error_sub_id' for event groups which get > skipped > and don't have any preceding group to wait for. This looks like the

Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state

2019-07-18 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > where master maintains an "immediate" (without binlog proxy) > constant connection to slaves (when it breaks the slave would have to > take snapshot, or find a binlog service e.g on some other slave). > [In such a case the master connection would still collect

Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state

2019-07-17 Thread Kristian Nielsen
Andrei Elkin writes: > I would also raise another but relevant topic of maintaining > >gtid_"executed"_pos > > which is an union of all GTID executed regardless of their arrival > method. E.g some of foreign (to the recipient server) domains gtid:s may > The master potentially could be

Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state

2019-07-15 Thread Kristian Nielsen
Hi Andrei! The @@gtid_current_pos exists for one sole purpose. This is to let the user promote a slave as the new master and attach the old master as a slave to the new master. By using master_use_gtid=current_pos, the exact same command can be used to attach a slave to the new master,

Re: [Maria-developers] [Commits] 7cabdc461b2: MDEV-6860 Parallel async replication hangs on a Galera node

2019-07-15 Thread Kristian Nielsen
Sachin Setiya writes: > On Mon, Jul 15, 2019 at 4:00 PM Kristian Nielsen > wrote: >> (I wonder if this isn't just another symptom of the underlying problem that >> Galera has never been integrated properly into MariaDB and the group commit >> algorithm / transaction

Re: [Maria-developers] [Commits] 7cabdc461b2: MDEV-6860 Parallel async replication hangs on a Galera node

2019-07-15 Thread Kristian Nielsen
sachin.set...@mariadb.com writes: > revision-id: 7cabdc461b24fdebe599799d7964efa4b53815e3 > (mariadb-10.1.39-91-g7cabdc461b2) > > MDEV-6860 Parallel async replication hangs on a Galera node > > Wait for previous commit beore preparing next transation for galera > diff --git

Re: [Maria-developers] [Commits] 8bfb140d5dc: Move deletion of old GTID rows to slave background thread

2018-12-07 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: >> Note that the cleanup happens asynchronously, and system load can cause the >> cleanup step to be delayed or > > >even skipped completely in rare cases; > > I only can think of crashes here... Anything else do you mean? There is a small race in the code

Re: [Maria-developers] [Commits] 8bfb140d5dc: Move deletion of old GTID rows to slave background thread

2018-12-06 Thread Kristian Nielsen
Hi Andrei, Thanks for review! I rebased the patch on 10.4, ran it through another buildbot run, and pushed it to 10.4. I think with this patch I'll close MDEV-12147, ok? I wrote up the below documentation, I'm planning on adding it to the knowledgebase, unless it is better to send it to someone

Re: [Maria-developers] [Maria-discuss] Pipeline-stype slave parallel applier

2018-11-25 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > We can notice that in the pipeline method any attempt to start waiting for a > granted lock > to another branch needs interception and erroring out. > We are bound to this because the grated lock is to be released only after 2pc. So this is essentially what

Re: [Maria-developers] [Maria-discuss] Pipeline-stype slave parallel applier

2018-11-21 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > I am yet to check closer thd_rpl_deadlock_check(), perhaps that's why Oh. I definitely think you should get well acquainted with existing parallel replication in MariaDB to undertake a project as ambitious as this one. There are a number of central concepts and

Re: [Maria-developers] [Maria-discuss] Pipeline-stype slave parallel applier

2018-11-19 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Your comments, thoughts and critical notes are most welcome! Thanks for the interesting idea and detailed description, Andrei! I have written some initial comments inline, below: > entering the binlog group (ordered) commit module. They can end up > into binlog

Re: [Maria-developers] MDEV-15740 (InnoDB lost Durability due to incorrect fix of MDEV-11937)

2018-11-02 Thread Kristian Nielsen
Marko Mäkelä writes: > Hi Kristian, > I was under the impression that it affects normal InnoDB too. > But indeed, it looks like you could be right that it is only affecting Galera. Right. The idea is that one of these two cases apply: 1. There is 2-phase commit (eg. between InnoDB and

Re: [Maria-developers] MDEV-15740 (InnoDB lost Durability due to incorrect fix of MDEV-11937)

2018-11-01 Thread Kristian Nielsen
Marko Mäkelä writes: > Could you please take a look at this InnoDB regression in MariaDB 10.2 > and later: > https://jira.mariadb.org/browse/MDEV-15740 > > Because of this bug, we can no longer write tests that would ensure > that certain changes are present in the redo log, before killing and

Re: [Maria-developers] [Commits] 2f4a0c5be2c: Fix accumulation of old rows in mysql.gtid_slave_pos

2018-10-13 Thread Kristian Nielsen
Hi Andrei! I have now pushed the patch to 10.1 (and merged to 10.3 since the merge interacts with the per-engine mysql.gtid_slave_pos feature and requires non-obvious code and testcase changes). And I've started working on the improved 10.3 patch where the deletion is moved to the replication

Re: [Maria-developers] [Commits] 2f4a0c5be2c: Fix accumulation of old rows in mysql.gtid_slave_pos

2018-10-12 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > The 10.1 patch is good. Thanks! > > Maybe moving deletions from record_gtid() into the background thread is too > > big a change for 10.1/10.2 ? But we could use the current patch for 10.1, > > and do the more complicated patch only in 10.3 (or 10.4?). > > > >

Re: [Maria-developers] [Commits] 2f4a0c5be2c: Fix accumulation of old rows in mysql.gtid_slave_pos

2018-10-11 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Why won't we defer the current eager/optimistic old sub-id records > discard in rpl_slave_state::record_gtid() > mysql_mutex_lock(_slave_state); > if ((elist= elem->grab_list()) != NULL) > { > /* Delete any old stuff, but keep around the most recent

Re: [Maria-developers] [Commits] 2f4a0c5be2c: Fix accumulation of old rows in mysql.gtid_slave_pos

2018-10-08 Thread Kristian Nielsen
to interaction with @@gtid_pos_auto_engines. Github branch here: https://github.com/knielsen/server/commits/gtid_table_garbage_rows - Kristian. Kristian Nielsen writes: > revision-id: 2f4a0c5be2c5d5153c4253a49ba8820ab333a9a0 > (mariadb-10.1.35-71-g2f4a0c5be2c) >

Re: [Maria-developers] [Commits] 0f97f6b8398: MDEV-17346 parallel slave start and stop races to workers disappeared

2018-10-05 Thread Kristian Nielsen
Hi Andrei! andrei.el...@pp.inet.fi writes: > revision-id: 0f97f6b8398054ccb0507fbacc76c9deeddd47a4 > (mariadb-10.1.35-71-g0f97f6b8398) > author: Andrei Elkin > timestamp: 2018-10-03 15:42:12 +0300 > message: > > MDEV-17346 parallel slave start and stop races to workers disappeared Ooh, that's

Re: [Maria-developers] Updated Gtid_slave_pos of Untracked domain creates skipped events

2018-07-21 Thread Kristian Nielsen
Sachin Setiya writes: >>> I think that sounds like a very bad idea. The current_pos/slave_pos is the >>> single biggest source of confusion regarding GTID. (In fact, I think it >>> would be best to deprecate/eventually remove current_pos). Better not add >>> to >>> the confusion... >>> >>> If we

Re: [Maria-developers] Updated Gtid_slave_pos of Untracked domain creates skipped events

2018-07-20 Thread Kristian Nielsen
Sachin Setiya writes: > This issue is regarding Mdev-9107 , where we have 3 master master with > do_domain_ids of > I have created a abstract test case for this probem. > (3) > ^ ^ >

Re: [Maria-developers] [Commits] af15686: MDEV-16242: MyRocks: parallel slave on a table without PK can stop with ER_KEY_NOT_FOUND

2018-07-05 Thread Kristian Nielsen
Sergey Petrunia writes: > Btw, I also have figured that MyRocks wasn't making thd_rpl_deadlock_check() > calls and added these: > http://lists.askmonty.org/pipermail/commits/2018-June/012653.html > http://lists.askmonty.org/pipermail/commits/2018-June/012652.html > but I'm still in the process

Re: [Maria-developers] [Commits] af15686: MDEV-16242: MyRocks: parallel slave on a table without PK can stop with ER_KEY_NOT_FOUND

2018-07-05 Thread Kristian Nielsen
pser...@askmonty.org (Sergei Petrunia) writes: > MDEV-16242: MyRocks: parallel slave on a table without PK can stop with > ER_KEY_NOT_FOUND > > DRAFT: If RBR event applier uses a secondary key or a full table scan > to locate a row, force waiting for prior commit to complete. I think if you

Re: [Maria-developers] Conservative parallel slave is "too optimistic" for certain engines

2018-05-31 Thread Kristian Nielsen
Sergey Petrunia writes: > == Symptoms == > When one runs a parallel slave (mode=conservative) and replicates DML for > MyRocks table without a Primary Key, replication may stop with a > ER_KEY_NOT_FOUND error. > == A detail about conservative replication == > The remaining case is > -

Re: [Maria-developers] [Fixed Commit] MDEV-12746 out-of-order retry

2018-02-17 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > The improved patch is here for more comments if you will have. The new patch looks fine to me. - Kristian.

Re: [Maria-developers] [Commit] MDEV-12746 out-of-order retry

2018-02-13 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > First the parent errros out goes to `finish_event_group()' but it's > possible it does not have yet the child in its `subsequent_commits_list' > So I understood so far that the retrying worker needs to check > `stop_on_error_sub_id' that effectively reflects

Re: [Maria-developers] Mdev-10664 Add statuses about optimistic parallel replication stalls.

2018-02-10 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > I see Kristian was somewhat sceptical about using 'group' as an offical > term, but that's the most natural way to name the object at hand. Yeah... but I cannot think of a better term to use here, and "event group" is the proper term for what is

Re: [Maria-developers] [Commit] MDEV-12746 out-of-order retry

2018-02-10 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > commit 3cebb54e6387a7eace1757c82ed0efd6e11590b9 > Author: Andrei Elkin > Date: Fri Feb 9 15:00:23 2018 +0200 > > MDEV-12746 rpl.rpl_parallel_optimistic_nobinlog fails committing >out of order at retry > >

Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state

2017-11-28 Thread Kristian Nielsen
I am sure you can find some who would want something that ignores replicated GTIDs that duplicate GTIDs originating locally. I can only say that my experience is that this can cause unexpected problems, and requires a lot of thought to get a well-defined semantics that users can understand and

Re: [Maria-developers] Interaction between rpl_slave_state and rpl_binlog_state

2017-11-28 Thread Kristian Nielsen
Sachin Setiya writes: > I have some question related to rpl_slave_state. Suppose A circular > async replication between A < -- > B (gtid_ignore_duplicates on) Why do you set gtid_ignore_duplicates? This option is for multi-source replication:

Re: [Maria-developers] [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-29 Thread Kristian Nielsen
>> If you “forget" the domain on the upstream server what happens if >> there >> are downstream slaves? I think you’ll break replication if they >> disconnect >> from this box and try to reconnect. Their GTID information will no >> longer match. >> IMO and if I’ve understood correctly this is

Re: [Maria-developers] [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-21 Thread Kristian Nielsen
Simon, thanks for your detailed answer. I see your point on having access to powerful tools when they are needed, even when such tools can be dangerous when used incorrectly. It reminds me of the old "goto considered harmful" which I never agreed with. It occurs to me that there are actually

Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-14 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Then a function to discard a domain term would do: > > SET @@gtid_binlog_state=gtid_discard_domain(@@gtid_binlog_state,'d') > > While this time it would be new object introduced still it's of var > setting semantics and might be generally useful too. I'm not
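
To make the proposed `gtid_discard_domain(@@gtid_binlog_state,'d')` concrete, here is a minimal sketch of what such a helper might compute. It assumes the state is a comma-separated list of `domain-server-sequence` GTIDs; the C++ function below is hypothetical, only borrows the name from the proposal, and does none of the safety checks a server-side implementation would need.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: return `state` with every GTID of `domain` removed.
// `state` is assumed to look like "0-1-100,1-2-50" (domain-server-seq_no).
std::string gtid_discard_domain(const std::string& state, std::uint32_t domain) {
  std::istringstream in(state);
  std::string gtid;
  std::string out;
  while (std::getline(in, gtid, ',')) {
    const std::size_t dash = gtid.find('-');
    assert(dash != std::string::npos);       // each entry must contain a domain part
    if (std::stoul(gtid.substr(0, dash)) == domain)
      continue;                              // drop GTIDs of the discarded domain
    if (!out.empty())
      out += ',';
    out += gtid;                             // keep everything else unchanged
  }
  return out;
}
```

For example, discarding domain 1 from `0-1-100,1-2-50,0-3-90` leaves `0-1-100,0-3-90`.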

Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-13 Thread Kristian Nielsen
Andrei Elkin writes: > And really why not > > 3. SET @@GLOBAL.gtid_binlog_state=list-without-d; The main issue I see with this if the master is actively adding new transactions (in other domains than d). It will be hard for the user to know the right value to set

Re: [Maria-developers] [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-12 Thread Kristian Nielsen
Simon Mudd writes: > ids. Obviously once all appropriate bin logs have been purged > (naturally by other means) then no special processing will be needed. Right. Hence my original idea (which was unfortunately never implemented so far). If at some point a domain has been

Re: [Maria-developers] A problem with implementing Group Commit with Binlog with MyRocks

2017-09-11 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: >> 2. To make START TRANSACTION WITH CONSISTENT SNAPSHOT actually correctly >> synchronise snapshots between multiple storage engines (MySQL does not have >> this, I think). > > (Offtopic, but anyway what it is? Multi-engine transaction with this > specific

Re: [Maria-developers] A problem with implementing Group Commit with Binlog with MyRocks

2017-09-11 Thread Kristian Nielsen
Kristian Nielsen <kniel...@knielsen-hq.org> writes: > single thread. In MySQL, _both_ prepare and commits are so grouped from a > single thread (though I think one thread can do group prepare in parallel > with another doing group commit). Ehm, this is not true, of course. The

Re: [Maria-developers] A problem with implementing Group Commit with Binlog with MyRocks

2017-09-11 Thread Kristian Nielsen
Sergey Petrunia writes: > == Some background == > > "group commit with binlog" feature needs to accomplish two goals: > > 1. Keep the binlog and the storage engine in sync. > storage_engine->prepare(sync=true); > binlog->write(sync=true); > storage_engine->commit(); > >
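
The prepare / binlog write / commit sequence quoted above can be sketched as a toy model. Everything here is an assumption made for illustration: `Engine`, `Binlog`, `commit_transaction` and `recover` are hypothetical names, durability is modelled by in-memory sets, and the recovery rule shown (commit a prepared transaction if it reached the binlog, otherwise roll it back) is the usual two-phase-commit convention rather than code taken from the server.

```cpp
#include <cassert>
#include <set>

// Toy storage engine: tracks which transaction ids are prepared or committed.
struct Engine {
  std::set<int> prepared;
  std::set<int> committed;
  void prepare(int xid) { prepared.insert(xid); }
  void commit(int xid) {
    assert(prepared.count(xid) == 1);  // commit only what was prepared
    prepared.erase(xid);
    committed.insert(xid);
  }
  void rollback(int xid) { prepared.erase(xid); }
};

// Toy binlog: the set of transaction ids durably written.
struct Binlog {
  std::set<int> written;
  void write(int xid) { written.insert(xid); }
};

// Normal path: the three steps in the order given in the message.
void commit_transaction(Engine& engine, Binlog& binlog, int xid) {
  engine.prepare(xid);  // step 1: durable prepare in the engine
  binlog.write(xid);    // step 2: durable write to the binlog
  engine.commit(xid);   // step 3: commit in the engine
}

// Crash recovery: the binlog decides the fate of prepared transactions.
void recover(Engine& engine, const Binlog& binlog) {
  const std::set<int> pending = engine.prepared;  // copy: we modify the set below
  for (int xid : pending) {
    if (binlog.written.count(xid) == 1)
      engine.commit(xid);    // reached the binlog: must be committed
    else
      engine.rollback(xid);  // never reached the binlog: discard
  }
}
```

A transaction interrupted between steps 2 and 3 ends up committed after `recover`, and one interrupted between steps 1 and 2 is rolled back, so the engine and the binlog stay in sync.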

Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-08 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > 1. Take note of @@global.gtid_binlog_state > 2. Ensure that all slaves are past the last event of being deleted domain 'd' > 3. PURGE BINARY LOGS DELETE DELETE 'd' > > The effect of the last step would include purging all the binary log > files plus a planned

Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-07 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > If my concern is practical we may consider *optionally* strict > delete domain FLUSH LOGs. The errored out version would maintain a In that case, I would compare to SET GLOBAL gtid_binlog_state. Currently, this is even more restricted, it is only allowed when

Re: [Maria-developers] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

2017-09-06 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes: > Let me propose methods to clean master off unused gtid domains. > I would be glad to hear your opinions, dear colleagues. So a bit of background: The central idea in MariaDB GTID is the sequence of events that created the current master state. This is an

Re: [Maria-developers] commit_checkpoint_request() vs. thd_get_durability_property() (in relation to MDEV-11937)

2017-08-08 Thread Kristian Nielsen
; the affected test files. The Oracle policy should be to only withhold test > files for security bugs; we should take full advantage of the public tests. > > Best regards, > > Marko > > On Mon, Aug 7, 2017 at 2:25 PM, Jan Lindström <jan.lindst...@mariadb.com> > wrot

[Maria-developers] commit_checkpoint_request() vs. thd_get_durability_property() (in relation to MDEV-11937)

2017-08-07 Thread Kristian Nielsen
Monty asked me to fix MDEV-11937. This particular one is a performance regression in InnoDB commit. But there is a wider problem that I thought I should explain, so it can be perhaps avoided in the future. MariaDB and MySQL use different mechanisms for storage engines to avoid having to fsync

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-07-11 Thread Kristian Nielsen
umented. I hope this helps. Feel free to ask again if you have more questions or if something was unclear, and thanks for helping improve the documentation. - Kristian. > > > On 03/07/2017 15:15, Kristian Nielsen wrote: >> I have now pushed the code to 10.3. It should appear in an upc

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-07-03 Thread Kristian Nielsen
I have now pushed the code to 10.3. It should appear in an upcoming MariaDB 10.3.1 release, IIUC. Following the discussion so far, the default for --gtid-pos-auto-engines is currently empty. It can be easily changed later (eg. to innodb,tokudb,rocksdb) simply by changing the default value in

Re: [Maria-developers] MDEV-12179: Per-engine mysql.gtid_slave_pos: auto-configuring/packaging

2017-06-26 Thread Kristian Nielsen
Sergey Petrunia writes: > Suppose there is a transactional storage engine that is shipped as a loadable > module. The examples are MyRocks and TokuDB. > > I think it the default behavior for such engine after MDEV-12179 should be > that > the engine is listed in

Re: [Maria-developers] Race Condition in Seconds behind master.

2017-05-02 Thread Kristian Nielsen
Sachin Setiya writes: > Then I executed Show slave status (couple of times in a second), and > found Seconds_behind_master is changing arbitrarily I have always found the semantics of Seconds_behind_master complex to understand. I wrote an analysis some time ago:

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-04-26 Thread Kristian Nielsen
I have now most of the implementation of MDEV-12179 done. I wanted to present the way the feature now looks, and point to the code, in case there are any further comments on the design or implementation before it is finalised. To recap, the idea is to improve performance when using multiple

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-03-09 Thread Kristian Nielsen
Jonas Oreland writes: > how about > --gtid_auto_create_engine_list= > > default value = innodb,tokudb > > (and the stored proc) Yes, I like that a lot better, thanks for the suggestion. And following Jean-François' suggestion, I guess the default would be empty in the first

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-03-08 Thread Kristian Nielsen
jocelyn fournier writes: > Why not using the XA support flag of the engine to check if we should > create the gtid_slave_pos_{engine} file ? The only additional engine currently that supports XA is Spider, if my grep skills are not failing me. But Spider does not

Re: [Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-03-08 Thread Kristian Nielsen
Will Fong writes: > What about a configuration setting in my.cnf that will do the above magically? > > I would prefer not making direct changes to the mysql database. I agree that this is not ideal. But do you have a suggestion for how the semantics of such an option

[Maria-developers] RFC: new replication feature "per-engine mysql.gtid_slave pos"

2017-03-06 Thread Kristian Nielsen
I plan to implement MDEV-12179, per-engine mysql.gtid_slave_pos. Here is a description of the high-level design, as a request for comments and/or suggestion for changes. The purpose of this is to fix a serious performance issue in replication when multiple storage engines are used. Every

Re: [Maria-developers] Mdev-10664 Add statuses about optimistic parallel replication stalls.

2017-01-26 Thread Kristian Nielsen
Sachin Setiya writes: >> Why did you decide to put this information into a status variable? >> Normally, >> slave status is seen in SHOW SLAVE STATUS, which supports showing status >> for >> a particular connection without using @@default_master_connection. >> >> Sorry

Re: [Maria-developers] Fwd: Total No of events in event group.

2017-01-10 Thread Kristian Nielsen
Sachin Setiya writes: > I have created a patch for this mdev. but I was thinking for monitoring > command 'SHOW PROCESSLIST' > in Progress column can we show how much % of the events from Event group > Slave has applied so that > user can get a better monitoring. > If

Re: [Maria-developers] Fwd: Total No of events in event group.

2017-01-10 Thread Kristian Nielsen
Sachin Setiya writes: > Okay, So it there a better way to show progress in slave ? You have not explained what you mean by "showing progress". Doesn't SHOW SLAVE STATUS already show event-by-event progress in the relay log position? For progress inside a row event,

Re: [Maria-developers] Total No of events in event group.

2017-01-10 Thread Kristian Nielsen
Sachin Setiya writes: > Can we do something like send total no of Rows_log_events in a event > group to slave. We can write this info in Gtid_log_event. Reason for > doing this is that on slave we can show how much progress we have > made. My immediate impression is

Re: [Maria-developers] Mdev-10664 Add statuses about optimistic parallel replication stalls.

2017-01-08 Thread Kristian Nielsen
Sachin Setiya writes: > To my surprise slave_parallel_mode is replication channel specific , > while slave_parallel threadsis a global variable, Why So ? There is a single pool of worker threads, to enable threads to be shared among multi-source connections - for

Re: [Maria-developers] Mdev-10664 Add statuses about optimistic parallel replication stalls.

2017-01-08 Thread Kristian Nielsen
Sachin Setiya writes: > I am stuck at one problem in Mdev-10664. Suppose there is multi-source > replication ('master1' and 'master2') > and we want to update a status var. How do we know which master_info to > update? Do slave threads have the current > replication channel

Re: [Maria-developers] TRUE LOCK=NONE for slave

2016-12-29 Thread Kristian Nielsen
Sachin Setiya writes: > I am thinking of implementing true lock=none for slaves. Currently Can you explain what you mean by "true LOCK=NONE"? It is not clear from your description. I think you mean something that will allow to run an ALTER TABLE on the slave in

Re: [Maria-developers] [Commits] 9416779: MDEV-11636 Extra persistent columns on slave always gets NULL in RBR

2016-12-27 Thread Kristian Nielsen
abc writes: > I have updated the patch, please have a look, > revision-id: 941677928aa40aa1f5abf981b74d2c3c80441459 > (mariadb-galera-10.0.28-8-g9416779) > parent(s): be430b80df0cdd4eba32df1570195721dbfd1b39 > author: Sachin Setiya > committer: Sachin Setiya >

Re: [Maria-developers] [Commits] b60fb6d: MDEV-11636 Extra persistent columns on slave always gets NULL in RBR

2016-12-22 Thread Kristian Nielsen
sachin.set...@mariadb.com writes: > revision-id: b60fb6daee2fcc2f433a50b5a5639065f5a46fe8 > (mariadb-galera-10.0.28-8-gb60fb6d) > parent(s): be430b80df0cdd4eba32df1570195721dbfd1b39 > author: Sachin Setiya > committer: Sachin Setiya > timestamp: 2016-12-22 19:22:36 +0530 > message: > >

Re: [Maria-developers] Fix for TokuDB and parallel replication

2016-12-09 Thread Kristian Nielsen
jocelyn fournier writes: > It seems the entry corresponds to a table having a trigger, don't know > if it could have an impact on the parallel replication ? Triggers should work ok, otherwise it would be a bug. But it's hard to tell what is going on without a way

Re: [Maria-developers] Fix for TokuDB and parallel replication

2016-12-09 Thread Kristian Nielsen
jocelyn fournier writes: > I've just tried your tokudb_optimistic_parallel_replication branch, > and it behaves very strangely: the SQL thread stops by itself without > any replication error when the parallel_mode is set to optimistic. That's strange, the log looks

Re: [Maria-developers] 1579140: MDEV-11005: Incorrect error message when using ONLINE alter table with GIS

2016-11-30 Thread Kristian Nielsen
Sergei Golubchik writes: > ER_SQL_SLAVE_SKIP_COUNTER_NOT_SETTABLE_IN_GTID_MODE, the next error > message, was added in 10.0. If you add a new error message here, then > the error number for ER_SQL_SLAVE_SKIP_COUNTER_NOT_SETTABLE_IN_GTID_MODE > will change. But 10.0 is GA, and
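The renumbering concern raised here can be illustrated with a small sketch. The enum names and the base value 1900 below are hypothetical and are not taken from the MariaDB error list; the sketch only assumes that error numbers are assigned sequentially in declaration order, as the quoted text implies. Inserting a new message before an existing one shifts the existing number, while appending at the end leaves it unchanged.

```c
#include <assert.h>

/* Hypothetical error list as shipped in a GA release. */
enum shipped_errors {
    SHIPPED_ER_FIRST = 1900,      /* illustrative base value only */
    SHIPPED_ER_SKIP_COUNTER       /* 1901: clients may already depend on this */
};

/* New message inserted BEFORE the existing one: the old number shifts. */
enum inserted_errors {
    INSERTED_ER_FIRST = 1900,
    INSERTED_ER_NEW_MESSAGE,      /* takes 1901 */
    INSERTED_ER_SKIP_COUNTER      /* now 1902: no longer matches the GA release */
};

/* New message APPENDED at the end: existing numbers are preserved. */
enum appended_errors {
    APPENDED_ER_FIRST = 1900,
    APPENDED_ER_SKIP_COUNTER,     /* still 1901 */
    APPENDED_ER_NEW_MESSAGE       /* 1902 */
};

/* Compile-time checks of the numbering described in the comments above. */
static_assert(INSERTED_ER_SKIP_COUNTER != SHIPPED_ER_SKIP_COUNTER,
              "inserting shifts the existing error number");
static_assert(APPENDED_ER_SKIP_COUNTER == SHIPPED_ER_SKIP_COUNTER,
              "appending keeps the existing error number");
```

Under this assumption, any client or script that matches on the numeric code of the shipped error would only keep working in the appended layout.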

[Maria-developers] Fix for TokuDB and parallel replication

2016-11-28 Thread Kristian Nielsen
Parallel replication so far did not work well with TokuDB, as some people who tried it found out. I have now pushed to 10.1 some patches to solve the problems. There are two main fixes: 1. Fix some races where a waiting transaction would miss its wakeup and get a lock timeout on a waiting row

Re: [Maria-developers] Mdev-10715 -- Galera: Replicate MariaDB GTID to other nodes in the cluster

2016-11-15 Thread Kristian Nielsen
Sachin Setiya writes: > sachin_setiya_7: so maybe the problem is - that a node > broadcasts its write set before the commit order has been determined? > > I do not think , this is the problem. Galera enforces the commit order. > Yes, it broadcast write set in prepare

Re: [Maria-developers] Deleting unused branches on github

2016-11-07 Thread Kristian Nielsen
encryption 1 year, 1 month ago Jan Lindström <jan.lindst...@mariadb.com> origin/bb-10.1-default 1 year, 3 months ago Monty <mo...@mariadb.org> origin/10.0-FusionIO-Galera 1 year, 4 months ago Jan Lindström <jan

[Maria-developers] Deleting unused branches on github

2016-11-07 Thread Kristian Nielsen
<jan.lindst...@mariadb.com> origin/10.0-custombld 1 year, 4 months ago Kristian Nielsen <kniel...@knielsen-hq.org> origin/10.1-window 1 year, 4 months ago Vicentiu Ciorbaru <vicen...@mariadb.org> origin/10.0-FusionIO

Re: [Maria-developers] security spring cleaning in MariaDB org on github

2016-11-07 Thread Kristian Nielsen
Sergei Golubchik writes: > But I certainly trust you to be one of them, so if you'd want to have > owner access for mariadb org on github, you can have it, I think. That > would mean actually using it, making changes as needed, on a regular > basis. Well, I'm happy to help with

Re: [Maria-developers] security spring cleaning in MariaDB org on github

2016-11-06 Thread Kristian Nielsen
Sergei Golubchik writes: > And owners are much too powerful to be treated lightly > https://help.github.com/articles/permission-levels-for-an-organization/ > Because that was a small admin task, something similar is done almost > every second day. I see. I am sad - and hurt

Re: [Maria-developers] security spring cleaning in MariaDB org on github

2016-11-05 Thread Kristian Nielsen
Sergei Golubchik writes: > If you think you need admin access, please request it (again). Yes, please restore my access to the repo. I use it regularly, to work with web hooks, see how the repo is setup, etc. > we're performing some spring cleaning in this area. Who are

Re: [Maria-developers] [MariaDB/server] MDEV-11065 - Compressed binary log (#247)

2016-11-03 Thread Kristian Nielsen
vinchen writes: > The new code is here: > https://github.com/vinchen/server/commits/GCSAdmin-10.2-binlog-compressed2-2 > And added two fixes: > 1. Avoid overflowing buffers in case of corrupt events > 2. Check the compression algorithm. Looks fine, thanks for

Re: [Maria-developers] Patch for unaligned word access in CONNECT storage engine

2016-10-28 Thread Kristian Nielsen
>>> Is this just a sort of precaution or did bus errors actually occur? I've included James Cowgill who saw the problem and made the patch. James, do you have a stacktrace showing a bus error, to narrow down where exactly the code is doing unaligned accesses? > What is strange is that

[Maria-developers] Patch for unaligned word access in CONNECT storage engine

2016-10-21 Thread Kristian Nielsen
Who should be contacted about issues in the CONNECT storage engine? The attached patch is from Debian Bug#838914 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838914 Apparently the code does direct unaligned accesses of word data. This works fine on the x86 architecture, but on some other
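For readers unfamiliar with the class of problem described here, the following is a minimal sketch of one common portable way to read and write word-sized data at a possibly unaligned address, using memcpy into a properly aligned local variable. It is not necessarily what the attached patch does, and the function names load_u32, store_u32 and unaligned_roundtrip_check are hypothetical helpers, not CONNECT code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
   Dereferencing a cast pointer such as *(const uint32_t *)p is undefined
   behaviour when p is misaligned and can raise a bus error on strict
   architectures; copying the bytes avoids the misaligned load. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address that may not be 4-byte aligned. */
void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Round-trip a value through a deliberately odd offset in a byte buffer. */
void unaligned_roundtrip_check(void)
{
    unsigned char buf[8] = {0};
    store_u32(buf + 1, 0xDEADBEEFu);
    assert(load_u32(buf + 1) == 0xDEADBEEFu);
}
```

Compilers typically turn these fixed-size memcpy calls into a single load or store on architectures that permit unaligned access, so the portable form usually costs nothing on x86.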

[Maria-developers] Patch for atomics on MIPS in mroonga

2016-10-21 Thread Kristian Nielsen
Who should be contacted about issues in the mroonga storage engine? The attached patch is from Debian Bug#838914 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838914 Apparently, libatomic is needed on this platform to support 64-bit atomic operations. The patch looks reasonable and

Re: [Maria-developers] [MariaDB/server] Compressed binary log (#247)

2016-10-21 Thread Kristian Nielsen
Simon Mudd writes: >> This would result in higher overhead on each event. There is a fixed header > Ok. I’ve been assuming the headers were small (from some casual browsing of > things > related to the binlog router some time ago), but that may be wrong. Yes, they are

Re: [Maria-developers] [MariaDB/server] MDEV-11064 - Restrict the speed of reading binlog from Master (#246)

2016-10-21 Thread Kristian Nielsen
vinchen writes: > cli_safe_read_reallen() and my_net_read_packet_reallen() are a good > way to fix the ABI problem. I will fix it like this. > > And the minimum precision is second in slave_sleep(), and it also a > the mutex. I think it is too heavy in most cases. (It

Re: [Maria-developers] [MariaDB/server] Compressed binary log (#247)

2016-10-21 Thread Kristian Nielsen
Simon Mudd writes: > I have not looked at the code in detail but a couple of questions > related to the implementation come to mind: The author is probably the best to answer, but I can tell what I know from reviewing the patch: > (1) Why not use a single compressed

Re: [Maria-developers] [MariaDB/server] Compressed binary log (#247)

2016-10-20 Thread Kristian Nielsen
GCSAdmin writes: > We add new event types to support compressing the binlog, as follows: > QUERY_COMPRESSED_EVENT, > WRITE_ROWS_COMPRESSED_EVENT_V1, > UPDATE_ROWS_COMPRESSED_EVENT_V1, > DELETE_ROWS_COMPRESSED_EVENT_V1, > WRITE_ROWS_COMPRESSED_EVENT,
