Re: [Maria-discuss] Backup on the replication server getting affected
ragul rangarajan writes:
> Indeed the environment where we are able to see the issue is in *MariaDB
> 10.6.10 *and using pool-of-threads.

Cool, thanks ragul, that confirms that your issue is caused by MDEV-29843.

 - Kristian.

___
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
Re: [Maria-discuss] Backup on the replication server getting affected
ragul rangarajan writes:
> Hope my issue is more related to the issue MDEV-30780 optimistic parallel
> slave hangs after hit an error
> Trying to reproduce with a minimal database.
>
> Attaching the gdb output

Thanks, that gdb output is really helpful! I agree with Andrei that this
rules out MDEV-30780 as the cause. Instead it looks to be caused by
MDEV-29843, see also MDEV-31427:

  https://jira.mariadb.org/browse/MDEV-29843
  https://jira.mariadb.org/browse/MDEV-31427

This is seen in the stack trace, where all the other worker threads are
waiting on one which is stuck inside pthread_cond_signal():

---
Thread 80 (Thread 0x7f47ad065700 (LWP 25417)):
#0  0x7f789dca054d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7f789dc9e14d in pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#2  0x55de401c23cd in inline_mysql_cond_signal (that=0x7f4798006b78) at /home/buildbot/buildbot/build/include/mysql/psi/mysql_thread.h:1099
#3  dec_pending_ops (state=, this=0x7f4798006b30) at /home/buildbot/buildbot/build/sql/sql_class.h:2535
#4  thd_decrement_pending_ops (thd=0x7f47980009b8) at /home/buildbot/buildbot/build/sql/sql_class.cc:5142
#5  0x55de407b5726 in group_commit_lock::release (this=this@entry=0x55de41f0da80 , num=num@entry=216757233923465) at /home/buildbot/buildbot/build/storage/innobase/log/log0sync.cc:388
#6  0x55de407a0a3c in log_write_up_to (lsn=, lsn@entry=216757233923297, flush_to_disk=flush_to_disk@entry=false, rotate_key=rotate_key@entry=false, callback=, callback@entry=0x7f47ad064090) at /home/buildbot/buildbot/build/storage/innobase/log/log0log.cc:844
---

The pthread_cond_signal() function normally can never block, so this
indicates some corruption of the underlying condition object. This object is
used to asynchronously complete a query on a client connection when using
the thread pool. The MDEV-29843 patch makes worker threads not use this
asynchronous completion, which should eliminate this problem.
The stack trace strongly indicates MDEV-29843 as the cause. Except that the
MDEV-29843 patch is supposed to be in MariaDB 10.6.11, and you wrote:

> Environment: MariaDB 10.6.11

Can you double-check if you are really seeing this hang in 10.6.11, or if it
could have been 10.6.10 (the only version that is supposed to be vulnerable
to MDEV-29843)?

Another thing you can check is if you are using
--thread-handling=pool-of-threads, which I think is related to the
MDEV-29843 issue. In MDEV-31427 I suggest
--thread-handling=one-thread-per-connection as a possible work-around.

Hope this helps,

 - Kristian.
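For reference, that work-around could be applied as a server config fragment
(a sketch only; this assumes the option is set in my.cnf and the server is
restarted, since thread_handling is not changeable at runtime):

```ini
# Work-around suggested in MDEV-31427 until the MDEV-29843 fix is in
# place: disable the thread pool on the affected replica.
[mysqld]
thread_handling = one-thread-per-connection
```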
Re: [Maria-discuss] Privilege Question
Scott Canaan writes:
> Thank you. I found SUPER, but was trying to avoid using it as it
> gives too many privileges. I was looking for something more
> fine-grained.

Maybe you can define a stored procedure with SQL SECURITY DEFINER (and a
DEFINER with the SUPER privilege) that sets the desired syslog global system
variables. Then you can grant the ITS_READ account access to the stored
procedure, which will give access only to set the syslog configuration.

Hope this helps,

 - Kristian.

> On Apr 06, Scott Canaan wrote:
>> We are on MariaDB 10.5.18. There is a requirement to send all syslog
>> data to a central syslog server. In the past, we did it using a login
>> called ITS_READ. It has limited privs on purpose, but used to be able
>> to execute the SET GLOBAL statements that we needed. Those statements
>> are:
>>
>> SET GLOBAL server_audit_output_type=SYSLOG;
>> SET GLOBAL server_audit_logging=1;
>> SET GLOBAL server_audit_syslog_facility=LOG_LOCAL2;
>> SET GLOBAL server_audit_events="connect,table,query_ddl,query_dcl";
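A minimal sketch of that approach (the procedure name, definer account, and
database are illustrative, not from the original mail; the definer account
must itself hold SUPER):

```sql
-- Sketch only: names are hypothetical.  The DEFINER must hold SUPER.
DELIMITER //
CREATE DEFINER=`admin`@`localhost` PROCEDURE mysql.enable_audit_syslog()
    SQL SECURITY DEFINER
BEGIN
    -- The four SET GLOBAL statements from the original requirement:
    SET GLOBAL server_audit_output_type = SYSLOG;
    SET GLOBAL server_audit_logging = 1;
    SET GLOBAL server_audit_syslog_facility = LOG_LOCAL2;
    SET GLOBAL server_audit_events = 'connect,table,query_ddl,query_dcl';
END //
DELIMITER ;

-- ITS_READ can now change only these settings, nothing else:
GRANT EXECUTE ON PROCEDURE mysql.enable_audit_syslog TO 'ITS_READ'@'%';
```

ITS_READ would then run `CALL mysql.enable_audit_syslog();` instead of the
raw SET GLOBAL statements.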
Re: [Maria-discuss] Code of Conduct
Andrew Hutchings writes:
> Instead the intention is to discourage people from personal attacks at
> each other, which negatively affects the group as a whole.

It doesn't affect anyone at all except those that choose to read them and
have nothing better to spend their time on than react to them and feed the
trolls.

The way to improve the mailing lists is to move the traffic from mariadb.com
/ mariadb.org internal mailing lists/private mail onto the public lists. Not
to put up systems to reduce the traffic even further (however low the value
of such traffic may currently be).

 - Kristian.
Re: [Maria-discuss] MariaDB master-slave chained replication and parallelism
andrei.el...@pp.inet.fi writes:
> And the above transition can be explained by
> MDEV-24654 GTID event falsely marked transactional, its patch is under
> review.

Oh, yes, this bug sounds like it could result in what Jan described. It was
not clear to me from the bug description exactly under what conditions the
bug occurs, but if the first slave marks the replicated transactions as
"transactional" in its binlog, then the observed behaviour could occur.

The question then is how the chained slaves manage to run MyISAM
transactions in parallel without getting conflicts and hanging. One
possibility is that these are mostly insert-only queries (as Jan mentioned
in another mail), and I believe MyISAM can handle insert-only queries in
parallel without locks and conflicts. Would require a bit more research to
be sure this is the explanation, but it seems a possibility.

 - Kristian.
Re: [Maria-discuss] MariaDB master-slave chained replication and parallelism
Jan Křístek writes:
> We have a MariaDB 10.3 replication setup with one master and a few chained
> slaves (each has log_slave_updates switched on). Master uses mainly MyISAM
> tables, slaves have about 10 or 40 threads for parallel replication.
>
> Interesting is, that the first slave in the chain counts replicated
> statements into Non-Transactional Groups and the following ones count them
> into Transactional Groups.

Interesting. Where do you see these counts? My guess is that these are
counting the "transactional" status flag on each GTID event in the binlog.
You can see these yourself in a mysqlbinlog output from a binlog on the
master respectively the slaves:

#190606 19:42:35 server id 1  end_log_pos 514   GTID 0-1-2 trans

If these show non-transactional on the master but transactional on the first
slave, it sounds like you are replicating from MyISAM tables on the master
to InnoDB tables on the slave. Try SHOW CREATE TABLE t on a relevant table
on the master and the slave and see which storage engine they are using.

> Also, when checking process lists it seems that just one statement is being
> processed at the time (of the many threads) on the first slave, while there
> are multiple slave replication statements being executed on the 2nd and
> following slaves.

This observation matches the theory that the tables are MyISAM on the master
but InnoDB on the slaves. MariaDB parallel replication has limited
capabilities in parallelising MyISAM changes. The main algorithms are based
on optimistic apply, where transactions are run in parallel by default, and
any conflicts are handled by rollback and retry. This is possible in InnoDB
but not MyISAM. And the transactional status is checked on the table engine
used on the master, not the slave.

Thus, the first slave sees MyISAM changes, and does not do parallel
operation, but writes InnoDB transactions. These InnoDB transactions are
then seen by following slaves, which enables the parallel replication
algorithms.
> Please, does anyone know the reason why the replicated statements are
> counted into different groups? Or, more importantly, how to increase the
> parallelism on the first slave in the chain?

The obvious answer is to change the tables to be InnoDB on the master. Which
may or may not be possible in your setup.

A possibly crazy/theoretical idea would be to set up the first slave with
the blackhole engine for all tables. This requires statement-based
replication and doesn't store _any_ data on the slave, just passes
statements through to the next slave in line. There's an old idea to use the
blackhole engine in this way as a "replication relay", and IIRC the
blackhole engine is transactional. Not sure if this would actually work
though, would require careful testing and is definitely not a supported
configuration, I would say (but fun to think about).

 - Kristian.
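For anyone who wanted to experiment with that idea (again: untested and
unsupported), the conversion itself would just be an engine change per table
on the relay slave; the table name below is hypothetical:

```sql
-- Experimental sketch, NOT a supported configuration.  On the relay
-- slave, a BLACKHOLE table discards row data locally while statements
-- still pass through to its binlog for the next slave in the chain.
-- Requires binlog_format=STATEMENT end-to-end.
ALTER TABLE mydb.mytable ENGINE=BLACKHOLE;
```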
Re: [Maria-discuss] GTID and missing domain
mari...@biblestuph.com writes:
> And my primary server has:
>
> gtid_binlog_pos 0-303-67739600,1-303-7363061243,100-303-4338582
>
> gtid_binlog_state
> 0-302-67690294,0-301-67719794,0-303-67739600,1-301-7350472534,1-302-7350381758,1-303-7363061243,100-302-4242958,100-301-4332195,100-303-4338582
>
> set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';
> start slave;
> Got fatal error 1236 from master when reading data from binary log:
> 'Could not find GTID state requested by slave in any binlog files.
> Probably the slave state is too old and required binlog files have
> been purged.
>
> Even though I'm positive there are no domain 0 transactions (again,
> hasn't been in service for years).

Yes. You write that "there are no domain 0 transactions". But from the point
of view of the database, there _are_ domain 0 transactions, even though they
may be long in the past. These are seen in gtid_binlog_pos (and
gtid_binlog_state).

When your slave has the 0-domain in the gtid_slave_pos, the master knows
that the slave is missing no transactions. When you delete the 0-domain from
the slave, this is conceptually the same as saying the slave is missing
_all_ transactions in domain 0, and the master must send them all (or error
out if they have been purged, as here).

In general, when a slave connects, the master needs to send all transactions
in a domain that the slave did not apply yet - otherwise the slave will be
missing transactions and have the wrong data. This holds regardless of how
old those missing transactions might be. If a slave connects two years after
last being active, the system should still give a reasonable error, not
silently let the slave continue with incorrect data. That is why you get the
error.

> if I:
>
> FLUSH BINARY LOGS DELETE_DOMAIN_ID=(0)
>
> on the master, would I then be able to connect to it via
>
> set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';

Yes.
With this command, we are re-defining the history of the master to say that
there were never any transactions in domain 0. Therefore, any slave that
connects cannot be missing any such transactions.

Hope this helps,

 - Kristian.
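Putting the two steps together (the GTID values are the ones quoted in the
mail above):

```sql
-- On the master: drop the long-dead domain 0 from the binlog state.
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(0);

-- On the slave: set the starting position without domain 0, reconnect.
SET GLOBAL gtid_slave_pos = '1-303-7360639083,100-303-4337869';
START SLAVE;
```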
Re: [Maria-discuss] fsync alternative
Marko Mäkelä writes:
> There is some ongoing development work around this area. If the binlog
> is enabled, it should actually be unnecessary to persist the storage
> engine log, because it should be possible to replay any
> not-committed-in-engine transactions from the binlog. We must merely

Nice to hear that this is being worked on. There is an old worklog MWL#164
with some analysis of potential issues to be solved:

  http://worklog.askmonty.org/worklog/Server-RawIdeaBin/?tid=164

It becomes tricky in some corner cases, for example cross-engine
transactions where one engine has the changes persisted after a crash and
the other does not. But the impact of a robust implementation of this could
be huge, double-fsync-per-commit is _really_ expensive. Hopefully the corner
cases can be solved or handled with some kind of fall-back.

> But, InnoDB’s use of fsync() on data files feels like an overkill. I
> believe that we only need some 'write barriers', that is, some

This is also quite interesting. My (admittedly limited) understanding is
that disks in fact have write-barrier functionality, and that journalling
file systems in fact use that. The problem seems to be how to expose that to
userspace. I wonder if there are any existing or proposed interfaces to
allow userspace to specify write barriers between writes.

 - Kristian.
Re: [Maria-discuss] fsync alternative
Jure Sah writes:
> It would appear that on a typical webserver, the majority of disk i/o
> belongs to the MariaDB database process. It would appear it's mostly
> waiting on fsync(), which to my understanding is executed at least once
> per commit.

Group commit can amortise this if there are multiple commits in parallel,
but essentially yes.

> I've also noticed in the documentation that the options to control fsync
> usage are even more limited than in the MySQL server. They are also very
> strongly argued against. Considering the point that InnoDB is considered
> to be in an inconsistent state in any event, so long as the server is
> not cleanly stopped, is there really justification for such strong
> opposition here?

Usually you can just set --sync-binlog=0 --innodb-flush-log-at-trx-commit=2
and most fsyncs should be gone.

> I understand that this is extensively researched in the documentation
> and it has to do with the recovery of data in case of an unexpected
> server reboot.

InnoDB should recover safely in any case. But if the binlog is enabled, the
binlog is likely to become out of sync with the data inside InnoDB in case
of a kernel crash or power outage (a MariaDB process crash by itself is not
enough to break recovery). So if the server is a replication master, slaves
might need to be rebuilt.

Whatever is argued in one place or another, the better approach is to read
the docs on what each option actually does, and make your own trade-off, in
this case between performance and recoverability. Which is exactly what you
did, concluding that running without fsync is the right choice in your
setup.

Hope this helps,

 - Kristian.
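As a config fragment, the equivalent my.cnf settings would be (a sketch;
weigh the durability trade-off described above before using it):

```ini
# Trade durability for speed: no fsync per commit.  After an OS crash
# or power loss, recent commits may be lost and the binlog may fall out
# of sync with InnoDB (replication slaves may then need rebuilding).
[mysqld]
sync_binlog                    = 0
innodb_flush_log_at_trx_commit = 2
```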
Re: [Maria-discuss] gtid and current_pos vs slave_pos
mari...@biblestuph.com writes:
> I have four servers all running 10.3 as follows:
>
>    A <=> B => C => D
>
> and C is a master to D. In addition to their actual replicating DBs,
> all four servers also have a "norep" DB that is used to create
> temporary tables for local report processing (as well as any other
> possible writes we might want to make locally without affecting the
> slave chain). Historically we've prevented replication for the norep
> DB via:
>
> replicate_ignore_db = mysql,norep
> replicate_wild_ignore_table = mysql.%,norep.%
>
> (binlog_do/ignore flavor of filters are all unset). Writes to the A and
> B servers are programmatically controlled such that only one of the
> two servers will accept writes at any given moment.
>
> Specifically, when I look at the gtid_slave_pos on server D, which I
> thought was only supposed to reflect transactions that were actually
> replicated, I sometimes see statements coming from server C; these are
> temporary tables being written into norep on C. They are not actually
> replicating on D (at least as far as I can tell), and they don't show up
> in D's binary log. So why would they be reflected in D's gtid_slave_pos?

The gtid_slave_pos on D is the current position _within the binlog of C_ (C
being the master of D). The filtering you set up happens on the slave side
D, not on the master side C. So even the norep transactions on C are still
"replicated" in the sense that they are sent to D and processed (including
updating the gtid_slave_pos value). The filtering just causes skipping the
actual changes to tables or data. If D happens to disconnect from C at the
point of a "norep" transaction, it will need to restart from that position
when it reconnects later.

> For example, just a moment ago SELECT @@GLOBAL.gtid_slave_pos on D
> showed this:
>
> 1-303-48758339
>
> This transaction does not appear in D's binlog, which I would expect
> since it should not in fact actually be replicated.
> But because it is
> reflected in gtid_slave_pos, it seems to me that in my setup I cannot
> reliably use gtid_current_pos or gtid_slave_pos, since either may at
> any given time point to an entry on C that of course won't exist on B
> should I ever want to redirect D to B.

Yes. Using replicate_ignore_db is not appropriate for doing local changes on
one server that should be invisible to the replication chain. So this will
not work, as you suspect.

The simplest way is to just set sql_log_bin=0 when doing local transactions
on a slave - this avoids the statements being written to the binlog in the
first place. No replicate_ignore_db options are needed then. It's possible
you can achieve something similar using binlog_ignore_db instead (I don't
100% recall all details, but from the documentation it looks like it might
work).

Your current setup is effectively multi-master from the point of view of
GTID (all servers written concurrently), even though you then
replicate_ignore_db changes from all but one server. As described in the
documentation, GTID can handle multi-master setups using gtid_domain_id, but
I think that is much more complicated than needed for your actual usecase.
Just using sql_log_bin=0 (or possibly binlog_ignore_db) should be fine.

> DROP TEMPORARY TABLE IF EXISTS `norep`.`locations` /* generated by server */
> /*!*/;
>
> How is it that that statement made it all the way through to server D
> from B? Shouldn't it have been filtered out by server C?

I vaguely recall an old bug that causes in particular redundant DROP
TEMPORARY TABLE statements to be unnecessarily written to the binlog. Maybe
this bug is still there and causing this.

Hope this helps,

 - Kristian.
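A sketch of the sql_log_bin=0 pattern (the table name is illustrative):

```sql
-- Run local-only statements without writing them to the binlog.
-- This affects only the current connection and is safe to do online.
SET SESSION sql_log_bin = 0;

-- ... local report/temp-table work here, e.g. in the "norep" DB ...
CREATE TEMPORARY TABLE norep.report_tmp (id INT, total DECIMAL(10,2));

-- Restore binlogging for any subsequent statements on this connection.
SET SESSION sql_log_bin = 1;
```

With this, the local transactions never enter the binlog, so no
replicate_ignore_db filtering is needed anywhere downstream.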
Re: [Maria-discuss] Post-MySQL(5.6) to MariaDB migration question - why are master_info_repository=TABLE and relay_log_info_repository=TABLE not supported?
Artem Russakovskii writes:
> Thank you for the explanation. Helpful. I'm guessing once all slaves and
> then the master are converted to mariadb, global transaction IDs are going
> to start getting used (or maybe I'll need to tweak some variables). Because
> right now it's empty on the one slave I converted.

The first time the MariaDB slave connects to the migrated MariaDB master,
the slave obtains the GTID corresponding to the current position. Then you
can switch the slave to use GTID for future connections:

  CHANGE MASTER TO master_use_gtid=slave_pos;

 - Kristian.
Re: [Maria-discuss] Post-MySQL(5.6) to MariaDB migration question - why are master_info_repository=TABLE and relay_log_info_repository=TABLE not supported?
Artem Russakovskii writes:
> Upon further analysis, it turned out to be the lack of support of
> master_info_repository=TABLE and relay_log_info_repository=TABLE in
> mariadb, which means the master information effectively disappeared as far
> as the slave server is concerned.
> the values fished out from the slave_master_info table), it also seems to
> be a step back when it comes to crash-safe replication.
>
> Does anyone have an explanation for why we're now back to master.info and
> relay-log.info on disk rather than nice tables in memory?

In MariaDB, the replication position is stored crash-safe in a table
(mysql.gtid_slave_pos) when using MariaDB global transaction ID.

One problem with the way the MySQL relay_log_info_repository=TABLE feature
is designed is that it makes it impossible for two transactions to update
their position simultaneously. Thus it doesn't work well with parallel
replication. That's one reason it is implemented differently in MariaDB.

I agree it is unfortunate that this breaks mysql->mariadb migrations.

Hope this helps,

 - Kristian.
Re: [Maria-discuss] gtid_slave_pos row count
Hi Reinis,

I have now pushed a fix for this. I expect it will be included in the next
release. Once again thanks for taking the time to do a good error report,
glad to get this fixed.

 - Kristian.
Re: [Maria-discuss] gtid_slave_pos row count
>> In any case, thanks Reinis for taking the time to report this serious
>> issue, I'll see if I can come up with a patch to fix the problem.
>
> Thx and looking forward to it.

I have now committed a patch that should fix this. If you want to try it,
you can find it here:

  https://github.com/knielsen/server/commit/3eb2c46644b6ac81e7e5e79c9c120700a48d8071

Or else this will hopefully make it into a coming 10.3 release, I've asked
Andrei Elkin to review it.

 - Kristian.
Re: [Maria-discuss] gtid_slave_pos row count
"Reinis Rozitis" writes:
> Is there a jira/github issue I could follow?

I can put any updates in MDEV-12147.

 - Kristian.
Re: [Maria-discuss] gtid_slave_pos row count
Kristian Nielsen writes:
> (Hm. Actually... if a conflict is detected _after_ the transaction has
> deleted old rows from the mysql.gtid_slave_pos table, then the deletions
> will be rolled back along with the conflicting transaction, and it seems we
> will get old rows left-over just as you see... if that is what is happening

After some tests, it seems this is indeed what is happening. Whenever a
conflict in optimistic parallel replication is detected late in the
execution of the conflicting transaction, rows in the mysql.gtid_slave_pos
table can be left undeleted, as you see. This goes back all the way to 10.1.

I'm somewhat sad to see a bug like this surface only now, it would appear
that optimistic parallel replication is not much used? Or maybe the fact
that the table will be cleared on server restart has made people just live
with it?

In any case, thanks Reinis for taking the time to report this serious issue,
I'll see if I can come up with a patch to fix the problem.

 - Kristian.
Re: [Maria-discuss] gtid_slave_pos row count
Kristian Nielsen writes:
> It is cool that optimistic replication works well in your setup to avoid
> slave lag (but not cool that it causes this problem). I will try to see if
> I can reproduce with simple test-generated traffic. But if you know of a
> way I could reproduce that would be useful.

I seem to be able to reproduce easily with just the standard testcase in the
source tree. Ie. I added a select from mysql.gtid_slave_pos to
tokudb_rpl.rpl_parallel_optimistic and I see extra rows at the end of the
test:

select * from mysql.gtid_slave_pos order by domain_id, sub_id;
domain_id	sub_id	server_id	seq_no
0	8	1	8
0	11	1	11
0	12	1	12
0	70	1	70
0	71	1	71
0	73	1	73
0	126	1	126
0	127	1	127

But adding debug printout, I can see the rows being deleted:

delete -1-8 sub=8 committing...

So somehow the delete is getting lost afterwards, I'll try to dig a bit
deeper. But I should have the info from you I need for now, thanks for
reporting this.

If you want a work-around for now, then it should be ok to periodically
delete (eg. a cron job) all rows in mysql.gtid_slave_pos except the one with
the highest sub_id within each domain_id.

(Hm. Actually... if a conflict is detected _after_ the transaction has
deleted old rows from the mysql.gtid_slave_pos table, then the deletions
will be rolled back along with the conflicting transaction, and it seems we
will get old rows left-over just as you see... if that is what is happening
here, then that seems quite a serious bug, and I wonder how it has been able
to go undetected for so long... or maybe something else is going on.)

 - Kristian.
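One possible shape for that periodic cleanup, as a single SQL statement (a
sketch only, not from the original mail; test it against concurrent
replication activity before putting it in a cron job):

```sql
-- Delete all rows except the newest (highest sub_id) per domain_id,
-- which is the row that carries the current replication position.
DELETE gsp
  FROM mysql.gtid_slave_pos gsp
  JOIN (SELECT domain_id, MAX(sub_id) AS max_sub_id
          FROM mysql.gtid_slave_pos
         GROUP BY domain_id) newest
    ON gsp.domain_id = newest.domain_id
   AND gsp.sub_id < newest.max_sub_id;
```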
Re: [Maria-discuss] gtid_slave_pos row count
Thanks for the additional info.

TokuDB probably isn't the most used engine with optimistic parallel
replication. I worked with a TokuDB developer some time ago to make it work,
but I'm not sure how well those fixes are maintained in MariaDB (ie. I tried
now running the testcase tokudb_rpl.rpl_parallel_optimistic on 10.3.9, and
it fails, seemingly because it hasn't even been run after some changes to
the test framework). However, I do not see why TokuDB would cause the
behaviour you describe, the maintenance of mysql.gtid_slave_pos is not
storage-engine dependent, and you tested with the table using InnoDB. It
would be useful to see if this problem is also present with earlier versions
of MariaDB on the slave.

The GTID position is always maintained by MariaDB, that is why you see this
despite not actively using GTID. I also do not see how a 10.2.4 master could
influence the problem.

The table data you sent show rows that are seemingly randomly selected from
transactions, and it only occurs in optimistic mode. That suggests the
problem is related to when a transaction is optimistically run in parallel
and causes a conflict and needs to be rolled back. Maybe this rollback
becomes ineffective in mysql.gtid_slave_pos for some reason? Though again I
don't immediately see how this could be.

It is cool that optimistic replication works well in your setup to avoid
slave lag (but not cool that it causes this problem). I will try to see if I
can reproduce with simple test-generated traffic. But if you know of a way I
could reproduce that would be useful.

 - Kristian.

"Reinis Rozitis" writes:
>> Do you have any errors in the error log about failure to delete rows?
>
> Nope, no errors.
>
>> Anything else special to your setup that might be causing this?
>
> At some point I thought maybe the tokudb_analyze_in_background /
> tokudb_auto_analyze messes things up as it does the background check (you
> can also see here the row count growing):
>
> 2018-09-29 11:05:48 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 423840 is greater than 40 percent of 1059601 rows. - succeeded.
> 2018-09-29 11:09:35 134490 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424359 is greater than 40 percent of 1060885 rows. - succeeded.
> 2018-09-29 11:13:23 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424888 is greater than 40 percent of 1062196 rows. - succeeded.
>
> (it triggers also in conservative mode but then it happens just because of
> a single row being >40% of the table)
>
> I tried to switch off the gtid_pos_auto_engines to use a single gtid_pos
> InnoDB table and it makes no difference - in conservative mode everything
> is fine, in optimistic the table fills up.
>
> The odd thing is that I'm actually not using gtid for the replication:
>
> MariaDB [mysql]> show slave status\G
>                Slave_IO_State: Waiting for master to send event
>                   Master_Host: 10.0.8.211
>                   Master_User: repl
>                   Master_Port: 3306
>                 Connect_Retry: 60
>               Master_Log_File: mysql-bin.096519
>           Read_Master_Log_Pos: 79697585
>                Relay_Log_File: db-relay-bin.000142
>                 Relay_Log_Pos: 78464847
>         Relay_Master_Log_File: mysql-bin.096519
>              Slave_IO_Running: Yes
>             Slave_SQL_Running: Yes
>               Replicate_Do_DB:
>           Replicate_Ignore_DB:
>            Replicate_Do_Table:
>        Replicate_Ignore_Table:
>       Replicate_Wild_Do_Table:
>   Replicate_Wild_Ignore_Table:
>                    Last_Errno: 0
>                    Last_Error:
>                  Skip_Counter: 0
>           Exec_Master_Log_Pos: 79697245
>               Relay_Log_Space: 595992008
>               Until_Condition: None
>                Until_Log_File:
>                 Until_Log_Pos: 0
> ..
>         Seconds_Behind_Master: 0
> Master_SSL_Verify_Server_Cert: No
>                 Last_IO_Errno: 0
>                 Last_IO_Error:
>                Last_SQL_Errno: 0
>                Last_SQL_Error:
>   Replicate_Ignore_Server_Ids:
>              Master_Server_Id: 211
>                Master_SSL_Crl:
>            Master_SSL_Crlpath:
>                    Using_Gtid: No
>                   Gtid_IO_Pos:
>       Replicate_Do_Domain_Ids:
>   Replicate_Ignore_Domain_Ids:
>                 Parallel_Mode: optimistic
>                     SQL_Delay: 0
>           SQL_Remaining_Delay: NULL
>       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
>              Slave_DDL_Groups: 25
> Slave_Non_Transactional_Groups: 284193
>     Slave_Transactional_Groups: 452098720
>
> The other "special" thing maybe is that the master is still 10.2.4 - but that
Re: [Maria-discuss] gtid_slave_pos row count
"Reinis Rozitis" writes:
> the table starts to grow continuously:
>
> MariaDB [mysql]> select count(*) from gtid_slave_pos;
> +----------+
> | count(*) |
> +----------+
> |  5577268 |
> +----------+
> 1 row in set (1.553 sec)

That definitely looks bad. As you say, there can be multiple rows in the
table, but it should be the same order of magnitude as
@@slave_parallel_threads, not millions.

Do you have any errors in the error log about failure to delete rows?
Anything else special to your setup that might be causing this? Can you
share the contents of the mysql.gtid_slave_pos table when this happens?

> Is there something wrong with the purger?
> (something similar like in https://jira.mariadb.org/browse/MDEV-12147 ? )

That bug is rather different - there the row count in the table is not
growing, but the number of unpurged rows is.

 - Kristian.
Re: [Maria-discuss] Replication Problem
Thomas Plant writes:
> Your tip with 'MASTER_USE_GTID=slave_pos' fixed it. Will have to look
> better in the documentation next time, never found the 'slave_pos'
> mentioned or 'SET sql_log_bin=0'.
>
> Thank you very much for your help.

Welcome, glad that you solved your problem. The slave_pos/current_pos is
confusing a lot of users. It would probably have been better if current_pos
had never been introduced.

> so you mean that disabling the binary log on the slave would be
> indicated? Can I do it while it is online?

SET sql_log_bin=0 disables binlogging only for the following queries done in
that connection (not in general). So yes, it can be done online. You would
do it for queries that you will not want replicated to other servers.

For example, if you later make this slave the master, and put the old master
to replicate from the old slave (now new master), you probably do not want
to replicate your earlier slave-fixup query. Hence the suggestion to SET
sql_log_bin=0 to avoid having this query in the slave's binlog. But if using
MASTER_USE_GTID=slave_pos, in most cases it won't matter one way or the
other.

 - Kristian.
Re: [Maria-discuss] Replication Problem
Thomas Plant writes:
> Today I had time to look at the error, removed the duplicate ID from
> the table and started the slave thread again using 'start slave;'.
>
> But now I get another error:
>
> Last_IO_Error: Got fatal error 1236 from master when reading data from
> binary log: 'Error: connecting slave requested to start from GTID
> 0-2-2948175468, which is not in the master's binlog. Since the
> master's binlog contains GTIDs with higher sequence numbers, it
> probably means that the slave has diverged due to executing extra
> erroneous transactions'

So it seems you are using MariaDB Global Transaction ID with MASTER_USE_GTID=current_pos, and you forgot to do the duplicate ID removal under SET sql_log_bin=0.

The easiest solution is probably to CHANGE MASTER TO MASTER_USE_GTID=slave_pos. This should make the slave ignore the local transaction and just connect to the master using the last replicated position.

(current_pos tells the MariaDB server that you expect any local transactions on the slave to also be replicated to other servers, hence the error. current_pos is appropriate for an earlier master that is turned into a slave, but not for a slave where local "fixup" transactions ended up in the binlog.)

Hope this helps,

 - Kristian.
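The suggested switch, run on the slave, might look like this (a sketch using standard MariaDB syntax; afterwards, check the Using_Gtid field of the slave status to confirm the mode):

```sql
STOP SLAVE;
CHANGE MASTER TO MASTER_USE_GTID = slave_pos;
START SLAVE;

-- Confirm the new mode (look at the Using_Gtid field):
SHOW SLAVE STATUS\G
```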
Re: [Maria-discuss] Replication New to Old
andrei.el...@pp.inet.fi writes:
> Mike,
>
>> Hello,
>>
>> I realize that in general replication from a newer master to an older
>> slave is typically not recommended. This said, does anyone have any
>> experience replicating from MariaDB 10.2 to MySQL 5.6?
>
> A problem that is evident at once is 10.2 GTID events can not be handled
> by 5.6. So at least some filtering should be devised.

10.2 does not send GTID to a slave that does not understand it (they are rewritten on-the-fly to BEGIN query events). So GTID events should not cause a 5.6 slave to break.

More generally, the code in MariaDB (at least the code that I wrote) detects what capabilities the slave has, and avoids sending stuff from the master that an old slave will not understand. See MARIA_SLAVE_CAPABILITY_* in log_event.h.

So the intention is that replication to an old slave should work. However, this still requires that applications avoid using any SQL not supported on the old slave. And it is only poorly tested, if at all. Hence the recommendation to avoid new master -> old slave.

Hope this helps,

 - Kristian.
Re: [Maria-discuss] Duplication of documentation for binlog annotation and a question on replicate_annotate_row_events not being dynamic
"Jean-Francois B. Gagne"writes: > And the technical question about replicate_annotate_row_events: this > variable is not dynamic, is there a reason for that ? I understand that > this variable could/should only be modifiable while the slave is stopped, > but not being dynamic is not very DBA/SysAdmin/Operator friendly. It looks like there is no reason for it. The only place the variable is used is during slave connect to master (in request_dump()). So the variable could even be completely dynamic (no need to have slave stopped), though it will only take effect after slave IO thread reconnect. I suspect that this one-liner would work just fine to make the variable dynamic: diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc index de054f3..641d7a5 100644 --- a/sql/sys_vars.cc +++ b/sql/sys_vars.cc @@ -5316,7 +5316,7 @@ static Sys_var_mybool Sys_replicate_annotate_row_events( "replicate_annotate_row_events", "Tells the slave to write annotate rows events received from the master " "to its own binary log. Ignored if log_slave_updates is not set", - READ_ONLY GLOBAL_VAR(opt_replicate_annotate_row_events), + GLOBAL_VAR(opt_replicate_annotate_row_events), CMD_LINE(OPT_ARG), DEFAULT(TRUE)); #endif - Kristian. ___ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp
Re: [Maria-discuss] Fix different gtid-positions on domain 0 in multi-master
Reinder Cuperus writes:
> The problem is, as soon as I stop that connection, that master2 and
> master3 have different gtid-positions for domain 0, and stop/start on
> replication master3->backup results in the error:
> "Got fatal error 1236 from master when reading data from binary log:
> 'Error: connecting slave requested to start from GTID 0-1-3898746614,
> which is not in the master's binlog'"

Yes. backup sees that it is ahead of master3 in domain 0, so it aborts to avoid the risk of diverging replication.

> I have tried moving master1/2 to domain_id:1, and removing the
> domain_id:0 from the gtid_slave_pos on backup, but starting the
> replication master2->backup results in the error:
> "Got fatal error 1236 from master when reading data from binary log:
> 'Could not find GTID state requested by slave in any binlog files.
> Probably the slave state is too old and required binlog files have been
> purged.'"

Yes. Because backup now sees that it is far behind in domain 0 (as it sees the world), and aborts so as not to silently lose transactions.

> I tried finding a way to purge domain:0 from master3/master4, but the
> only way so far I have found is doing a "RESET MASTER" on master3, which
> would break replication between master3 and master4.

Yes, I guess this is what you need. You have made a copy and removed half of the data, and now you need to similarly remove half of the binlog. Even if there are no actual transactions left from a domain in non-purged binlogs, the binlogs still remember the history of all domains, in order to not silently lose transactions for a slave that gets far behind.

It would be useful in general to be able to purge a domain from a binlog. But currently the only way I can think of is RESET MASTER.

You can see how this binlog history looks by checking @@gtid_binlog_state, and in the GTID_LIST events at the head of each binlog file.
> I have tried to find a way to insert an empty transaction, with the last
> gtid on domain_id:0 on the master3, to bring master2/master3 in sync
> again on that domain, but I could not find a way to do that on MariaDB.

The server will not binlog an empty transaction, but a dummy transaction should work, e.g. create and drop a dummy table:

  CREATE TABLE dummy_table (a INT PRIMARY KEY);
  SET gtid_domain_id= 0;
  SET gtid_server_id= 1;
  SET gtid_seq_no= 3898746614;
  DROP TABLE dummy_table;

Maybe this way you can make the binlogs look like they are in sync to the replication, not sure. It might be tricky, but then you do seem to have a good grasp of the various issues involved.

> Are there other ways to fix this issue, so I can have reliable
> replication master3->backup without having to keep the dummy replication
> backup->master3 indefinitely?

I guess you would need to stop traffic to master3/master4 while getting them in sync with one another, and then do RESET MASTER on both and SET GLOBAL gtid_slave_pos="" to start replication from scratch. You would then also need to have server 'backup' up-to-date with master3 before the RESET MASTER, and remove domain id 20 from the gtid_slave_pos on backup after the RESET MASTER. So that is quite intrusive.

 - Kristian.
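The per-domain binlog history mentioned above can be inspected like this (a sketch; the binlog file name is hypothetical):

```sql
-- The full per-domain history the server remembers across all binlogs:
SELECT @@gtid_binlog_state;

-- The same state is recorded in a GTID_LIST event near the start of
-- each binlog file:
SHOW BINLOG EVENTS IN 'mariadb-bin.000001' LIMIT 5;
```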
Re: [Maria-discuss] Reg replication and commit
Karthick Subramanian writes:
> Below when I try at Slave DB:
>
> MariaDB [dr_repl]> select * from test_dr_repl;
> Empty set (0.00 sec)
>
> MariaDB [dr_repl]> commit;
> Query OK, 0 rows affected (0.00 sec)
>
> MariaDB [dr_repl]> select * from test_dr_repl;
> +----+------+
> | id | val  |
> +----+------+
> |  1 |    1 |
> |  2 |    2 |
> |  3 |    3 |
> +----+------+
> 3 rows in set (0.00 sec)

I wasn't really able to fully understand your explanation of your problem. However, the above suggests you have an open transaction with isolation level REPEATABLE READ. This is the only situation I can think of where a COMMIT will affect the visibility of other rows.

When you open a transaction with REPEATABLE READ (with BEGIN, or with autocommit off), changes committed by other transactions will not become visible until you COMMIT or ROLLBACK. This is a basic feature of InnoDB transactions, independent of replication.

 - Kristian.
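The effect can be reproduced with two connections (a sketch; the table name is taken from the report above):

```sql
-- Session 1 (the client on the slave):
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT * FROM test_dr_repl;   -- snapshot taken here: empty

-- Session 2 (e.g. the replication SQL thread applying events):
INSERT INTO test_dr_repl VALUES (1,1),(2,2),(3,3);

-- Session 1 again:
SELECT * FROM test_dr_repl;   -- still empty: same snapshot
COMMIT;
SELECT * FROM test_dr_repl;   -- now the three rows are visible
```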
Re: [Maria-discuss] Semi-sync replication hangs when changing binlog filename.
Joseph Glanville writes:
> This fixes the problem for me. How do we go about getting this into a release?

I can push it. It should go into 10.1, I think (this code is not in 10.0). Thanks for testing!

 - Kristian.
Re: [Maria-discuss] Semi-sync replication hangs when changing binlog filename.
Pavel Ivanov writes:
> binlog ending at position mariadb-bin.04:2039896, somehow the
> function ReplSemiSyncMaster::commitTrx() gets trx_wait_binlog_name =
> 'mariadb-bin.05' and trx_wait_binlog_pos = 2039896. I.e. the
> function gets the position of the transaction to wait semi-sync ack
> for correctly, but the file name is already the one that is current
> after rotation. Master starts waiting for that position, but the slave

> Kristian, do you have any idea what's going on? Is there an
> inappropriate lock release/re-acquire somewhere?

Hm. Actually, looking into MYSQL_BIN_LOG::trx_group_commit_leader, this looks suspicious:

  RUN_HOOK(binlog_storage, after_flush,
           (current->thd, current->cache_mngr->last_commit_pos_file,
            current->cache_mngr->last_commit_pos_offset, synced, first, last))

But

  RUN_HOOK(binlog_storage, after_sync,
           (current->thd, log_file_name,
            current->cache_mngr->last_commit_pos_offset, first, last))

I would have expected that `log_file_name' to also be current->cache_mngr->last_commit_pos_file, like in the first instance. And in fact, it looks like (with my limited knowledge of semi-sync) that this suspicious case is exactly the AFTER_SYNC which fails, while AFTER_COMMIT works...

So maybe try the below patch? Pavel, what do you think, do you agree that this patch should be better?

 - Kristian.

diff --git a/sql/log.cc b/sql/log.cc
index 7efec98..b77a6b3 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -7712,7 +7712,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
     last= current->next == NULL;
     if (!current->error &&
         RUN_HOOK(binlog_storage, after_sync,
-                 (current->thd, log_file_name,
+                 (current->thd, current->cache_mngr->last_commit_pos_file,
                   current->cache_mngr->last_commit_pos_offset, first, last)))
     {
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Rich, Cool, thanks for the pointers, that looks very helpful. I'll try to see if I can come up with something. - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Kristian Nielsen <kniel...@knielsen-hq.org> writes:
> Rich Prohaska <prohas...@gmail.com> writes:
>
>> Is TokuDB supposed to call the thd report wait for API just prior to a
>> thread about to wait on a tokudb lock?
>
> If I wanted to look into implementing this, do you have a quick pointer to
> where in the TokuDB code I could start looking? Like the place where lock
> waits are done? (I have not worked with the TokuDB source before, though I

I took just a quick look at the code, in particular lock_request.cc:

  int lock_request::start(void) {
      txnid_set conflicts;
      r = m_lt->acquire_write_lock(m_txnid, m_left_key, m_right_key,
                                   &conflicts, m_big_txn);
      if (r == DB_LOCK_NOTGRANTED) {

It seems to me that at this point in the code, what is required is to call thd_report_wait_for() on each element in the set conflicts, and that should be about it. Some mechanism will be needed to get from TXNID to THD, of course. A more subtle problem is how to ensure that those THDs cannot go away while iterating? I'm not familiar with what kind of inter-thread locking is used around TokuDB row locks.

But it looks like a proof-of-concept patch for TokuDB optimistic parallel replication might be fairly simple to do.

I also noticed that TokuDB does not support handlerton->kill_query() (so KILL cannot break a TokuDB row lock wait). That should be fine, the KILL will be handled when the wait finishes (or if _all_ transactions are waiting on the row locks of each other, then normal TokuDB deadlock detection will handle things).

 - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Rich Prohaska writes:
> Is TokuDB supposed to call the thd report wait for API just prior to a
> thread about to wait on a tokudb lock?

If I wanted to look into implementing this, do you have a quick pointer to where in the TokuDB code I could start looking? Like the place where lock waits are done? (I have not worked with the TokuDB source before, though I am somewhat familiar with the concept of how it works.)

 - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Rich Prohaska writes:
> Is TokuDB supposed to call the thd report wait for API just prior to a
> thread about to wait on a tokudb lock?

Yes, that's basically it.

Optimistic parallel replication runs transactions in parallel, but enforces that they commit in the original order. So suppose we have transactions T1 followed by T2 in the replication stream, and that they try to update the same row. When T2 gets ready to commit, it needs to wait for T1 to commit first (this is what you see in wait_for_prior_commit()). However, if T1 is waiting on a row lock held by T2, we have a deadlock. thd_report_wait_for() checks for this condition. If a transaction goes to wait on a lock held by a later (in terms of in-order replication) transaction, the later transaction is killed (using the normal thread kill mechanism). Parallel replication then gracefully handles the kill (by rollback and retry).

You can see in storage/xtradb/lock/lock0lock.cc how this is done for InnoDB/XtraDB, e.g. lock_report_waiters_to_mysql(). Hopefully it would be easy to hook this into TokuDB. It does require being able to locate the transaction (and in particular the THD) that owns a given lock.

Another potential issue (at least it was for InnoDB/XtraDB) is that thd_report_wait_for() can call back into the handlerton->kill_query method, so the caller of thd_report_wait_for() needs to be prepared for this to happen.

Note that we can modify/extend the thd_report_wait_for() API to work better for TokuDB, if necessary. The current API was deliberately left "internal" (not a service with public header file etc.) in anticipation that it might need changing to better support other storage engines, such as TokuDB.

Also note that the call to thd_report_wait_for() does not need to happen "just prior" to the lock wait - it can happen later, as long as it happens at some point (though of course the earlier the better, in terms of more quickly resolving the deadlock and allowing replication to proceed).
> I have been running sysbench oltp with a mariadb 10.1 master-slave
> topology. I have not seen any replication errors when slave parallel mode
> is conservative.

No, it should not happen, because in conservative mode transactions are not run in parallel on the slave unless they ran without lock conflicts on the master (both transactions reached the commit point at the same time). But in InnoDB/XtraDB, there are some interesting (but very rare) corner cases where two transactions may or may not have lock conflicts depending on the exact order of execution. So for these cases, the thd_report_wait_for() mechanism is also needed.

 - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
Do you have a test case that can be used to repeat the bug? - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
jocelyn fournier writes:
> Thanks for the quick answer! I wonder if it would be possible to
> automatically disable the optimistic parallel replication for an
> engine if it does not implement it ?

That would probably be good - though it would be better to just implement the necessary API, it's a very small change (basically TokuDB just needs to inform the upper layer of any lock waits that take place inside).

However, looking more at your description, you got a "key not found" error. Not implementing thd_report_wait_for() could lead to deadlocks, but it shouldn't cause key not found. In fact, in optimistic mode, all errors are treated as "deadlock" errors, the query is rolled back, and run again, this time not in parallel. So I'm wondering if there is something else going on.

If transactions T1 and T2 run in parallel, it's possible that they have a row conflict. But if T2 deleted a row expected by T1, I would expect T1 to wait on a row lock held by T2, not get a key not found error. And if T1 has not yet inserted a row expected by T2, then T2 would be rolled back and retried after T1 has committed. The first can cause a deadlock, but neither case seems to cause the error you saw.

Maybe TokuDB is doing something special with locks around replication, or something else goes wrong. I guess TokuDB just hasn't been tested much with parallel replication. Does it work ok when running in conservative parallel mode?

 - Kristian.
Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?
jocelyn fournier writes:
> After upgrading from TokuDB Enterprise with MariaDB 5.5 to MariaDB
> 10.1.14, I tried to enable the parallel replication
> (parallel_mode=optimistic, slave_parallel_threads=4) on a GTID enabled

> Is this a known limitation with TokuDB ?

Yes. TokuDB does not (to my knowledge) implement the thd_report_wait_for() API, which is what makes optimistic parallel replication work.

 - Kristian.
Re: [Maria-discuss] Long email about a replication issue
writes:
> This weekend I had to repair a replication problem on one of our
> clusters. I've attempted to get to the root cause but not sure where I

Is this pure MariaDB replication, or is it Galera? I think it is the former, but the term "cluster" is somewhat overloaded, which is why I ask...

> The setup is M1<->M2 (with attached slaves). M1 is the active master
> receiving all writes. Access is controlled through an F5 and I don't
> think any errant transactions have occurred on the inactive master
> (M2). I've checked this by grepping the binlogs for the M2 server_id.
>
> The initial associated record that broke replication was attached to a
> "user" table record. This user was created on Friday at 16:21.
> Replication broke around 11:30PM that night. The user record had a
> GTID of 0-1-36823254 (recovered from M1).
>
> I've looked into the appropriate binlog from M2...
> If I grep for the specific GTID on M2 I get nothing...
> If I grep for this record by email address I also get nothing. So I
> must conclude this record (and a bunch of others) never got to master
> until replication broke due to the FK errors. You would expect
> replication to break here because of a gap in the GTIDs. This did not
> happen, and I'm almost certain that GTID replication could not have
> been deactivated and the positions messed around with.

Yeah, even if the slave was set to MASTER_USE_GTID=no, the GTIDs should still have been there in the M2 binlog.

> I'm unsure of where to go now. Any ideas? Any thoughts are appreciated.

I guess you need to figure out why M2 did not apply those transactions. Some suggestions:

- Check the error log on M2 for disconnect/reconnects around the time of the transactions that are missing (or any disconnects/reconnects). Such messages should also say at what position M2 disconnected and reconnected; this could be compared to the problem GTID. This could show if transactions were skipped because of reconnecting at a wrong position.
- Also check for local slave stop/start messages in the M2 error log, to see if anything looks related or could indicate changes in the replication config (most replication changes require stopping the slave threads).

- You can also check the binlog on M1 for any out-of-order GTIDs, which could cause problems at slave reconnect (seems unlikely though).

- Replication filtering could cause this - double-check that no filtering was turned on or something. Also check settings like --gtid-ignore-duplicates.

Good luck,

 - Kristian.
Re: [Maria-discuss] HikariCP + searching archives
Stu Smith writes:
> I'm curious if there is a way to search the archives for this list - I

One way is to use Google with an additional search term:

  site:lists.launchpad.net/maria-discuss

This restricts the search to the archives. The maria-developers@ list can be similarly searched by adding the search term:

  site:lists.launchpad.net/maria-developers

Hope this helps,

 - Kristian.
Re: [Maria-discuss] Collations, trailing spaces and unique indexes
Binarus writes:
> "All MySQL collations are of type PADSPACE. This means that all CHAR,
> VARCHAR, and TEXT values in MySQL are compared without regard to any
> trailing spaces. “Comparison” in this context does not include the

Yes, I have always found this terminally stupid as well. But I think it comes from the SQL standard.

The only workaround I know of is to use VARBINARY instead of VARCHAR. I think it works much the same in most respects. But obviously some semantics is lost when the server is no longer aware of the character set used.

> Since the index behaviour obviously depends on the collation, would
> building an own collation which does not PADSPACE be an option? I have

That would be interesting, actually. I don't know what support there is for non-PADSPACE collations. Maybe bar knows (Cc:'ed)?

 - Kristian.
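The PADSPACE behaviour, and the VARBINARY workaround, can be illustrated with two comparisons (a sketch):

```sql
-- PADSPACE collation: trailing spaces are ignored in comparison,
-- so these two values compare equal (and collide in a unique index):
SELECT 'abc' = 'abc   ';

-- Comparing as binary strings instead compares the exact bytes,
-- so the trailing spaces make the values distinct:
SELECT CAST('abc' AS BINARY) = CAST('abc   ' AS BINARY);
```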
Re: [Maria-discuss] The insert performance issue
林澤宇 writes:
> I use a file including 10 insert command statements to test the insert
> performance. On the local server, MariaDB spent about 20 seconds to
> insert the data; on the remote server, MariaDB spent about 30 seconds.
> Why does MariaDB have an about 10 second gap?
> Maybe the network causes some latency, but the time should not be so
> long.

Why do you think 10 seconds is unexpected? If you are sending 10 individual queries from a single client thread, that is 10 network roundtrips. An extra 10 seconds for that seems reasonable; in fact it sounds quite fast to me (0.1 ms network roundtrip).

If you want better performance against a remote server, try sending many statements in one roundtrip using the multi-statement protocol, or run many queries in parallel from the client to overlap the roundtrips.

 - Kristian.
Re: [Maria-discuss] replicate_rewrite_db as a system variable
Ian Gilfillan writes:
> Is there a reason replicate_rewrite_db is not available as a system
> variable, while other similar settings, such as replicate_do_db, are?

I don't know of any specific reason, except that it is not implemented. Originally, none of these variables could be changed dynamically. Davi Arnaut (if I remember correctly) implemented making the filtering ones dynamic. replicate_rewrite_db was not part of that patch.

 - Kristian.
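Since it is not a system variable, replicate_rewrite_db can only be given at server startup, for example in the option file (the database names here are hypothetical):

```ini
[mysqld]
# On this slave, apply events logged for db "production" to db "staging":
replicate-rewrite-db = "production->staging"
```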
Re: [Maria-discuss] Enabling feedback pluging for MariaDB 10.1.4
Michael Widenius mo...@askmonty.org writes:
> for the alpha so I suggested Sergei today that we should enable it for
> the beta period of MariaDB 10.0 (10.*1* beta, I guess?) As most
> MariaDB users should know, the feedback is totally anonymous and no
> private or sensitive information is being sent. Any comments,
> suggestions or recommendations?

I think it is a bad idea. Please do not do it. Phone-home is a misfeature in any product, and even more so in system software like a database.

And besides, the information is much less useful than you think, because of unknown, but probably extreme, data skew. In fact, it will probably be more harmful than useful, because people will use bad data to justify bad decisions.

Experience supports this point of view with our download numbers. They do not include apt-get / yum / etc. installations, which judging from IRC conversations are the majority. Yet people continually refer to them as though they mean anything, just because they are there.

 - Kristian.
Re: [Maria-discuss] The relay-log is not fluashed after the slave-relay-log.999999 showed
Gmail next1...@gmail.com writes:
> And, as I mentioned in the title of this question, the relay log is
> not flushed after slave-relay-log.999999 showed up when using the
> Slave_parallel_threads:10 setting, like shown below.
> - binlog_format: ROW
> - Slave_parallel_threads: 10
> Everything is working fine except that the slave-relay-log.** files
> continue to exist on disk, which will finally cause the disk to fill.
> If I change the value of the Slave_parallel_threads setting from 10 to
> 0, the log will be flushed. However, PK duplicate warning error logs
> come next.

Ok, thanks for reporting this. It's probably a bug that parallel replication behaves differently from non-parallel. I'll try to look into it when I have time. I could imagine that there are more bugs lurking when the log counter overflows... I'm not sure this is well tested.

I wonder what the correct behaviour is? Should it just continue with slave-relay-log.1000000?

Thanks,

 - Kristian.
Re: [Maria-discuss] New Question: Multi Master Replication
AskMonty KB nore...@askmonty.org writes:
> What I am having a problem with is if I add a new master. When I add
> the master, the data in the slave table is truncated and only the data
> from the new master is replicated. I lose all my old data in the
> slave.

The obvious guess is that the new master has DROP TABLE t1300,t13 in its binlog. If this is the problem, then it can be solved by for example changing the replication position to skip those events, or maybe by using replication filters.

 - Kristian.
Re: [Maria-discuss] Question about MariaDB Non-blocking Client
Mike Gibson notmikegib...@gmail.com writes:
> Greetings,

Hi, sorry for taking a few days to get back to you.

> I'm using the MariaDB Non-blocking API to write a C++ client, but I've
> hit a wall regarding connection checking. I've been referencing the
> binding from node-mariasql
> (https://github.com/mscdex/node-mariasql/blob/master/src/binding.cc).

Ok, great that you can use the non-blocking API!

> The problem I'm experiencing is if I have a connection and kill the
> server (/etc/init.d/mysql stop) and then start it back
> (/etc/init.d/mysql start), I can never get clean reconnects. It's
> usually a mixture of errcode 2058, 2003, 2013. I'm really confused how
> to gracefully manage connections. Before I was using the official
> MySQL C++ connector, and it provides a connection->isConnected()
> method. I'm wondering how I can get something similar with MariaDB's
> client, as I need the non-blocking interface.

I am not familiar with the MySQL C++ connector. I tried downloading the source for mysql-connector-c++-1.1.3.tar.gz, but it does not seem to have any isConnected() method?

Anyway, the usual way to do this is to issue a mysql_ping() to the server, to check if the connection is working. With the non-blocking API, you would do this with mysql_ping_start() and mysql_ping_cont(). This will issue a request to the server to check if the connection is ok. If you have autoconnect enabled (MYSQL_OPT_RECONNECT), then this will automatically reconnect if the old connection was broken for some reason.

(Note that you will need to be aware of the usual issues with automatic reconnect. For example, even if ping is successful, the connection may break immediately afterwards. And an autoconnect will lose any existing state on the connection, such as temporary tables, SET @user_var, BEGIN, etc.)

> Given: MYSQL mysql, *mysqlHandle;
> Looking at the mysql_real_connect_start() or cont() functions, I
> provide a MYSQL struct and on connect I get a copy of the struct
> stored in *mysqlHandle.
> It's not clear to me what the purpose of the copy is at this point as
> I still use the initial struct to call query().

Agree, it is not very useful, it is just a copy of the pointer to your own structure. You only need it to check for error (in which case the pointer will be NULL). After that you can just use the pointer to your own struct that you already have.

> In the case that I can detect a disconnection, how do I properly clean
> up the connection and attempt reconnect? Do I mysql_close(mysql)
> and/or mysql_close(mysqlHandle), shutdown/close the file descriptor,
> mysql_init() a new handle and go through
> mysql_real_connect_start/cont()?

I am not sure, but it seems to me from looking at the code that if you already got an error that the connection was closed, then you can just go through the mysql_real_connect_start/cont() sequence again to reconnect. But if that does not work, you can always mysql_close() and mysql_init() your struct again. There is no need to explicitly shutdown or close the file descriptor.

> Does it even make sense for each object to have its own MYSQL struct
> that I mysql_init(), or would it be better to have a layer on top that
> mysql_init()s a single MYSQL struct, connects, and passes the returned
> *mysqlHandle to each query?

This depends on your application. You can only have a single query in progress on one MYSQL struct at a time. So if you have multiple queries working at the same time on the server, you need one MYSQL struct for each. If you only have one query processing at a time, a single MYSQL struct will be sufficient.

> Thanks for providing the async client, any help is appreciated.

I hope this helps, though it is of a somewhat generic nature. If you need more details, please ask again, perhaps with some example code that shows your problems.

 - Kristian.
Re: [Maria-discuss] openbsd switched libmysqlclient back to older mysql one
Colin Charles co...@montyprogram.com writes:

> I noticed this:
> http://www.openbsd.org/cgi-bin/cvsweb/ports/databases/mariadb/Makefile?rev=1.4
>
>   "Revert back to using MySQL 5.1 for the time being. MariaDB 5.5
>   introduces a new libmysqlclient non-blocking API which utilizes
>   co-routines. The x86-specific GCC ASM co-routine support hid the fact
>   that there was an issue. The only fallback code so far is POSIX user
>   contexts, which OpenBSD does not support."
>
> Is this something we can fix to ensure that OpenBSD users can continue to
> use our libmysqlclient?

The patch below disables the non-blocking API (all the calls will return an error) if no co-routine support is available. But you will have to find someone else to:

- Implement the CMake check (HAVE_UCONTEXT)
- Fix the test suite to not attempt to use the non-blocking API when it is not available
- Test this on various platforms
- Merge it up from 5.5 through 10.0
- Document it

(I am totally occupied dealing with our mad feature race in replication and will not have time to clean this up.)

Actually supporting the non-blocking client API on a given platform requires co-routine support. The best option is asm stubs, but that needs to be done anew for each architecture. There appears to be a way using sigaltstack (MDEV-4601), but it's rather hackish. Another portable fallback possibility is pthreads, but that is even more hackish. All of them require some work to integrate, though the meat of it should be possible to take from other software that has been fixed for *BSD (for example qemu). Only the code in my_context.c and my_context.h needs to be modified.

- Kristian.
=== modified file 'client/mysqltest.cc'
--- client/mysqltest.cc	2013-04-17 17:42:34 +0000
+++ client/mysqltest.cc	2013-06-13 09:16:50 +0000
@@ -5933,7 +5933,8 @@ void do_connect(struct st_command *comma
     mysql_options(con_slot->mysql, MYSQL_OPT_CONNECT_TIMEOUT,
                   (void *) &opt_connect_timeout);
-  mysql_options(con_slot->mysql, MYSQL_OPT_NONBLOCK, 0);
+  if (mysql_options(con_slot->mysql, MYSQL_OPT_NONBLOCK, 0))
+    die("Failed to initialise non-blocking API");
   if (opt_compress || con_compress)
     mysql_options(con_slot->mysql, MYSQL_OPT_COMPRESS, NullS);
   mysql_options(con_slot->mysql, MYSQL_OPT_LOCAL_INFILE, 0);

=== modified file 'include/my_context.h'
--- include/my_context.h	2012-02-23 14:42:21 +0000
+++ include/my_context.h	2013-06-13 09:20:28 +0000
@@ -31,8 +31,10 @@
 #define MY_CONTEXT_USE_X86_64_GCC_ASM
 #elif defined(__GNUC__) && __GNUC__ >= 3 && defined(__i386__)
 #define MY_CONTEXT_USE_I386_GCC_ASM
-#else
+#elif defined(HAVE_UCONTEXT)
 #define MY_CONTEXT_USE_UCONTEXT
+#else
+#define MY_CONTEXT_DISABLE
 #endif
 
 #ifdef MY_CONTEXT_USE_WIN32_FIBERS
@@ -103,6 +105,13 @@ struct my_context {
 };
 #endif
 
+
+#ifdef MY_CONTEXT_DISABLE
+struct my_context {
+  int dummy;
+};
+#endif
+
 /*
   Initialize an asynchronous context object.
=== modified file 'mysys/my_context.c'
--- mysys/my_context.c	2012-11-04 21:20:04 +0000
+++ mysys/my_context.c	2013-06-13 09:00:28 +0000
@@ -726,3 +726,37 @@ my_context_continue(struct my_context *c
 }
 
 #endif  /* MY_CONTEXT_USE_WIN32_FIBERS */
+
+#ifdef MY_CONTEXT_DISABLE
+int
+my_context_continue(struct my_context *c)
+{
+  return -1;
+}
+
+
+int
+my_context_spawn(struct my_context *c, void (*f)(void *), void *d)
+{
+  return -1;
+}
+
+
+int
+my_context_yield(struct my_context *c)
+{
+  return -1;
+}
+
+int
+my_context_init(struct my_context *c, size_t stack_size)
+{
+  return -1;                                  /* Out of memory */
+}
+
+void
+my_context_destroy(struct my_context *c)
+{
+}
+
+#endif
Re: [Maria-discuss] WL#4925 and multi-source replication
Greg skyg...@gmail.com writes:

> With kill -9 mysqld and sync_binlog=0, I'm not really surprised, since
> mysqld will not fdatasync after each commit, right?

Right. So this is mostly just my own academic interest; in practice it is of course real crashes/power failures we want to handle, not SIGKILL.

If you are interested, this is my thinking: the server always does a write(2) system call on the binlog at (group) commit, even with sync_binlog=0. So even if we SIGKILL the server, the data is still in the kernel buffers (at least on Linux), and will eventually reach disk.

However, you are using DRBD. I am guessing that when mysqld on one node dies, a failover is done to the other node, and this loses the data in the kernel disk buffers on the first node that has not been fdatasync()'ed.

So I learned a bit about DRBD, thanks ;-)

- Kristian.
Re: [Maria-discuss] WL#4925 and multi-source replication
Greg skyg...@gmail.com writes:

> I didn't think that sync_binlog=1 was required anymore for safe
> replication. We are always using group commit in MariaDB 10.0 on the
> master, so the binary log will be synced for each group commit, which is
> safe. I have to use it in a DRBD config. When testing this config, I
> killed mysqld with SIGKILL and sync_binlog=0, and failover started the
> binary log at an old position, which caused duplicates on slaves. With
> sync_binlog=1, this happens no more. How do I configure how often group
> commits are fdatasynced?

Correct, sync_binlog=1 is still required to ensure consistency between the binlog and InnoDB. With sync_binlog=1, each group commit is fdatasynced (with a single binlog fdatasync per group), so no further configuration is needed. You also need innodb_flush_log_at_trx_commit=1, of course.

I am a bit surprised that you got duplicates with SIGKILL of mysqld. I would have expected crashing the OS kernel (ie. power failure) to be needed for fdatasync to make any difference?

- Kristian.
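For reference, the two settings discussed above as a my.cnf fragment (a sketch using the standard option names; values per the advice in this thread):

```ini
[mysqld]
# Sync the binary log once per group commit; required for
# binlog/InnoDB consistency across a crash or failover.
sync_binlog = 1
# Have InnoDB flush and sync its redo log at every commit.
innodb_flush_log_at_trx_commit = 1
```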
Re: [Maria-discuss] testing Galera
Jan Kirchhoff j.kirchh...@logical-line.de writes:

> Is there something like slave_skip_counter, aka "I know what I do, skip
> that update"? I think I have to take a new snapshot to get the second

Yes, MariaDB has this:

https://kb.askmonty.org/en/selectively-skipping-replication-of-binlog-events/

If you set @@skip_replication=1 (instead of SQL_LOG_BIN=0) and replicate-events-marked-for-skip=FILTER_ON_MASTER on all servers, then the changes will be logged to the binlog (for Galera to use), but will not be sent to other slaves using traditional replication.

Disclaimer: I do not have much experience with Galera, much less have actually tried using it with @@skip_replication, but it should work, I think. Note that this is a MariaDB feature (it will not work with Galera based on MySQL or Percona Server).

- Kristian.
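A sketch of the setup described above (the table and statement are made up for illustration; see the knowledgebase article linked above for the authoritative details):

```sql
-- In my.cnf on all servers:
--   replicate-events-marked-for-skip = FILTER_ON_MASTER

-- In the session making the local-only change:
SET @@skip_replication = 1;   -- events are still written to the binlog,
                              -- but flagged so the master filters them out
UPDATE some_table SET some_col = 1;   -- hypothetical statement to skip
SET @@skip_replication = 0;   -- back to normal replication
```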
Re: [Maria-discuss] mariasql client question (zappajs, coffeescript, nodejs)
klrumpf klru...@gmail.com writes:

> I now want to switch to the mariasql client...

Cool, that is an interesting project ...

> res.on 'row', (row) ->
>   console.log "Result row: " + inspect(row)
>
> @get '/': (req, res) ->   # LANDING PAGE
>   result = doSql("select pakz from pkt where pktnr=10001")
>   console.log "919, result = ", result
>
> Console log in the doSql function shows the results correctly, but I have
> been playing around for a while and I can't get the function to return the
> result for further processing to the @get routine. Seems like I have not
> understood

I assume you mean that:

- You get output lines with "Result row: ..."
- You get no "919, result = " line, or it is empty.

Unfortunately, there may not be many on this list who are familiar with node.js. I have only general knowledge about event-driven programming. This leads me to think that you need to pass a handler to your doSql() function and invoke this handler in your res.on() handler. But I frankly admit I do not know if something else could work with node.js.

> Thanks, hope this is the right list for this, Karl

You are welcome to try :-). Maybe someone with knowledge of node.js will answer. But it might not be the best list; your problem looks more related to general node.js than to MariaDB.

Hope this helps (a little),

- Kristian.
Re: [Maria-discuss] MariaDB 5.6
Adam M. Dutko dutko.a...@gmail.com writes:

> Does the following item under the Performance section indicate
> optimizations for AMD chips are not being added?
>
>   "Better multi CPU performance above 16 cores (Work with Intel)"

No (very unlikely). The AMD and Intel x86_64 chips are quite similar, and most optimisations will help either, certainly optimisations aimed at scaling to many cores. The "Work with Intel" probably refers to the fact that Intel is helping the work by donating many-core servers to MariaDB developers, and maybe with advice/discussions.

- Kristian.
Re: [Maria-discuss] MariaDB 5.6
Peter Laursen peter_laur...@webyog.com writes:

> I noticed this on planet.mysql: http://kb.askmonty.org/v/plans-for-56
>
> I *again* strongly want to discourage a major version number identical to
> a MySQL/Oracle release. MySQL plans a 5.6 too, and I believe that there is
> already a source tree available on Launchpad. I think I understand that
> MariaDB 5.6 is planned to use the MySQL 5.5 codebase. Am I correct?

I don't think we would use 5.6 for a MariaDB based on the MySQL 5.5 (as opposed to MySQL 5.6) codebase. That would be confusing indeed. MariaDB 5.6 would be a version that included MySQL 5.6. If we need to release what we now call 5.6 before MySQL 5.6 is released/stable, I think we would need to come up with something else... Similarly, the next release of MariaDB will be called 5.5 or 5.3, depending on whether we decide to include MySQL 5.5 or not.

> I have now posted this complaint 2 or 3 times (including the times I
> complained about the use of 5.2 for a MariaDB based on MySQL 5.1 code, as
> there is also a (now abandoned) MySQL 5.2 tree). I never had a reply from
> a MariaDB 'decisionmaker'. Could I at least request a reply this time?
> Please! :-)

I also do not think the current version schema is perfect. However, do you have a better suggestion? The good thing about MariaDB 5.1, 5.2, 5.3, 5.5, 5.6, ... is that given MariaDB X and MySQL Y, you know that MariaDB X includes all of MySQL Y (and thus can be used as a drop-in replacement) as long as X >= Y. Can we do something better that solves your concerns, and still preserves this nice property in some way?

- Kristian.
Re: [Maria-discuss] FreeBSD BuildBot Slave - I can offer one
Jakob Lorberblatt ja...@native-intelligence.net writes:

> I noticed that you do not have a regular build of FreeBSD for either
> common architecture (x86_64 or i386). I would be able to provide that for
> you; I have equipment to spare and would be willing to lend this to the
> project for building assistance. In addition, I am fairly familiar with
> FreeBSD compilation in general, so I can also help with any issues that

Thanks, sounds great!

> arise. I can begin configuring this host for the service, if I could have
> some direction towards what needs to be done in order for the MariaDB
> project at large to be able to make use of it. I believe I used to have
> some bookmarks of resources regarding the matter. However, I don't have
> the references off the top of my head.

Adam's advice should hopefully get you started. I will set up an account for your machine (64-bit to start with?) and send details in private mail.

- Kristian.
Re: [Maria-discuss] MariaDB and Sun CoolThread
Alexandre Almeida alexan...@hti.com.br writes:

> As far as I can see, MariaDB stays locked/running on a single virtual CPU,
> and MariaDB doesn't take advantage of CoolThreads/CMT technologies; I mean
> it cannot run on more than one virtual CPU at the same time. Result: poor
> performance. Anybody knows if there is something I can do to spread ;-)
> MariaDB all over the CPUs? Am I missing something?

Each client connection will run on its own CPU. So to use all CPUs, your application needs to run many queries in parallel.

- Kristian.
Re: [Maria-discuss] Glibc 2.3 support in MariaDB builds ?
Jocelyn Fournier j...@doctissimo.fr writes:

> I'm trying to push my company to adopt MariaDB 5.2. Unfortunately, the
> currently available binaries (at least for the x86-64 version) require
> glibc 2.4, where we have only version 2.3 (Debian etch) installed on our
> servers. The MySQL 5.5 standard binaries, as well as Percona's, perfectly
> support glibc 2.3, so is there any plan to provide glibc 2.3 compatible
> binaries for MariaDB?

We used to have Debian etch binaries; however, I removed them at some point, since Debian etch support ended half a year ago or so and there did not seem to be much interest. (If you want to help sponsor resurrecting binaries supporting such older systems, you could try contacting sa...@askmonty.org.)

Alternatively, you can build the binaries yourself, it is not hard. You basically need to run these two commands:

  BUILD/compile-bintar
  scripts/make_binary_distribution

- Kristian.
Re: [Maria-discuss] SphinxSE missing plug.in
Brian Evans grkni...@scent-team.com writes:

> The MariaDB 5.2.4 tarball is missing the SphinxSE plug.in file in
> storage/sphinx. I'm not sure if this was intentional or not, so I thought
> I might point it out.

Oops :-( Thanks a lot for pointing out this serious issue! I filed a bug for it (https://bugs.launchpad.net/maria/+bug/691437) and will push a fix to be included in 5.2.5.

- Kristian.
Re: [Maria-discuss] my_atomic_add64
Oleg Tsarev zabiva...@gmail.com writes:

> I need my_atomic_add64 in mysql. Can I simply add the following macros, or
> do I need more advanced tricks?
>
> tsa...@main:/storage/project/percona/rtd_2$ diff -Nur ../rtd/c/include/my_atomic.h c/include/my_atomic.h
> --- ../rtd/c/include/my_atomic.h	2010-07-09 16:35:11 +0400
> +++ c/include/my_atomic.h	2010-08-17 18:57:07 +0400

I think you'll also need the equivalent of these for 64-bit:

  make_transparent_unions(32)
  #define U_32 int32
  #define Uv_32 int32

> @@ -96,25 +96,30 @@
>  make_atomic_cas( 8)
>  make_atomic_cas(16)
>  make_atomic_cas(32)
> +make_atomic_cas(64)
>  make_atomic_cas(ptr)
> ...

My guess is it should work, at least on 64-bit platforms. I'm not sure that 32-bit CPUs generally provide 64-bit atomic operations. If not (which seems likely, really), you'll need to come up with something to handle this case. Note that the my_atomic stuff has the possibility to fall back to mutex locking when support is not available, so one way might be to make it use this fallback on 32-bit (taking a performance penalty, but 32-bit is getting less and less interesting by the day anyway).

- Kristian.
Re: [Maria-discuss] Why isn't SO_SNDTIMEO used?
On Wed, Mar 10, 2010 at 12:29 PM, Michael Widenius mo...@askmonty.org wrote:

> We also use the thr_alarm() functionality when one uses 'kill
> connection-id' in MySQL. I don't know of any easy way to gracefully wake
> up a thread that is sleeping on SO_SNDTIMEO. Do you?

Well, I checked the code, and it seems to wake up the thread using pthread_kill(thread, signal) for the 'kill connection-id' command. This should work fine also when using SO_SNDTIMEO for timeouts on the socket. Just send the signal to the thread blocking on the socket with SO_SNDTIMEO, and the blocking socket call will return with EAGAIN or similar.

- Kristian.