Re: [Maria-discuss] Backup on the replication server getting affected

2023-06-09 Thread Kristian Nielsen
ragul rangarajan  writes:

> Indeed the environment where we are able to see the issue is in
> *MariaDB 10.6.10* and using pool-of-threads.

Cool, thanks ragul, that confirms that your issue is caused by the
MDEV-29843.

 - Kristian.

___
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-discuss] Backup on the replication server getting affected

2023-06-09 Thread Kristian Nielsen
ragul rangarajan  writes:

> Hope my issue is more related to the issue MDEV-30780 optimistic parallel
> slave hangs after hit an error
> Trying to reproduce with a minimal database.
>
> Attaching the gdb output

Thanks, that gdb output is really helpful!

I agree with Andrei that this rules out MDEV-30780 as the cause. Instead it
looks to be caused by MDEV-29843, see also MDEV-31427:

  https://jira.mariadb.org/browse/MDEV-29843
  https://jira.mariadb.org/browse/MDEV-31427

This is seen in the stack trace, where all the other worker threads are
waiting on one which is stuck inside pthread_cond_signal:

---
Thread 80 (Thread 0x7f47ad065700 (LWP 25417)):
#0  0x7f789dca054d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7f789dc9e14d in pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#2  0x55de401c23cd in inline_mysql_cond_signal (that=0x7f4798006b78) at /home/buildbot/buildbot/build/include/mysql/psi/mysql_thread.h:1099
#3  dec_pending_ops (state=, this=0x7f4798006b30) at /home/buildbot/buildbot/build/sql/sql_class.h:2535
#4  thd_decrement_pending_ops (thd=0x7f47980009b8) at /home/buildbot/buildbot/build/sql/sql_class.cc:5142
#5  0x55de407b5726 in group_commit_lock::release (this=this@entry=0x55de41f0da80 , num=num@entry=216757233923465) at /home/buildbot/buildbot/build/storage/innobase/log/log0sync.cc:388
#6  0x55de407a0a3c in log_write_up_to (lsn=, lsn@entry=216757233923297, flush_to_disk=flush_to_disk@entry=false, rotate_key=rotate_key@entry=false, callback=, callback@entry=0x7f47ad064090) at /home/buildbot/buildbot/build/storage/innobase/log/log0log.cc:844
---

The pthread_cond_signal() function normally can never block, so this
indicates some corruption of the underlying condition object. This object is
used to asynchronously complete a query on a client connection when using
the thread pool. The MDEV-29843 patch makes worker threads not use this
asynchronous completion, which should eliminate this problem.

The stack trace strongly indicates MDEV-29843 as the cause. Except that
MDEV-29843 patch is supposed to be in MariaDB 10.6.11, and you wrote:

> Environment: MariaDB 10.6.11

Can you double-check if you are really seeing this hang in 10.6.11, or if it
could have been 10.6.10 (the only version that is supposed to be vulnerable
to MDEV-29843)?

Another thing you can check is if you are using
--thread-handling=pool-of-threads, which I think is related to the
MDEV-29843 issue. In MDEV-31427 I suggest
--thread-handling=one-thread-per-connection as a possible work-around.

Hope this helps,

 - Kristian.



Re: [Maria-discuss] Privilege Question

2023-04-06 Thread Kristian Nielsen
Scott Canaan  writes:

>   Thank you. I found SUPER, but was trying to avoid using it as it
> gives too many privileges. I was looking for something more
> fine-grained.

Maybe you can define a stored procedure with SQL SECURITY DEFINER (and a
DEFINER with the SUPER privilege) that sets the desired syslog global
system variables. Then you can grant the ITS_READ account access to the
stored procedure, which will give access only to set the syslog
configuration.
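A minimal sketch of that idea, written as a small Python script that only assembles the SQL text. The procedure name `admin.set_audit_syslog`, the definer account `'audit_admin'@'localhost'`, and the host part of the ITS_READ account are made up for illustration. The SET GLOBAL statements are the ones from the original question. This has not been run against a server, so treat it as a starting point rather than a tested recipe.

```python
# Hypothetical names, not from the thread: adjust to your installation.
DEFINER = "'audit_admin'@'localhost'"   # account holding SUPER (assumed)
PROCEDURE = "admin.set_audit_syslog"    # made-up schema and procedure name
GRANTEE = "'ITS_READ'@'%'"              # host part is an assumption

# The SET GLOBAL statements quoted in the original question.
SETTINGS = [
    "SET GLOBAL server_audit_output_type=SYSLOG",
    "SET GLOBAL server_audit_logging=1",
    "SET GLOBAL server_audit_syslog_facility=LOG_LOCAL2",
    'SET GLOBAL server_audit_events="connect,table,query_ddl,query_dcl"',
]


def build_statements():
    """Return the SQL to create the procedure and grant EXECUTE on it."""
    body = "\n".join(f"  {setting};" for setting in SETTINGS)
    create = (
        f"CREATE DEFINER={DEFINER} PROCEDURE {PROCEDURE}()\n"
        "SQL SECURITY DEFINER\n"
        f"BEGIN\n{body}\nEND"
    )
    grant = f"GRANT EXECUTE ON PROCEDURE {PROCEDURE} TO {GRANTEE}"
    return [create, grant]


if __name__ == "__main__":
    for statement in build_statements():
        print(statement)
        print()
```

When pasting the CREATE PROCEDURE text into the mysql command-line client, the statement delimiter has to be changed first (the DELIMITER client command), because the procedure body contains semicolons.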

Hope this helps,

 - Kristian.

> On Apr 06, Scott Canaan wrote:
>> We are on MariaDB 10.5.18.  There is a requirement to send all syslog 
>> data to a central syslog server.  In the past, we did it using a login 
>> called ITS_READ.  It has limited privs on purpose, but used to be able 
>> to execute the SET GLOBAL statements that we needed.  Those statements
>> are:
>> 
>> SET GLOBAL server_audit_output_type=SYSLOG; SET GLOBAL 
>> server_audit_logging=1; SET GLOBAL 
>> server_audit_syslog_facility=LOG_LOCAL2;
>> SET GLOBAL server_audit_events="connect,table,query_ddl,query_dcl";



Re: [Maria-discuss] Code of Conduct

2022-12-22 Thread Kristian Nielsen
Andrew Hutchings  writes:

> Instead the intention is to discourage people from personal attacks at
> each other, which negatively affects the group as a whole.

It doesn't affect anyone at all except those that choose to read them and
have nothing better to spend their time on than react to them and feed the
trolls.

The way to improve the mailing lists is to move the traffic from mariadb.com
/ mariadb.org internal mailing lists/private mail onto the public lists. Not
to put up systems to reduce the traffic even further (however low the value
of such traffic may currently be).

 - Kristian.



Re: [Maria-discuss] MariaDB master-slave chained replication and parallelism

2021-08-06 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes:

> And the above transition can be explained by
> MDEV-24654 GTID event falsely marked transactional, its patch is under
> review.

Oh, yes, this bug sounds like it could result in what Jan described. It was
not clear to me from the bug description exactly under what conditions the
bug occurs, but if the first slave marks the replicated transactions as
"transactional" in its binlog, then the observed behaviour could occur.

The question then is how the chained slaves manage to run MyISAM
transactions in parallel without getting conflicts and hanging. One
possibility is that these are mostly insert-only queries (as Jan mentioned
in another mail), and I believe that MyISAM can handle insert-only queries
in parallel without locks and conflicts.

Would require a bit more research to be sure this is the explanation, but it
seems a possibility.

 - Kristian.



Re: [Maria-discuss] MariaDB master-slave chained replication and parallelism

2021-08-05 Thread Kristian Nielsen
Jan Křístek  writes:

> We have a MariaDB 10.3 replication setup with one master and a few chained
> slaves (each has log_slave_updates switched on). Master uses mainly MyISAM
> tables, slaves have about 10 or 40 threads for parallel replication.
>
> Interesting is, that the first slave in the chain counts replicated
> statements into Non-Transactional Groups and the following ones count them
> into Transactional Groups.

Interesting. Where do you see these counts? My guess is that these are
counting the "transactional" status flag on each GTID event in the binlog.
You can see these yourself in mysqlbinlog output from a binlog on the
master and on the slaves, respectively.

#190606 19:42:35 server id 1  end_log_pos 514   GTID 0-1-2 trans

If these show non-transactional on the master but transactional on the first
slave, it sounds like you are replicating from MyISAM tables on the master
to InnoDB tables on the slave. Try SHOW CREATE TABLE t on a relevant table
on the master and the slave and see which storage engine they are using.
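To compare the flag across servers, the GTID lines in mysqlbinlog output can be tallied. The following is a rough Python sketch that assumes the header format shown above, where a transactional event carries the word "trans" after the GTID. The function name and the second sample line are made up for illustration.

```python
import re
from collections import Counter

# Matches a GTID event header line as shown in the thread, e.g.
# "#190606 19:42:35 server id 1  end_log_pos 514   GTID 0-1-2 trans"
GTID_LINE = re.compile(r"\bGTID (\d+)-(\d+)-(\d+)(.*)$")


def count_gtid_flags(lines):
    """Count GTID events with and without the 'trans' flag."""
    counts = Counter()
    for line in lines:
        match = GTID_LINE.search(line)
        if match is None:
            continue  # not a GTID event header
        flags = match.group(4).split()
        key = "transactional" if "trans" in flags else "non_transactional"
        counts[key] += 1
    return counts


sample = [
    "#190606 19:42:35 server id 1  end_log_pos 514   GTID 0-1-2 trans",
    # Hypothetical line without the flag, as a MyISAM change might appear:
    "#190606 19:42:36 server id 1  end_log_pos 700   GTID 0-1-3",
    "# some other line that is ignored",
]
counts = count_gtid_flags(sample)
print(counts["transactional"], counts["non_transactional"])
```

Running this over the output of mysqlbinlog from the master and from the first slave would show whether the flag changes between the two binlogs.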

> Also, when checking process lists it seems that just one statement is being
> processed at the time (of the many threads) on the first slave, while there
> are multiple slave replication statements being executed on the 2nd and
> following slaves.

This observation matches the theory that the tables are MyISAM on the master
but InnoDB on the slaves. MariaDB parallel replication has limited
capabilities in parallelising MyISAM changes. The main algorithms are based
on optimistic apply, where transactions are run in parallel by default, and
any conflicts are handled by rollback and retry. This is possible in InnoDB
but not MyISAM. And the transactional status is checked on the table engine
used on the master, not the slave.

Thus, the first slave sees MyISAM changes, and does not do parallel
operation, but writes InnoDB transactions. These InnoDB transactions are
then seen by following slaves which enables the parallel replication
algorithms.

> Please, does anyone know the reason why the replicated statements are
> counted into different groups? Or, more importantly, how to increase the
> parallelism on the first slave in the chain?

The obvious answer is to change the tables to be InnoDB on the master. Which
may or may not be possible in your setup.

A possibly crazy/theoretical idea would be to set up the first slave with the
blackhole engine for all tables. This requires statement-based replication
and doesn't store _any_ data on the slave, just passes statements through to
the next slave in line. There's an old idea to use the blackhole engine in
this way as a "replication relay", and IIRC the blackhole engine is
transactional. Not sure if this would actually work though, would require
careful testing and is definitely not a supported configuration, I would say
(but fun to think about).

 - Kristian.



Re: [Maria-discuss] GTID and missing domain

2021-02-27 Thread Kristian Nielsen
mari...@biblestuph.com writes:

> And my primary server has:
>
> gtid_binlog_pos  0-303-67739600,1-303-7363061243,100-303-4338582 
>
> gtid_binlog_state
> 0-302-67690294,0-301-67719794,0-303-67739600,1-301-7350472534,1-302-7350381758,1-303-7363061243,100-302-4242958,100-301-4332195,100-303-4338582
>  

> set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';
> start slave;

> Got fatal error 1236 from master when reading data from binary log:
> 'Could not find GTID state requested by slave in any binlog files.
> Probably the slave state is too old and required binlog files have
> been purged.
>
> Even though I'm positive there are no domain 0 transactions (again,
> hasn't been in service for years).

Yes.

You write that "there are no domain 0 transactions". But from the point of
view of the database, there _are_ domain 0 transactions, even though they
may be long in the past. These are seen in gtid_binlog_pos (and
gtid_binlog_state).

When your slave has the 0-domain in the gtid_slave_pos, the master knows
that the slave is missing no transactions. When you delete the 0-domain from
the slave, this is the same conceptually as saying the slave is missing
_all_ transactions in domain 0, and the master must send them all (or error
out if they have been purged, as here).

In general, when a slave connects, the master needs to send all transactions
in a domain that the slave did not apply yet - otherwise the slave will be
missing transactions and have the wrong data. This holds regardless of how
old those missing transactions might be. If a slave connects two years after
last being active, the system should still give a reasonable error, not
silently let the slave continue with incorrect data.

That is why you get the error.
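As a simplified illustration of this rule (not the server's actual code), the following Python sketch compares the master's gtid_binlog_pos with the position the slave requested, using the values from the message. The function names are made up. Any domain the master knows about but the slave position omits is treated as "slave needs everything in this domain".

```python
def parse_gtid_pos(pos):
    """Parse 'domain-server-seq,...' into {domain_id: (server_id, seq_no)}."""
    result = {}
    for gtid in pos.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue
        domain_id, server_id, seq_no = (int(part) for part in gtid.split("-"))
        result[domain_id] = (server_id, seq_no)
    return result


def domains_requested_from_start(master_binlog_pos, slave_pos):
    """Domains present on the master but absent from the slave position."""
    master = parse_gtid_pos(master_binlog_pos)
    slave = parse_gtid_pos(slave_pos)
    return sorted(set(master) - set(slave))


# Values quoted in the thread.
master_pos = "0-303-67739600,1-303-7363061243,100-303-4338582"
slave_pos = "1-303-7360639083,100-303-4337869"

missing = domains_requested_from_start(master_pos, slave_pos)
print(missing)
```

Here the only such domain is 0, so the master would have to send all of domain 0 from the beginning, and fails with error 1236 because those binlogs have been purged.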

> if I:
>
> FLUSH BINARY LOGS DELETE_DOMAIN_ID=(0)
>
> on the master, would I then be able to connect to it via
>
> set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';

Yes.

With this command, we are re-defining the history of the master to say that
there were never any transactions in domain 0. Therefore, any slave that
connects cannot be missing any such transactions.

Hope this helps,

 - Kristian.
 



Re: [Maria-discuss] fsync alternative

2019-09-13 Thread Kristian Nielsen
Marko Mäkelä  writes:

> There is some ongoing development work around this area. If the binlog
> is enabled, it should actually be unnecessary to persist the storage
> engine log, because it should be possible to replay any
> not-committed-in-engine transactions from the binlog. We must merely

Nice to hear that this is being worked on. There is an old worklog MWL#164
with some analysis of potential issues to be solved.

  http://worklog.askmonty.org/worklog/Server-RawIdeaBin/?tid=164

It becomes tricky in some corner cases, for example cross-engine
transactions where one engine has the changes persisted after a crash and
the other does not.

But the impact of a robust implementation of this could be huge,
double-fsync-per-commit is _really_ expensive. Hopefully the corner cases
can be solved or handled with some kind of fall-back.

> But, InnoDB’s use of fsync() on data files feels like an overkill. I
> believe that we only need some 'write barriers', that is, some

This is also quite interesting. My (admittedly limited) understanding is
that disks in fact have write-barrier functionality, and that journalling
file systems in fact use that. The problem seems to be how to expose that to
userspace. I wonder if there are any existing or proposed interfaces to
allow userspace to specify write barriers between writes.

 - Kristian.



Re: [Maria-discuss] fsync alternative

2019-09-07 Thread Kristian Nielsen
Jure Sah  writes:

> It would appear that on a typical webserver, the majority of disk i/o
> belongs to the MariaDB database process. It would appear it's mostly
> waiting on fsync(), which to my understanding is executed at least once
> per commit.

Group commit can amortise this if there are multiple commits in parallel,
but essentially yes.

> I've also noticed in the documentation that the options to control fsync
> usage are even more limited than in the MySQL server. They are also very
> strongly argued against. Considering the point that InnoDB is considered
> to be in an inconsistent state in any event, so long as the server is
> not cleanly stopped, is there really justification for such strong
> opposition here?

Usually you can just set --sync-binlog=0 --innodb-flush-log-at-trx-commit=2
and most fsync should be gone.

> I understand that this is extensively researched in the documentation
> and it has to do with the recovery of data in case of an unexpected
> server reboot.

InnoDB should recover safely in any case. But if the binlog is enabled, it
is likely to become out of sync with the data inside InnoDB after a kernel
crash or power outage (a crash of the MariaDB process by itself is not
enough to break recovery). So if the server is a replication master, slaves
might need to be rebuilt.

Whatever is argued in one place or another, the better approach is to
read docs on what each option actually does, and make your own trade-off, in
this case between performance and recoverability. Which is exactly what you
did, concluding that running without fsync is the right choice in your
setup.

Hope this helps,

 - Kristian.



Re: [Maria-discuss] gtid and current_pos vs slave_pos

2019-04-23 Thread Kristian Nielsen
mari...@biblestuph.com writes:

> I have four servers all running 10.3 as follows:
>
>A <=> B => C => D

> and C is a master to D. In addition to their actual replicating DBs,
> all four servers also have a "norep" DB that is used to create
> temporary tables for local report processing (as well as any other
> possible writes we might want to make locally without affecting the
> slave chain). Historically we've prevented replication for the norep
> DB via:
>
> replicate_ignore_db = mysql,norep
> replicate_wild_ignore_table = mysql.%,norep.%

> binlog_do/ignore flavor of filters are all unset). Writes to the A and
> B servers are programmatically controlled such that only one of the
> two servers will accept writes at any given moment.

> Specifically, when I look at the gtid_slave_pos on server D, which I
> thought was only supposed to reflect transactions that were actually
> replicated, I sometimes see statements coming from server C; these are
> temporary tables being written into norep on C. They are not actually
> replicating on D (at least as far as I can tell), and they don't show up in
> D's binary log. So why would they be reflected in D's gtid_slave_pos?

The gtid_slave_pos on D is the current position _within the binlog of C_ (C
being the master of D). The filtering you set up happens on the slave side
D, not on the master side C. So even the norep transactions on C are still
"replicated" in the sense that they are sent to D and processed (including
updating the gtid_slave_pos value). The filtering just causes skipping the
actual changes to tables or data. If D happens to disconnect from C at the
point of a "norep" transaction, it will need to restart from that position
when it reconnects later.
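A toy Python model of this behaviour, under the assumption that a filtered event is skipped but still advances the position. The function name, the "appdb" database and the first GTID are made up. The second GTID is the position quoted in the thread.

```python
def apply_on_slave(events, ignored_dbs):
    """Toy model of slave-side filtering.

    events: list of (gtid, database) tuples in master binlog order.
    Returns (gtid_slave_pos, applied_gtids).
    """
    gtid_slave_pos = None
    applied = []
    for gtid, database in events:
        if database not in ignored_dbs:
            applied.append(gtid)  # the change is executed on the slave
        # The position advances whether or not the change was filtered.
        gtid_slave_pos = gtid
    return gtid_slave_pos, applied


events_from_c = [
    ("1-303-48758338", "appdb"),  # made-up replicated transaction
    ("1-303-48758339", "norep"),  # local temporary-table work on C
]
pos, applied = apply_on_slave(events_from_c, ignored_dbs={"mysql", "norep"})
print(pos, applied)
```

The position ends at the "norep" transaction even though only the first transaction was applied, which matches what was observed on D.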

> For example, just a moment ago SELECT @@GLOBAL.gtid_slave_pos on D
> showed this:
>
> 1-303-48758339
>
> This transaction does not appear in D's binlog, which I would expect
> since it should not in fact actually be replicated. But because it is
> reflected in gtid_slave_pos, it seems to me that in my setup I cannot
> reliably use gtid_current_pos or gtid_slave_pos, since either may at
> any given time point to an entry on C that of course won't exist on B
> should I ever want to redirect D to B.

Yes. Using replicate_ignore_db is not appropriate for doing local changes on
one server that should be invisible to the replication chain. So this will
not work, as you suspect.

The simplest way is to just set sql_log_bin=0 when doing local transactions
on a slave - this avoids the statements being written to the binlog in the
first place. No replicate_ignore_db options are needed then.

It's possible you can achieve something similar using binlog_ignore_db
instead (I don't 100% recall all details, but from the documentation it
looks like it might work).

Your current setup is effectively multi-master from the point of view of
GTID (all servers written concurrently), even though you then
replicate_ignore_db changes from all but one server. As described in the
documentation, GTID can handle multi-master setups using gtid_domain_id, but
I think that is much more complicated than needed for your actual use case.
Just using sql_log_bin=0 (or possibly binlog_ignore_db) should be fine.

> DROP TEMPORARY TABLE IF EXISTS `norep`.`locations` /* generated by server */
> /*!*/;
>
> How is it that that statement made it all the way through to server D
> from B? Shouldn't it have been filtered out by server C?

I vaguely recall an old bug that causes redundant DROP TEMPORARY TABLE
statements in particular to be unnecessarily written to the binlog. Maybe
this bug is still there and causing this.

Hope this helps,

 - Kristian.



Re: [Maria-discuss] Post-MySQL(5.6) to MariaDB migration question - why are master_info_repository=TABLE and relay_log_info_repository=TABLE not supported?

2018-12-25 Thread Kristian Nielsen
Artem Russakovskii  writes:

> Thank you for the explanation. Helpful. I'm guessing once all slaves and
> then the master are converted to mariadb, global transaction IDs are going
> to start getting used (or maybe I'll need to tweak some variables). Because
> right now it's empty on the one slave I converted.

The first time the mariadb slave connects to the migrated mariadb master,
the slave obtains the GTID corresponding to the current position. Then you
can switch the slave to use GTID for future connections:

  CHANGE MASTER TO master_use_gtid=slave_pos

 - Kristian.



Re: [Maria-discuss] Post-MySQL(5.6) to MariaDB migration question - why are master_info_repository=TABLE and relay_log_info_repository=TABLE not supported?

2018-12-24 Thread Kristian Nielsen
Artem Russakovskii  writes:

> Upon further analysis, it turned out to be the lack of support of
> master_info_repository=TABLE and relay_log_info_repository=TABLE in
> mariadb, which means the master information effectively disappeared as far
> as the slave server is concerned.

> the values fished out from the slave_master_info table), it also seems to
> be a step back when it comes to crash-safe replication.
>
> Does anyone have an explanation for why we're now back to master.info and
> relay-log.info on disk rather than nice tables in memory?

In MariaDB, the replication position is stored crash-safe in a table
(mysql.gtid_slave_pos) when using MariaDB global transaction ID.

One problem with the way the MySQL relay_log_info_repository=TABLE feature
is designed is that it makes it impossible for two transactions to update
their position simultaneously. Thus it doesn't work well with parallel
replication. That's one reason it is implemented differently in MariaDB.

I agree it is unfortunate that this breaks mysql->mariadb migrations.

Hope this helps,

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-10-13 Thread Kristian Nielsen
Hi Reinis,

I have now pushed a fix for this. I expect it will be included in the next
release.

Once again thanks for taking the time to do a good error report, glad to get
this fixed.

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-10-08 Thread Kristian Nielsen
>> In any case, thanks Reinis for taking the time to report this serious issue,
>> I'll see if I can come up with a patch to fix the problem.
>
> Thx and looking forward to it. 

I have now committed a patch that should fix this.

If you want to try it, you can find it here:

  https://github.com/knielsen/server/commit/3eb2c46644b6ac81e7e5e79c9c120700a48d8071

Or else this will hopefully make it into a coming 10.3 release, I've asked
Andrei Elkin to review it.

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-10-06 Thread Kristian Nielsen
"Reinis Rozitis"  writes:

> Is there a jira/github issue I could follow?

I can put any updates in the MDEV-12147.

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-10-02 Thread Kristian Nielsen
Kristian Nielsen  writes:

> (Hm. Actually... if a conflict is detected _after_ the transaction has
> deleted old rows from the mysql.gtid_slave_pos table, then the deletions
> will be rolled back along with the conflicting transaction, and it seems we
> will get old rows left-over just as you see... if that is what is happening

After some tests, it seems this is indeed what is happening. Whenever a
conflict in optimistic parallel replication is detected late in the
execution of the conflicting transaction, rows in the mysql.gtid_slave_pos
table can be left undeleted, as you see.
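A toy Python model of how such leftovers could arise. It is not the server's implementation: it assumes that each transaction takes over a queue of old rows to delete and that the queue is not restored when the transaction is rolled back. The thread does not spell out that detail, and the class and method names are made up.

```python
class GtidSlavePosTable:
    """Toy model of mysql.gtid_slave_pos; not the server's implementation."""

    def __init__(self):
        self.rows = set()         # committed sub_id values
        self.pending_delete = []  # old sub_ids queued for deletion (assumed)

    def run_transaction(self, sub_id, conflict_after_delete=False):
        # The transaction takes over the queued deletions and adds its own row.
        to_delete, self.pending_delete = self.pending_delete, []
        snapshot = set(self.rows)
        self.rows -= set(to_delete)
        self.rows.add(sub_id)
        if conflict_after_delete:
            # Rollback restores the deleted rows, but in this model the
            # queue was already emptied, so nobody deletes them later.
            self.rows = snapshot
            return self.run_transaction(sub_id)  # retry without conflict
        self.pending_delete.append(sub_id)
        return self.rows


table = GtidSlavePosTable()
table.run_transaction(1)
table.run_transaction(2)
table.run_transaction(3, conflict_after_delete=True)  # late conflict, retried
table.run_transaction(4)
leftover = sorted(table.rows)
print(leftover)
```

In this model the row with sub_id 2 is never deleted: its deletion was rolled back with the conflicting transaction and nothing queues it again, so the table ends up holding rows 2 and 4 instead of only row 4.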

This goes back all the way to 10.1. I'm somewhat sad to see a bug like this
surface only now, it would appear that optimistic parallel replication is
not much used? Or maybe the fact that the table will be cleared on server
restart has made people just live with it?

In any case, thanks Reinis for taking the time to report this serious issue,
I'll see if I can come up with a patch to fix the problem.

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-09-29 Thread Kristian Nielsen
Kristian Nielsen  writes:

> It is cool that optimistic replication works well in your setup to avoid
> slave lag (but not cool that it causes this problem). I will try to see if I
> can reproduce with simple test-generated traffic. But if you know of a way I
> could reproduce that would be useful.

I seem to be able to reproduce easily with just the standard testcase in the
source tree. Ie. I added a select from mysql.gtid_slave_pos to
tokudb_rpl.rpl_parallel_optimistic and I see extra rows at the end of the
test:

  select * from mysql.gtid_slave_pos order by domain_id, sub_id;
  domain_id   sub_id  server_id   seq_no
  0   8   1   8
  0   11  1   11
  0   12  1   12
  0   70  1   70
  0   71  1   71
  0   73  1   73
  0   126 1   126
  0   127 1   127

But adding debug printout, I can see the rows being deleted:

  delete -1-8 sub=8
committing...

So somehow the delete is getting lost afterwards, I'll try to dig a bit
deeper. But I should have the info from you I need for now, thanks for
reporting this.

If you want a work-around for now, then it should be ok to periodically
delete (eg. a cron job) all rows in mysql.gtid_slave_pos except the one with
the highest sub_id within each domain_id.
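The selection logic of that work-around, sketched in Python on the rows from the test output above. The function name is made up, and the sketch only computes which (domain_id, sub_id) pairs would be removed. Issuing the actual DELETE against mysql.gtid_slave_pos is left out.

```python
def rows_to_delete(rows):
    """Return all (domain_id, sub_id) pairs except the highest sub_id per domain.

    rows: list of (domain_id, sub_id) tuples.
    """
    highest = {}
    for domain_id, sub_id in rows:
        if sub_id > highest.get(domain_id, -1):
            highest[domain_id] = sub_id
    return sorted(
        (domain_id, sub_id)
        for domain_id, sub_id in rows
        if sub_id != highest[domain_id]
    )


# (domain_id, sub_id) pairs from the test output shown above.
rows = [(0, 8), (0, 11), (0, 12), (0, 70), (0, 71), (0, 73), (0, 126), (0, 127)]
stale = rows_to_delete(rows)
print(stale)
```

For the sample data every row except (0, 127) is selected for deletion. With several domains, one row per domain is kept.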

(Hm. Actually... if a conflict is detected _after_ the transaction has
deleted old rows from the mysql.gtid_slave_pos table, then the deletions
will be rolled back along with the conflicting transaction, and it seems we
will get old rows left-over just as you see... if that is what is happening
here, then that seems quite a serious bug, and I wonder how it has been able
to go undetected for so long... or maybe something else is going on).

 - Kristian.



Re: [Maria-discuss] gtid_slave_pos row count

2018-09-29 Thread Kristian Nielsen
Thanks for the additional info.

TokuDB probably isn't the most used engine with optimistic parallel
replication. I worked with a TokuDB developer some time ago to make it work,
but I'm not sure how well those fixes are maintained in MariaDB (ie. I tried
now running the testcase tokudb_rpl.rpl_parallel_optimistic on 10.3.9, it
fails, seemingly because it hasn't even been run after some changes to the
test framework).

However, I do not see why TokuDB would cause the behaviour you describe, the
maintenance of mysql.gtid_slave_pos is not storage-engine dependent and you
tested with the table using InnoDB.

It would be useful to see if this problem is also present with earlier
versions of MariaDB on the slave.

The GTID position is always maintained by MariaDB, that is why you see this
despite not actively using GTID. I also do not see how 10.2.4 master could
influence the problem.

The table data you sent show rows that are seemingly randomly selected from
transactions, and it only occurs in optimistic mode. That suggests the
problem is related to when a transaction is optimistically run in parallel
and causes a conflict and needs to be rolled back. Maybe this rollback
becomes ineffective in mysql.gtid_slave_pos for some reason? Though again I
don't immediately see how this could be.

It is cool that optimistic replication works well in your setup to avoid
slave lag (but not cool that it causes this problem). I will try to see if I
can reproduce with simple test-generated traffic. But if you know of a way I
could reproduce that would be useful.

 - Kristian.

"Reinis Rozitis"  writes:

>> Do you have any errors in the error log about failure to delete rows?
>
> Nope, no errors.
>
>
>> Anything else special to your setup that might be causing this?
>
> At some point I thought maybe the tokudb_analyze_in_background / 
> tokudb_auto_analyze messes things up as it does the background check (you can 
> also see here the row count growing):
>
> 2018-09-29 11:05:48 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 423840 is greater than 40 percent of 1059601 rows. - succeeded.
> 2018-09-29 11:09:35 134490 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424359 is greater than 40 percent of 1060885 rows. - succeeded.
> 2018-09-29 11:13:23 134488 [Note] TokuDB: Auto scheduling background analysis for ./mysql/gtid_slave_pos_TokuDB, delta_activity 424888 is greater than 40 percent of 1062196 rows. - succeeded.
>
> (it triggers also in conservative mode but then it happens just because of a 
> single row being >40% of the table)
>
> I tried to switch off the gtid_pos_auto_engines to use a single gtid_pos 
> InnoDB table and it makes no difference - in conservative mode everything is 
> fine in optimistic the table fills up.
>
>
>
> The odd thing is that I'm actually not using gtid for the replication:
>
> MariaDB [mysql]> show slave status\G
>
> Slave_IO_State: Waiting for master to send event
>Master_Host: 10.0.8.211
>Master_User: repl
>Master_Port: 3306
>  Connect_Retry: 60
>Master_Log_File: mysql-bin.096519
>Read_Master_Log_Pos: 79697585
> Relay_Log_File: db-relay-bin.000142
>  Relay_Log_Pos: 78464847
>  Relay_Master_Log_File: mysql-bin.096519
>   Slave_IO_Running: Yes
>  Slave_SQL_Running: Yes
>Replicate_Do_DB:
>Replicate_Ignore_DB:
> Replicate_Do_Table:
> Replicate_Ignore_Table:
>Replicate_Wild_Do_Table:
>Replicate_Wild_Ignore_Table:
> Last_Errno: 0
> Last_Error:
>   Skip_Counter: 0
>Exec_Master_Log_Pos: 79697245
>Relay_Log_Space: 595992008
>Until_Condition: None
> Until_Log_File:
>  Until_Log_Pos: 0
> ..
>  Seconds_Behind_Master: 0
>  Master_SSL_Verify_Server_Cert: No
>  Last_IO_Errno: 0
>  Last_IO_Error:
> Last_SQL_Errno: 0
> Last_SQL_Error:
>Replicate_Ignore_Server_Ids:
>   Master_Server_Id: 211
> Master_SSL_Crl:
> Master_SSL_Crlpath:
> Using_Gtid: No
>Gtid_IO_Pos:
>Replicate_Do_Domain_Ids:
>Replicate_Ignore_Domain_Ids:
>  Parallel_Mode: optimistic
>  SQL_Delay: 0
>SQL_Remaining_Delay: NULL
>Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
>   Slave_DDL_Groups: 25
> Slave_Non_Transactional_Groups: 284193
> Slave_Transactional_Groups: 452098720
>
>
>
> The other "special" thing maybe is that the master is still 10.2.4 - but that 

Re: [Maria-discuss] gtid_slave_pos row count

2018-09-28 Thread Kristian Nielsen
"Reinis Rozitis"  writes:

> the table starts to grow continuously:
>
> MariaDB [mysql]> select count(*) from gtid_slave_pos;
> +----------+
> | count(*) |
> +----------+
> |  5577268 |
> +----------+
> 1 row in set (1.553 sec)

That definitely looks bad. As you say, there can be multiple rows in the
table, but it should be the same order of magnitude as
@@slave_parallel_threads, not millions.

Do you have any errors in the error log about failure to delete rows?
Anything else special to your setup that might be causing this?
Can you share the contents of the mysql.gtid_slave_pos table when this
happens?

> Is there something wrong with the purger? 
> (something similar like in https://jira.mariadb.org/browse/MDEV-12147  ? )

That bug is rather different - the row count in the table is not growing,
but the number of unpurged rows is.

 - Kristian.



Re: [Maria-discuss] Replication Problem

2018-07-05 Thread Kristian Nielsen
Thomas Plant  writes:

> Your tip with 'MASTER_USE_GTID=slave_pos' fixed it. Will have to look
> better in the documentation next time, never found the 'slave_pos'
> mentioned or 'SET sql_log_bin=0'.
>
> Thank you very much for your help.

Welcome, glad that you solved your problem.

The slave_pos/current_pos distinction confuses a lot of users. It would probably
have been better if current_pos had never been introduced.

> so you mean that disabling the binary log on the slave would be
> indicated? Can I do it while it is online?

SET sql_log_bin=0 disables binlogging only for the following queries done in
that connection (not in general). So yes, it can be done online. You would
do it for queries that you will not want replicated to other servers. For
example, if you later make this slave the master, and put the old master to
replicate from the old slave (now new master), you probably do not want to
replicate your earlier slave-fixup-query. Hence the suggestion to SET
sql_log_bin=0 to avoid having this query in the slave binlog.

But if using MASTER_USE_GTID=slave_pos, in most cases it won't matter one
way or the other.

 - Kristian.



Re: [Maria-discuss] Replication Problem

2018-07-04 Thread Kristian Nielsen
Thomas Plant  writes:

> Today I had time to look at the error, removed the duplicate ID from
> the table and started the slave thread again using 'start slave;'.
>
> But now I get another error:
>
> Last_IO_Error: Got fatal error 1236 from master when reading data from
> binary log: 'Error: connecting slave requested to start from GTID
> 0-2-2948175468, which is not in the master's binlog. Since the
> master's binlog contains GTIDs with higher sequence numbers, it
> probably means that the slave has diverged due to executing extra
> erroneous transactions'

So it seems you are using MariaDB Global Transaction ID with
MASTER_USE_GTID=current_pos, and you forgot to do the duplicate ID removal
under `SET sql_log_bin=0`.

The easiest solution is probably to CHANGE MASTER TO
MASTER_USE_GTID=slave_pos. This should make the slave ignore the local
transaction and just connect to the master using the last replicated
position.

(current_pos tells the MariaDB server that you expect any local transactions
on the slave to also be replicated to other servers, hence the error.
current_pos is appropriate for an earlier master that is turned into a
slave, but not for a slave where local "fixup" transactions ended up in the
binlog).

Hope this helps,

 - Kristian.



Re: [Maria-discuss] Replication New to Old

2018-04-05 Thread Kristian Nielsen
andrei.el...@pp.inet.fi writes:

> Mike,
>
>> Hello,
>>
>> I realize that in general replication from a newer master to an older
>> slave is typically not recommended.  This said, does anyone have an
>> experience replicating from MariaDB 10.2 to MySQL 5.6?
>
> A problem that is evident at once is 10.2 GTID events can not be handled
> by 5.6. So at least some filtering should be devised.

10.2 does not send GTID to a slave that does not understand it (they are
rewritten on-the-fly to BEGIN query events). So GTID events should not cause
5.6 slave to break.

More generally, the code in MariaDB (at least the code that I wrote) detects
what capabilities the slave has, and avoids sending stuff from the master
that an old slave will not understand. See MARIA_SLAVE_CAPABILITY_* in
log_event.h.

So the intention is that replication to old slave should work. However, this
still requires that applications restrict themselves from using any SQL not
supported on the old slave. And it is only poorly tested, if at all. Hence
the recommendation to avoid new master->old slave.

Hope this helps,

 - Kristian.



Re: [Maria-discuss] Duplication of documentation for binlog annotation and a question on replicate_annotate_row_events not being dynamic

2017-07-03 Thread Kristian Nielsen
"Jean-Francois B. Gagne"  writes:

> And the technical question about replicate_annotate_row_events: this
> variable is not dynamic, is there a reason for that ?  I understand that
> this variable could/should only be modifiable while the slave is stopped,
> but not being dynamic is not very DBA/SysAdmin/Operator friendly.

It looks like there is no reason for it.
The only place the variable is used is during slave connect to master (in
request_dump()). So the variable could even be completely dynamic (no need
to have slave stopped), though it will only take effect after slave IO
thread reconnect.

I suspect that this one-liner would work just fine to make the variable dynamic:

diff --git a/sql/sys_vars.cc b/sql/sys_vars.cc
index de054f3..641d7a5 100644
--- a/sql/sys_vars.cc
+++ b/sql/sys_vars.cc
@@ -5316,7 +5316,7 @@ static Sys_var_mybool Sys_replicate_annotate_row_events(
        "replicate_annotate_row_events",
        "Tells the slave to write annotate rows events received from the master "
        "to its own binary log. Ignored if log_slave_updates is not set",
-       READ_ONLY GLOBAL_VAR(opt_replicate_annotate_row_events),
+       GLOBAL_VAR(opt_replicate_annotate_row_events),
        CMD_LINE(OPT_ARG), DEFAULT(TRUE));
 #endif
 
 - Kristian.



Re: [Maria-discuss] Fix different gtid-positions on domain 0 in multi-master

2017-03-17 Thread Kristian Nielsen
Reinder Cuperus  writes:

> The problem is, as soon as I stop that connection, that master2 and
> master3 have different gtid-positions for domain0, and stop/start on
> replication master3->backup results in the error:
> "Got fatal error 1236 from master when reading data from binary log:
> 'Error: connecting slave requested to start from GTID 0-1-3898746614,
> which is not in the master's binlog'

Yes. backup sees that it is ahead of master3 in domain 0, so it aborts to
avoid risk of diverging replication.

> I have tried moving master1/2 to domain_id:1, and removing the
> domain_id:0 from the gtid_slave_pos on backup, but starting the
> replication master2->backup results in the error:
> Got fatal error 1236 from master when reading data from binary log:
> 'Could not find GTID state requested by slave in any binlog files.
> Probably the slave state is too old and required binlog files have been
> purged.'

Yes. Because backup now sees that it is far behind in domain 0 (as it sees
the world), and aborts to not silently lose transactions.

> I tried finding a way to purge domain:0 from master3/master4, but the
> only way sofar I have found is doing a "RESET MASTER" on master3, which
> would break replication between master3 and master4.

Yes, I guess this is what you need. You have made a copy and removed half of
the data, and now you need to similarly remove half of the binlog. Even if
there are no actual transactions left from a domain in non-purged binlogs,
the binlogs still remember the history of all domains, in order to not
silently lose transactions for a slave that gets far behind.

It would be useful in general to be able to purge a domain from a binlog.
But currently the only way I can think of is RESET MASTER.

You can see how this binlog history looks by checking @@gtid_binlog_state,
and in the GTID_LIST events at the head of each binlog file.
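
To make the per-domain reasoning behind the first error more concrete, here
is a rough Python sketch. It is not how the server implements the check: the
function names are made up, the master positions used in the test values are
invented, and the real logic looks at the full binlog history (the GTID_LIST
events), not just the highest sequence number per domain. It only shows the
idea that a GTID position is a list of domain-server-seq_no triples, and that
the comparison is done domain by domain:

```python
def parse_gtid_list(pos):
    """Parse a GTID list such as '0-1-3898746614,1-2-55' into a list of
    (domain_id, server_id, seq_no) tuples."""
    gtids = []
    for part in pos.split(","):
        part = part.strip()
        if not part:
            continue
        domain_id, server_id, seq_no = (int(x) for x in part.split("-"))
        gtids.append((domain_id, server_id, seq_no))
    return gtids


def max_seq_per_domain(pos):
    """Highest seq_no seen in each domain (a deliberate simplification)."""
    highest = {}
    for domain_id, _server_id, seq_no in parse_gtid_list(pos):
        highest[domain_id] = max(seq_no, highest.get(domain_id, 0))
    return highest


def domains_where_slave_is_ahead(slave_pos, master_binlog_state):
    """Domains in which the slave position is beyond anything the master
    has in its binlog state (hypothetical helper, for illustration only)."""
    slave = max_seq_per_domain(slave_pos)
    master = max_seq_per_domain(master_binlog_state)
    return sorted(d for d, seq in slave.items() if seq > master.get(d, 0))
```

With a slave position of 0-1-3898746614 and a master whose domain 0 stops at
some lower sequence number, domain 0 is reported as "ahead", which is the
situation where the slave refuses to connect.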

> I have tried to find a way to insert an empty transaction, with the last
> gtid on domain_id:0 on the master3, to bring master2/master3 in sync
> again on that domain, but I could not find a way to do that on MariaDB.

The server will not binlog an empty transaction, but a dummy transaction
should work, eg. create and drop a dummy table, for example:

  CREATE TABLE dummy_table (a INT PRIMARY KEY);
  SET gtid_domain_id= 0;
  SET gtid_server_id= 1;
  SET gtid_seq_no= 3898746614;
  DROP TABLE dummy_table;

Maybe this way you can make the binlogs look like they are in sync to the
replication, not sure. It might be tricky, but then you do seem to have a
good grasp of the various issues involved.

> Are there other ways to fix this issue, so I can have reliable
> replication master3->backup without having to keep the dummy replication
> backup->master3 indefinitely?

I guess you would need to stop traffic to master3/master4 while getting them
in sync with one another and then do RESET MASTER on both and SET GLOBAL
gtid_slave_pos="" to start replication from scratch. You would then also
need to have server 'backup' up-to-date with master3 before RESET MASTER,
and remove domain id 20 from the gtid_slave_pos on backup after the RESET
MASTER.

So that is quite intrusive.

 - Kristian.



Re: [Maria-discuss] Reg replication and commit

2016-10-14 Thread Kristian Nielsen
Karthick Subramanian  writes:

> Below when I try at Slave DB:
>
> MariaDB [dr_repl]> select * from test_dr_repl;
> Empty set (0.00 sec)
>
> MariaDB [dr_repl]> commit;
> Query OK, 0 rows affected (0.00 sec)
>
> MariaDB [dr_repl]> select * from test_dr_repl;
> +----+------+
> | id | val  |
> +----+------+
> |  1 |    1 |
> |  2 |    2 |
> |  3 |    3 |
> +----+------+
> 3 rows in set (0.00 sec)

I wasn't really able to fully understand your explanation of your problem.

However, the above suggests you have an open transaction with isolation
level REPEATABLE READ. This is the only situation I can think of where a
COMMIT will affect the visibility of other rows. When you open a transaction
with REPEATABLE READ (with BEGIN, or with autocommit off), no new changes
will be visible until COMMIT or ROLLBACK. This is a basic feature of InnoDB
transactions, independent of replication.

 - Kristian.



Re: [Maria-discuss] Semi-sync replication hangs when changing binlog filename.

2016-08-16 Thread Kristian Nielsen
Joseph Glanville  writes:

> This fixes the problem for me. How do we go about getting this into a release?

I can push it. It should go into 10.1, I think (this code is not in 10.0).

Thanks for testing!

 - Kristian.



Re: [Maria-discuss] Semi-sync replication hangs when changing binlog filename.

2016-08-15 Thread Kristian Nielsen
Pavel Ivanov  writes:

> binlog ending at position mariadb-bin.000004:2039896, somehow the
> function ReplSemiSyncMaster::commitTrx() gets trx_wait_binlog_name =
> 'mariadb-bin.000005' and trx_wait_binlog_pos = 2039896. I.e. the
> function gets the position of the transaction to wait semi-sync ack
> for correctly, but the file name is already the one that is current
> after rotation. Master starts waiting for that position, but the slave

> Kristian, do you have any idea what's going on? Is there an
> inappropriate lock release/re-acquire somewhere?

Hm. Actually, looking into MYSQL_BIN_LOG::trx_group_commit_leader, this
looks suspicious:

RUN_HOOK(binlog_storage, after_flush,
(current->thd,
 current->cache_mngr->last_commit_pos_file,
 current->cache_mngr->last_commit_pos_offset, synced,
 first, last))

But

RUN_HOOK(binlog_storage, after_sync,
 (current->thd, log_file_name,
  current->cache_mngr->last_commit_pos_offset,
  first, last))

I would have expected that `log_file_name' to be also
current->cache_mngr->last_commit_pos_file, like in the first instance. And
in fact, it looks like (with my limited knowledge of semi-sync) that this
suspicious case is exactly the AFTER_SYNC which fails, while AFTER_COMMIT
works...

So maybe try the below patch?

Pavel, what do you think, do you agree that this patch should be better?

 - Kristian.

diff --git a/sql/log.cc b/sql/log.cc
index 7efec98..b77a6b3 100644
--- a/sql/log.cc
+++ b/sql/log.cc
@@ -7712,7 +7712,7 @@ MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
   last= current->next == NULL;
   if (!current->error &&
   RUN_HOOK(binlog_storage, after_sync,
-   (current->thd, log_file_name,
+   (current->thd, current->cache_mngr->last_commit_pos_file,
 current->cache_mngr->last_commit_pos_offset,
 first, last)))
   {


Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-08-10 Thread Kristian Nielsen
Rich,

Cool, thanks for the pointers, that looks very helpful. I'll try to see if I
can come up with something.

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-08-10 Thread Kristian Nielsen
Kristian Nielsen <kniel...@knielsen-hq.org> writes:

> Rich Prohaska <prohas...@gmail.com> writes:
>
>> Is TokuDB supposed to call the thd report wait for API just prior to a
>> thread about to wait on a tokudb lock?
>
> If I wanted to look into implementing this, do you have a quick pointer to
> where in the TokuDB code I could start looking? Like the place where lock
> waits are done? (I have not worked with the TokuDB source before, though I

I took just a quick look at the code, in particular lock_request.cc:

  int lock_request::start(void) {
      txnid_set conflicts;

      r = m_lt->acquire_write_lock(m_txnid, m_left_key, m_right_key,
                                   &conflicts, m_big_txn);
      if (r == DB_LOCK_NOTGRANTED) {

It seems to me that at this point in the code, what is required is to call
thd_report_wait_for() on each element in the set conflicts, and that should
be about it.

Some mechanism will be needed to get from TXNID to THD, of course. A more
subtle problem is how to ensure that those THDs cannot go away while
iterating? I'm not familiar with what kind of inter-thread locking is used
around TokuDB row locks.

But it looks like a proof-of-concept patch for TokuDB optimistic parallel
replication might be fairly simple to do.

I also noticed that TokuDB does not support handlerton->kill_query() (so
KILL cannot break a TokuDB row lock wait). That should be fine, the KILL
will be handled when the wait finishes (or if _all_ transactions are waiting
on the row locks of each other, then a normal TokuDB deadlock detection will
handle things).

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-08-10 Thread Kristian Nielsen
Rich Prohaska  writes:

> Is TokuDB supposed to call the thd report wait for API just prior to a
> thread about to wait on a tokudb lock?

If I wanted to look into implementing this, do you have a quick pointer to
where in the TokuDB code I could start looking? Like the place where lock
waits are done? (I have not worked with the TokuDB source before, though I
am somewhat familiar with the concept of how it works.)

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-08-09 Thread Kristian Nielsen
Rich Prohaska  writes:

> Is TokuDB supposed to call the thd report wait for API just prior to a
> thread about to wait on a tokudb lock?

Yes, that's basically it.

Optimistic parallel replication runs transactions in parallel, but enforces
that they commit in the original order. So suppose we have transactions T1
followed by T2 in the replication stream, and that they try to update the
same row. When T2 gets ready to commit, it needs to wait for T1 to commit
first (this is what you see in wait_for_prior_commit()). However, if T1 is
waiting on a row lock held by T2, we have a deadlock.

thd_report_wait_for() checks for this condition. If a transaction goes to
wait on a lock held by a later (in terms of in-order replication)
transaction, the later transaction is killed (using the normal thread kill
mechanism). Parallel replication then gracefully handles the kill (by
rollback and retry).
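
The rule can be sketched in a few lines of Python. This is only a toy model
of the idea, not the server code: the function name is invented, and
transactions are represented simply by their position in the replication
stream (1 commits first), which is an assumption made for the example.

```python
def transactions_to_kill(wait_edges):
    """Decide which transactions must be killed and retried.

    wait_edges is an iterable of (waiter, holder) pairs: `waiter` is about
    to wait on a row lock owned by `holder`. Both are positions in the
    replication stream, so a smaller number must commit first.
    """
    to_kill = set()
    for waiter, holder in wait_edges:
        if holder > waiter:
            # A later transaction blocks an earlier one. The later one
            # cannot commit before the earlier one, so without intervention
            # this would deadlock. Kill the later one; it is retried later.
            to_kill.add(holder)
        # If holder < waiter, the waiter just waits normally: the earlier
        # transaction will commit first and release its locks.
    return to_kill
```

In the T1/T2 example, T1 waiting on a lock held by T2 is the edge (1, 2), and
the result is that T2 gets killed; the opposite edge (2, 1) is an ordinary
lock wait and nothing is killed.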

You can see in storage/xtradb/lock/lock0lock.cc how this is done for
InnoDB/XtraDB, eg. lock_report_waiters_to_mysql().

Hopefully it would be easy to hook this into TokuDB. It does require being
able to locate the transaction (and in particular the THD) that owns a given
lock. Another potential issue (at least it was for InnoDB/XtraDB) is that
thd_report_wait_for() can call back into the handlerton->kill_query method,
so the caller of thd_report_wait_for() needs to be prepared for this to
happen.

Note that we can modify/extend the thd_report_wait_for() API to work better
for TokuDB, if necessary. The current API was deliberately left "internal"
(not a service with a public header file etc.) in anticipation that it might
need changing to better support other storage engines, such as TokuDB.

Also note that the call to thd_report_wait_for() does not need to happen
"just prior" to the lock wait - it can happen later, as long as it happens
at some point (though of course the earlier the better, in terms of more
quickly resolving the deadlock and allowing replication to proceed).

> I have been running sysbench oltp with a mariadb 10.1 master-slave
> topology.  I have not seen any replication errors when slave parallel mode
> is conservative.

No, it should not happen, because in conservative mode transactions are not
run in parallel on a slave unless they ran without lock conflicts on the
master (both transactions reached the commit point at the same time).

But in InnoDB/XtraDB, there are some interesting (but very rare) corner
cases where two transactions may or may not have lock conflicts depending on
the exact order of execution. So for these cases, the thd_report_wait_for()
mechanism is also needed.

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-07-15 Thread Kristian Nielsen
Do you have a test case that can be used to repeat the bug?

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-07-15 Thread Kristian Nielsen
jocelyn fournier  writes:

> Thanks for the quick answer! I wonder if it would be possible to
> automatically disable the optimistic parallel replication for an
> engine if it does not implement it ?

That would probably be good - though it would be better to just implement
the necessary API, it's a very small change (basically TokuDB just needs to
inform the upper layer of any lock waits that take place inside).

However, looking more at your description, you got a "key not found"
error. Not implementing the thd_report_wait_for() could lead to deadlocks,
but it shouldn't cause key not found. In fact, in optimistic mode, all
errors are treated as "deadlock" errors, the query is rolled back, and
run again, this time not in parallel.

So I'm wondering if there is something else going on. If transactions T1 and
T2 run in parallel, it's possible that they have a row conflict. But if T2
deleted a row expected by T1, I would expect T1 to wait on a row lock held
by T2, not get a duplicate key error. And if T1 has not yet inserted a row
expected by T2, then T2 would be rolled back and retried after T1 has
committed. The first can cause deadlock, but neither case seems to cause
duplicate error.

Maybe TokuDB is doing something special with locks around replication, or
something else goes wrong. I guess TokuDB just hasn't been tested much with
parallel replication.

Does it work ok when running in conservative parallel mode?

 - Kristian.



Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

2016-07-15 Thread Kristian Nielsen
jocelyn fournier  writes:

> After upgrading from TokuDB Enterprise with MariaDB 5.5 to MariaDB
> 10.1.14, I tried to enable the parallel replication
> (parallel_mode=optimistic, slave_parallel_threads=4) on a GTID enabled

> Is this a known limitation with TokuDB ?

Yes. TokuDB does not (to my knowledge) implement the thd_report_wait_for()
API, which is what makes optimistic parallel replication work.

 - Kristian.



Re: [Maria-discuss] Long email about a replication issue

2016-06-01 Thread Kristian Nielsen
 writes:

> This weekend I had to repair a replication problem on one of our
> clusters. I've attempted to get to the root cause but not sure where I

Is this pure MariaDB replication, or is it Galera? I think it is the former,
but the term "cluster" is somewhat overloaded, which is why I ask...

> The setup is M1<->M2 (with attached slaves). M1 is the active master
> receiving all writes. Access is controlled through an F5 and I don't
> think any errant transactions have occurred on the inactive master
> (M2). I've checked this by grepping the binlogs for the M2 server_id.
>
> The initial associated record that broke replication was attached to a
> "user" table record. This user was created on Friday at
> 16:21PM. Replication broke around 11:30PM that night. The user record
> had a GTID of GTID 0-1-36823254 (recovered from M1)
>
> I've looked into the appropriate binlog from M2...

> If I grep for the specific GTID on M2 I get nothing...

> If I grep for this record by email address I also get nothing. So I
> must conclude this record (and a bunch of others), never got to master

> until replication broke due to the FK errors. You would expect
> replication to break here because of a gap in the GTIDs. This did not
> happen and I'm almost certain that GTID replication could not have
> been deactivated and the positions messed around with.

Yeah, even if the slave was set to MASTER_USE_GTID=no, the GTIDs should
still have been there in the M2 binlog.

> I'm unsure of where to go now. Any ideas? Any thoughts are appreciated.

I guess you need to figure out why M2 did not apply those transactions. Some
suggestions:

 - Check the error log on M2 for disconnect/reconnects around the time of
   the transactions that are missing (or any disconnects/reconnects). Such
   messages should also say at what position M2 disconnected and
   reconnected, this could be compared to the problem GTID. This could show
   if transactions were skipped because of reconnecting at a wrong position.

 - Also check for local slave stop/start messages in the M2 error log, to see
   if anything looks related or could indicate changes in the replication
   config (most replication changes require stopping the slave threads).

 - You can also check the binlog on M1 for any out-of-order GTIDs, which
   could cause problems at slave reconnect (seems unlikely though).

 - Replication filtering could cause this - double-check that no filtering
   was turned on or something. Also stuff like --gtid-ignore-duplicates.

Good luck,

 - Kristian.



Re: [Maria-discuss] HikariCP + searching archives

2016-04-16 Thread Kristian Nielsen
Stu Smith  writes:

>   I'm curious if there is a way to search the archives for this list - I

One way is to use google with an additional search term:

  site:lists.launchpad.net/maria-discuss

This restricts the search to the archives. The maria-developers@ list can be
similarly searched by adding a search term:

  site:lists.launchpad.net/maria-developers

Hope this helps,

 - Kristian.



Re: [Maria-discuss] Collations, trailing spaces and unique indexes

2016-03-11 Thread Kristian Nielsen
Binarus  writes:

> "All MySQL collations are of type PADSPACE. This means that all CHAR,
> VARCHAR, and TEXT values in MySQL are compared without regard to any
> trailing spaces. “Comparison” in this context does not include the

Yes, I have always found this terminally stupid as well. But I think it
comes from the SQL standard.

The only workaround I know of is to use VARBINARY instead of VARCHAR. I
think it works much the same in most respects. But obviously some semantics
is lost when the server no longer is aware of the character set used.
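
As a small illustration of the difference, here is a Python sketch of the two
comparison behaviours. It is a simplified model with invented function names:
it assumes UTF-8 encoding, and it ignores everything else a real collation
does (case and accent folding, for instance). It only shows the treatment of
trailing spaces.

```python
def pad_space_equal(a: str, b: str) -> bool:
    """Compare the way a PAD SPACE collation does: trailing spaces are
    ignored, everything else must match."""
    return a.rstrip(" ") == b.rstrip(" ")


def binary_equal(a: str, b: str) -> bool:
    """Compare the way VARBINARY does: byte for byte, so trailing spaces
    are significant."""
    return a.encode("utf-8") == b.encode("utf-8")


def violates_unique(existing_values, new_value, equal):
    """True if inserting new_value would clash with an existing value under
    the given equality rule (a stand-in for a unique index check)."""
    return any(equal(old, new_value) for old in existing_values)
```

Under the first rule, 'abc' and 'abc  ' collide in a unique index; under the
second they are two distinct values. Leading spaces are significant in both.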

> Since the index behaviour obviously depends on the collation, would
> building an own collation which does not PADSPACE be an option? I have

That would be interesting, actually. I don't know what support there is for
non-PADSPACE collations. Maybe bar knows (Cc:'ed)?

 - Kristian.



Re: [Maria-discuss] The insert performance issue

2016-03-10 Thread Kristian Nielsen
林澤宇  writes:

> I use a file containing 100000 insert statements to test the insert
> performance.
> On the local server, MariaDB spent about 20 seconds inserting the data; but on
> the remote server, MariaDB spent about 30 seconds inserting the data.

> Why does MariaDB have a gap of about 10 seconds?
> Maybe the network causes some latency, but it should not take
> so long.

Why do you think 10 seconds is unexpected?

If you are sending 100000 individual queries from a single client thread,
that is 100000 network roundtrips. Extra 10 seconds for that seems
reasonable, in fact it sounds quite fast to me (0.1 ms network roundtrip).

If you want better performance on remote server, try sending many statements
in one roundtrip using multi-statement protocol, or running many queries in
parallel from the client to overlap the roundtrips.
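
A back-of-the-envelope version of this estimate, as a small Python sketch.
The function names and the batch_size/connections parameters are invented for
illustration, and the model only counts time spent waiting on the network; it
ignores server-side execution time entirely. The point is just that 10 extra
seconds at 0.1 ms per roundtrip corresponds to about 100000 sequential
roundtrips, and that batching or parallelism shrinks the wait proportionally.

```python
def implied_roundtrips(extra_seconds, roundtrip_ms):
    """How many sequential roundtrips explain the extra wall-clock time."""
    return round(extra_seconds * 1000 / roundtrip_ms)


def network_wait_seconds(statements, roundtrip_ms, batch_size=1, connections=1):
    """Rough network wait for sending `statements` queries.

    batch_size:  statements sent per roundtrip (multi-statement batching)
    connections: client connections working in parallel
    """
    # Integer ceiling division: a partial batch still costs one roundtrip.
    roundtrips = -(-statements // batch_size)
    # Each connection handles its share of the roundtrips one after another.
    per_connection = -(-roundtrips // connections)
    return per_connection * roundtrip_ms / 1000.0
```

For example, network_wait_seconds(100000, 0.1) gives about 10 seconds, while
batch_size=100 or connections=10 bring the same workload down to roughly 0.1
and 1 second of network wait respectively.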

 - Kristian.



Re: [Maria-discuss] replicate_rewrite_db as a system variable

2015-11-27 Thread Kristian Nielsen
Ian Gilfillan  writes:

> Is there a reason replicate_rewrite_db is not available as a system
> variable, while other similar settings, such as replicate_do_db are?

I don't know of any specific reason, except that it is not
implemented. Originally, none of these variables could be changed
dynamically. Davi Arnaut (if I remember correctly) implemented making the
filtering ones dynamic. replicate_rewrite_db was not part of that patch.

 - Kristian.



Re: [Maria-discuss] Enabling feedback pluging for MariaDB 10.1.4

2015-03-09 Thread Kristian Nielsen
Michael Widenius <mo...@askmonty.org> writes:

> for the alpha so I suggested Sergei today that we should enable it for
> the beta period of MariaDB 10.0

(10.*1* beta, I guess?)

> As most MariaDB users should know, the feedback is totally anonymous
> and no private or sensitive information is being sent.
>
> Any comments, suggestions or recommendations?

I think it is a bad idea. Please do not do it.

Phone-home is a misfeature in any product, and even more so in system
software like a database.

And besides, the information is much less useful than you think, because of
unknown, but probably extreme, data skew. In fact, it will probably be more
harmful than useful because people will use bad data to justify bad
decisions.

Experience supports this point of view with our download numbers. They do not
include apt-get / yum / etc. installations, which judging from IRC
conversations are the majority. Yet people continually refer to them as though
they mean anything, just because they are there.

 - Kristian.



Re: [Maria-discuss] The relay-log is not fluashed after the slave-relay-log.999999 showed

2015-01-05 Thread Kristian Nielsen
Gmail <next1...@gmail.com> writes:

> And, as I mentioned in the title of this question, the relay-log is not
> flushed after the slave-relay-log.999999 showed when using the
> Slave_parallel_threads:10 setting, as shown below.
>
> - binlog_format: ROW
> - Slave_parallel_threads:10
>
> Everything is working fine except that the slave-relay-log.** files
> continue to exist on the disk, which will finally cause the disk to fill up.
> If I change the value of the Slave_parallel_threads setting from 10 to 0,
> the log will be flushed. However, PK duplicate warning error logs come
> next.

Ok, thanks for reporting this. It's probably a bug that parallel replication
behaves differently from non-parallel. I'll try to look into it when I have
time.

I could imagine that there are more bugs lurking when the log counter
overflows... I'm not sure this is well tested. I wonder what the correct
behaviour is? Should it just continue with slave-relay-log.100 ?

Thanks,

 - Kristian.



Re: [Maria-discuss] New Question: Multi Master Replication

2014-03-27 Thread Kristian Nielsen
AskMonty KB nore...@askmonty.org writes:

 What I am having a problem with is if I add a new master.  When I add the 
 master the data in the slave table is truncated and only the data from the 
 new master is replicated.  I lose all my old data in the slave.

The obvious guess is that the new master has DROP TABLE t1300,t13 in its
binlog. If this is the problem, then it can be solved by for example changing
the replication position to skip those events, or maybe by using replication
filters.

 - Kristian.



Re: [Maria-discuss] Question about MariaDB Non-blocking Client

2013-08-07 Thread Kristian Nielsen
Mike Gibson notmikegib...@gmail.com writes:

 Greetings,

Hi, sorry for taking a few days to get back to you:

 I'm using the MariaDB Non-blocking API to write a C++ client, but I've hit
 a wall regarding connection checking. I've been referencing the binding
 from node-mariasql (
 https://github.com/mscdex/node-mariasql/blob/master/src/binding.cc).

Ok, great that you can use the non-blocking API!

 The problem I'm experiencing is if I have a connection and kill the server
 (/etc/init.d/mysql stop) and then start it back (/etc/init.d/mysql start),
 I can never get clean reconnects. It's usually a mixture of errcode 2058,
 2003, 2013.

 I'm really confused how to gracefully manage connections. Before I was
 using the official MySQL C++ connector, and it provides a
 connection->isConnected() method. I'm wondering how I can get something
 similar with MariaDB's client, as I need the non-blocking interface.

I am not familiar with the MySQL C++ connector. I tried downloading the source
for mysql-connector-c++-1.1.3.tar.gz, but it does not seem to have any
isConnected() method?

Anyway, the usual way to do this is to issue a mysql_ping() to the server, to
check if the connection is working. With the non-blocking API, you would do
this with mysql_ping_start() and mysql_ping_cont(). This will issue a request
to the server to check if the connection is ok. If you have autoconnect
enabled (MYSQL_OPT_RECONNECT), then this will automatically reconnect if the
old connection was broken for some reason.

(Note that you will need to be aware of the usual issues with automatic
reconnect. For example, even if ping is successful, the connection may break
immediately afterwards. And an automatic reconnect will lose any existing state
on the connection such as temporary tables, SET @user_var, BEGIN, etc.)

 Given: MYSQL mysql, *mysqlHandle;

 Look at the mysql_real_connect_start() or cont() functions. I provide MYSQL
 struct and on connect I get a copy of the struct stored in *mysqlHandle.
 It's not clear to me what the purpose of the copy is at this point as I
 still use the initial struct to call query().

Agreed, it is not very useful; it is just a copy of the pointer to your own
structure. You only need it to check for errors (in which case the pointer will
be NULL). After that you can just use the pointer to your own struct that you
already have.

 In the case that I can detect a disconnection, how do I properly clean up
 the connection and attempt reconnect? Do I mysql_close(mysql) and/or
 mysql_close(mysqlHandle), shutdown/close the file descriptor, mysql_init()
 a new handle and go through mysql_real_connect_start/cont()?

I am not sure, but it seems to me from looking at the code that if you already
got an error that the connection was closed, then you can just go through the
mysql_real_connect_start/cont() sequence again to reconnect. But if that does
not work, you can always mysql_close() and mysql_init() your struct again.
There is no need to explicitly shutdown or close the file descriptor.

 Does it even make sense for each object to have its own MYSQL struct that I
 mysql_init(), or would it be better to have layer on top that mysql_init()s
 a single MYSQL struct, connects, and passes the returned *mysqlHandle to
 each query?

This depends on your application. You can only have a single query in progress
on one MYSQL struct at a time. So if you have multiple queries working at the
same time on the server, you need one MYSQL struct for each. If you only have
one query processing at a time, a single MYSQL struct will be sufficient.

 Thanks for providing the async client, any help is appreciated.

I hope this helps, though it is of a somewhat generic nature. If you need more
details, please ask again, perhaps with some example code that shows your
problems.

 - Kristian.



Re: [Maria-discuss] openbsd switched libmysqlclient back to older mysql one

2013-06-13 Thread Kristian Nielsen
Colin Charles co...@montyprogram.com writes:

 I noticed this:
   
 http://www.openbsd.org/cgi-bin/cvsweb/ports/databases/mariadb/Makefile?rev=1.4

 Revert back to using MySQL 5.1 for the time being. MariaDB 5.5 introduces
 a new libmysqlclient non-blocking API which utilizes co-routines. The X86
 specific GCC ASM co-routine support hid the fact that there was an issue.
 The only fallback code so far is POSIX user contexts which OpenBSD does not
 support.

 Is this something we can fix to ensure that OpenBSD users can continue to use 
 our libmysqlclient?

The patch below disables the non-blocking API (all the calls will return
error) if no co-routine support is available.

But you will have to find someone else to:

 - Implement the CMake check (HAVE_UCONTEXT)

 - Fix the test suite to not attempt to use the non-blocking API when it is
   not available

 - Test this on various platforms

 - Merge it up from 5.5 through 10.0

 - Document it

(I am totally occupied dealing with our mad feature race in replication and
will not have time to clean this up).

To actually support the non-blocking client API on a given platform requires
co-routine support. The best option is asm stubs, but that needs to be done
anew for each architecture. There appears to be a way using sigaltstack
(MDEV-4601), but it's rather hackish. Another portable fallback possibility is
pthreads, but that is even more hackish. All of them require some work to
integrate, though the meat of it should be possible to take from other
software that has been fixed for *BSD (for example qemu). Only the code in
my_context.c and my_context.h needs to be modified.
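
For readers unfamiliar with the POSIX user contexts mentioned in the quoted OpenBSD commit message, here is a minimal sketch of how spawn / yield / continue can be built on getcontext(), makecontext() and swapcontext(). All names (sketch_context, sketch_spawn, sketch_continue, sketch_yield, demo_body) are hypothetical and chosen for illustration; this is not the code in my_context.c, and the fixed 64 KB stack and the global hand-over variable are simplifying assumptions.

```c
#include <stddef.h>
#include <ucontext.h>

/* All names below are hypothetical; this is not the code in my_context.c. */
struct sketch_context {
  ucontext_t caller;           /* context to resume on yield or finish */
  ucontext_t co;               /* the co-routine's own context */
  char stack[64 * 1024];       /* private stack for the co-routine */
  void (*func)(void *);
  void *arg;
  int active;                  /* 1 until func() has returned */
};

/* makecontext() only passes int arguments portably, so hand the context
   to the trampoline through a variable (not thread-safe, sketch only). */
static struct sketch_context *starting;

static void trampoline(void)
{
  struct sketch_context *c = starting;
  c->func(c->arg);
  c->active = 0;               /* returning now follows uc_link to caller */
}

/* Run func(arg) on its own stack until it yields or returns.
   Returns 1 if it yielded, 0 if it finished, -1 on error. */
int sketch_spawn(struct sketch_context *c, void (*func)(void *), void *arg)
{
  if (getcontext(&c->co) != 0)
    return -1;
  c->co.uc_stack.ss_sp = c->stack;
  c->co.uc_stack.ss_size = sizeof(c->stack);
  c->co.uc_link = &c->caller;
  c->func = func;
  c->arg = arg;
  c->active = 1;
  makecontext(&c->co, trampoline, 0);
  starting = c;
  if (swapcontext(&c->caller, &c->co) != 0)
    return -1;
  return c->active;
}

/* Resume a suspended co-routine; same return values as sketch_spawn(),
   plus -1 if the co-routine has already finished. */
int sketch_continue(struct sketch_context *c)
{
  if (!c->active)
    return -1;
  if (swapcontext(&c->caller, &c->co) != 0)
    return -1;
  return c->active;
}

/* Called from inside the co-routine to suspend it. */
int sketch_yield(struct sketch_context *c)
{
  return swapcontext(&c->co, &c->caller) != 0 ? -1 : 0;
}

/* Example co-routine body: does one step, suspends, then does another. */
struct demo_state {
  struct sketch_context ctx;
  int step;
};

void demo_body(void *arg)
{
  struct demo_state *s = arg;
  s->step = 1;
  sketch_yield(&s->ctx);       /* e.g. "the socket would block" */
  s->step = 2;
}
```

A caller would invoke sketch_spawn(&s.ctx, demo_body, &s), get 1 back while the body is suspended, and call sketch_continue(&s.ctx) until it returns 0. A platform without these ucontext functions has no such fallback, which is why the disabled variant simply returns an error from every call.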

 - Kristian.

=== modified file 'client/mysqltest.cc'
--- client/mysqltest.cc 2013-04-17 17:42:34 +
+++ client/mysqltest.cc 2013-06-13 09:16:50 +
@@ -5933,7 +5933,8 @@ void do_connect(struct st_command *comma
     mysql_options(con_slot->mysql, MYSQL_OPT_CONNECT_TIMEOUT,
                   (void *) &opt_connect_timeout);
 
-  mysql_options(con_slot->mysql, MYSQL_OPT_NONBLOCK, 0);
+  if (mysql_options(con_slot->mysql, MYSQL_OPT_NONBLOCK, 0))
+    die("Failed to initialise non-blocking API");
   if (opt_compress || con_compress)
     mysql_options(con_slot->mysql, MYSQL_OPT_COMPRESS, NullS);
   mysql_options(con_slot->mysql, MYSQL_OPT_LOCAL_INFILE, 0);

=== modified file 'include/my_context.h'
--- include/my_context.h 2012-02-23 14:42:21 +
+++ include/my_context.h 2013-06-13 09:20:28 +
@@ -31,8 +31,10 @@
 #define MY_CONTEXT_USE_X86_64_GCC_ASM
 #elif defined(__GNUC__) && __GNUC__ >= 3 && defined(__i386__)
 #define MY_CONTEXT_USE_I386_GCC_ASM
-#else
+#elif defined(HAVE_UCONTEXT)
 #define MY_CONTEXT_USE_UCONTEXT
+#else
+#define MY_CONTEXT_DISABLE
 #endif
 
 #ifdef MY_CONTEXT_USE_WIN32_FIBERS
@@ -103,6 +105,13 @@ struct my_context {
 };
 #endif
 
+
+#ifdef MY_CONTEXT_DISABLE
+struct my_context {
+  int dummy;
+};
+#endif
+
 
 /*
   Initialize an asynchroneous context object.

=== modified file 'mysys/my_context.c'
--- mysys/my_context.c  2012-11-04 21:20:04 +
+++ mysys/my_context.c  2013-06-13 09:00:28 +
@@ -726,3 +726,37 @@ my_context_continue(struct my_context *c
 }
 
 #endif  /* MY_CONTEXT_USE_WIN32_FIBERS */
+
+#ifdef MY_CONTEXT_DISABLE
+int
+my_context_continue(struct my_context *c)
+{
+  return -1;
+}
+
+
+int
+my_context_spawn(struct my_context *c, void (*f)(void *), void *d)
+{
+  return -1;
+}
+
+
+int
+my_context_yield(struct my_context *c)
+{
+  return -1;
+}
+
+int
+my_context_init(struct my_context *c, size_t stack_size)
+{
+  return -1;  /* Out of memory */
+}
+
+void
+my_context_destroy(struct my_context *c)
+{
+}
+
+#endif




Re: [Maria-discuss] WL#4925 and multi-source replication

2013-03-15 Thread Kristian Nielsen
Greg skyg...@gmail.com writes:

 With kill -9 mysqld, and sync_binlog=0, I'm not really surprised since
 mysql will not fdatasync after each commit, right ?

Right. So this is mostly just my own academic interest, in practice it is of
course real crashes/powerfailures we want to handle, not SIGKILL.

If you are interested, this is my thinking: the server always does a write(2)
system call on the binlog at (group) commit, even with sync_binlog=0. So even
if we SIGKILL the server, the data is still in the kernel buffers (at least on
Linux), and will eventually reach disk.

However, you are using DRBD. I am guessing that when mysqld on one node dies,
a failover is done to the other node, and this loses data in the kernel disk
buffers on the first node that have not been fdatasync()'ed.
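
The distinction between write(2) and fdatasync() discussed above can be sketched in a few lines of C. The function name append_record and its durable flag are hypothetical, picked only for illustration; this is not the server's binlog code. Without the sync, the data sits in the kernel page cache: it survives a SIGKILL of the process on the same host, but not a kernel crash, a power failure, or a failover to a node that never saw those buffers.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Append one record to fd (hypothetical helper, for illustration only).
   durable == 0: write() only, data reaches the kernel page cache.
   durable != 0: additionally fdatasync(), data is forced to stable storage,
                 loosely analogous to running with sync_binlog=1.
   Returns 0 on success, -1 on error. */
int append_record(int fd, const char *rec, int durable)
{
  size_t len = strlen(rec);
  size_t done = 0;

  while (done < len) {
    ssize_t n = write(fd, rec + done, len - done);
    if (n < 0) {
      if (errno == EINTR)
        continue;              /* interrupted before writing, retry */
      return -1;
    }
    done += (size_t) n;        /* handle short writes */
  }

  if (durable && fdatasync(fd) != 0)
    return -1;
  return 0;
}
```

Calling append_record(fd, "commit-1\n", 0) corresponds to the sync_binlog=0 case, and append_record(fd, "commit-2\n", 1) to the synced case.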

So I learned a bit about DRBD, thanks ;-)

 - Kristian.



Re: [Maria-discuss] WL#4925 and multi-source replication

2013-03-14 Thread Kristian Nielsen
Greg skyg...@gmail.com writes:

 I didn't think that sync_binlog=1 is required anymore for safe replication.

 We are always using group commit in MariaDB 10.0 for the master, so
 the binary log will be synced for each group commit, which is safe.


 I have to use it in a DRBD config. When testing this config, I killed
 mysqld with SIGKILL and sync_binlog=0, and the failover started the binary log
 at an old position, which caused duplicates on slaves.

 With sync_binlog=1, this happens no more.

 How to configure how often group commits are fdatasynced ?

Correct, sync_binlog=1 is still required to ensure consistency between binlog
and innodb. With sync_binlog=1, each group commit is fdatasynced (with a
single binlog fdatasync per group), no further configuration is needed.

You also need innodb_flush_log_at_trx_commit=1, of course.

I am a bit surprised that you got duplicates with SIGKILL of mysqld. I would
have expected crashing the OS kernel (ie. power failure) would be needed for
fdatasync to make any difference?

 - Kristian.



Re: [Maria-discuss] testing Galera

2013-02-12 Thread Kristian Nielsen
Jan Kirchhoff j.kirchh...@logical-line.de writes:

 Is there something like slave_skip_counter, aka "I know what I do, skip
 that update"? I think I have to take a new snapshot to get the second

Yes, MariaDB has this:


https://kb.askmonty.org/en/selectively-skipping-replication-of-binlog-events/

If you set skip_replication=1 (instead of SQL_LOG_BIN=0) and
replicate-events-marked-for-skip=FILTER_ON_MASTER on all servers, then the
changes will be logged to the binlog (for Galera to use), but will not be sent
to other slaves using traditional replication.

Disclaimer: I do not have much experience with Galera, much less actually
tried using it with @@skip_replication, but it should work, I think.

Note that this is a MariaDB feature (it will not work with Galera based on
MySQL or Percona-server).

 - Kristian.



Re: [Maria-discuss] mariasql client question (zappajs, coffeescript, nodejs)

2013-02-05 Thread Kristian Nielsen
klrumpf klru...@gmail.com writes:

 I now want to switch to the mariasql client...

Cool, that is an interesting project ...

 res.on('row', (row) ->
   console.log 'Result row: ' + inspect(row)

 @get '/': (req,res) -> # LANDING PAGE)
    result = doSql("select pakz from pkt where pktnr=10001")
    console.log "919, result = ", result

 Console log in the doSql functions shows the results correctly but I
 have been
 playing around for a while and I can't get the function to return the
 result for
 further processing to the @get routine. Seems like I have not understood

I assume you mean that

 - You get output lines with "Result row: ..."

 - You get no line "919, result = ", or it is empty.

Unfortunately, there may not be many on this list who are familiar with
node.js.

I have only general knowledge about event-driven programming. This leads me to
think that you need to pass a handler to your doSql() function and invoke this
handler in your res.on() handler. But I frankly admit I do not know if
something else could work with node.js.

 Thanks, hope this is the right list for this, Karl

You are welcome to try :-). Maybe someone with knowledge of node.js will
answer. But it might not be the best list, your problem looks more related to
general node.js than to MariaDB.

Hope this helps (a little),

 - Kristian.



Re: [Maria-discuss] MariaDB 5.6

2011-04-12 Thread Kristian Nielsen
Adam M. Dutko dutko.a...@gmail.com writes:

 Does the following item under the Performance section indicate
 optimizations for AMD chips are not being added?

  Better multi CPU performance above 16 cores (Work with Intel)

No (very unlikely).

The AMD and Intel x86_64 chips are quite similar, and most optimisations will
help either, certainly optimisations aimed at scaling to many cores.

The "Work with Intel" probably refers to the fact that Intel is helping the
work by donating many-core servers to MariaDB developers, and maybe with
advice/discussions.

 - Kristian.



Re: [Maria-discuss] MariaDB 5.6

2011-04-09 Thread Kristian Nielsen
Peter Laursen peter_laur...@webyog.com writes:

 I noticed this on planet.mysql: http://kb.askmonty.org/v/plans-for-56

 I *again* strongly want to discourage a major version number identical with
 a MySQL/Oracle release.  MySQL plans a 5.6 too and I believe that there is
 already a source-tree available on launchpad. I think I understand that
 MariaDB 5.6 is planned to use MySQL 5.5 codebase.  Am I correct?

I don't think we would use 5.6 for MariaDB based on MySQL 5.5 (as opposed to
MySQL 5.6) codebase. That would be confusing indeed.

MariaDB 5.6 would be a version that included MySQL 5.6. If we need to release
what we now call 5.6 before MySQL 5.6 is released/stable, I think we would
need to come up with something else...

Similarly, the next release of MariaDB will be called 5.5 or 5.3, depending on
whether we decide to include MySQL 5.5 or not.

 I have now posted this complaint 2 or 3 times (including the times I
 complained about the use of 5.2 for a MariaDB based on MySQL 5.1 code, as
 there is also a (now abandoned) MySQL 5.2 tree).  I never had a reply from a
 MariaDB 'decisionmaker'.  Could I at least request a reply this time?
 Please! :-)

I also do not think the current version scheme is perfect. However, do you
have a better suggestion?

The good thing about MariaDB 5.1, 5.2, 5.3, 5.5, 5.6, ... is that given
MariaDB X and MySQL Y, you can know that MariaDB X includes all of MySQL Y
(and thus can be used as a drop-in replacement) as long as X >= Y.

Can we do something better that solves your concerns, and still preserves this
nice property in some way?

 - Kristian.



Re: [Maria-discuss] FreeBSD BuildBot Slave - I can offer one

2011-04-07 Thread Kristian Nielsen
Jakob Lorberblatt ja...@native-intelligence.net writes:

 I noticed that you do not have a regular build of FreeBSD for either
 common architecture (x86_64 or i386) I would be able to provide that for
 you, I have equipment to spare and would be willing to lend this to the
 project for building assistance; in addition I am fairly familiar with
 FreeBSD compilation in general so I can also help with any issues that

Thanks, sounds great!

 arise. I can begin configuring this host for the service, if I could have
 some direction towards what needs to be done in order for the MariaDB
 project at large to be able to make use of it. I believe I used to have
 some bookmarks of resources regarding the matter. However I don't have the
 references off the top of my head.

Adam's advice should hopefully get you started.

I will setup an account for your machine (64-bit to start with?) and send
details in private mail.

 - Kristian.



Re: [Maria-discuss] MariaDB and Sun CoolThread

2011-04-04 Thread Kristian Nielsen
Alexandre Almeida alexan...@hti.com.br writes:

   As far as I can see MariaDB stay locked/running on a single
 virtual  CPU and MariaDB doesn't take advantage of CoolThread/CMT
 technologies,  I mean it can not run on more than one virtual CPU same
 time. Result:  poor performance.

   Anybody knows if there is something can I do to spread ;-)
 MariaDB  all over cpus? Am I missing something?

Each client connection is handled by its own thread, and a single query runs
on one cpu at a time. So to use all cpus your application needs to run many
queries in parallel, on separate connections.

 - Kristian.



Re: [Maria-discuss] Glibc 2.3 support in MariaDB builds ?

2010-12-19 Thread Kristian Nielsen
Jocelyn Fournier j...@doctissimo.fr writes:

 I'm trying to push my company to adopt MariaDB 5.2. Unfortunately, the
 currently available binaries (at least for the x86-64 version)
 requires a glibc 2.4, where we have only a 2.3 version (debian etch)
 installed on our server.
 MySQL 5.5 standard binaries, as well as Percona's one perfectly
 support glibc 2.3, so is there any plan to provide glibc 2.3
 compatible binaries for MariaDB ?

We used to have Debian etch binaries, however I removed them at some point
since Debian etch support ended half a year ago or so and there did not seem
much interest.

(If you want to help sponsor resurrecting binaries supporting such older
systems you could try contacting sa...@askmonty.org.)

Alternatively, you can build the binaries yourself, it is not hard. You
basically need to run these two commands:

BUILD/compile-bintar
scripts/make_binary_distribution

 - Kristian.



Re: [Maria-discuss] SphinxSE missing plug.in

2010-12-16 Thread Kristian Nielsen
Brian Evans grkni...@scent-team.com writes:

 MariaDB 5.2.4 tarball is missing the SphinxSE plug.in file in
 storage/sphinx.

 I'm not sure if this was intentional or not, so I thought I might
 point it out.

Oops :-(

Thanks a lot for pointing out this serious issue! I filed a bug for it
(https://bugs.launchpad.net/maria/+bug/691437) and will push a fix to be
included in 5.2.5.

 - Kristian.



Re: [Maria-discuss] my_atomic_add64

2010-08-17 Thread Kristian Nielsen
Oleg Tsarev zabiva...@gmail.com writes:

 I need my_atomic_add64 in mysql.

 Can i simple add following macros, or i need more advanced tricks?

 tsa...@main:/storage/project/percona/rtd_2$ diff -Nur
 ../rtd/c/include/my_atomic.h c/include/my_atomic.h
 --- ../rtd/c/include/my_atomic.h2010-07-09 16:35:11.0 +0400
 +++ c/include/my_atomic.h   2010-08-17 18:57:07.648819066 +0400

I think you'll also need

make_transparent_unions(32)
#define U_32   int32
#define Uv_32  int32


 @@ -96,25 +96,30 @@
  make_atomic_cas( 8)
  make_atomic_cas(16)
  make_atomic_cas(32)
 +make_atomic_cas(64)
  make_atomic_cas(ptr)

 ...

My guess is it should work, at least on 64-bit platforms.

I'm not sure that 32-bit CPUs generally provide 64-bit atomic operations. If
not (which seems likely, really), you'll need to come up with something to
handle this case. Note that the my_atomic stuff has the possibility to
fall back to mutex locking when support is not available, so one way might be
to make it use this fallback on 32-bit (taking a performance penalty, but
32-bit is getting less and less interesting by the day anyway).
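
A rough sketch of the fallback idea, outside of the my_atomic macro machinery: use a native 64-bit atomic where the compiler and platform are known to provide one, and a mutex otherwise. The function name sketch_atomic_add64 and the macro SKETCH_HAVE_ATOMIC64 are hypothetical, and the platform test (GCC-compatible compiler on an LP64 target) is a deliberately conservative assumption rather than what my_atomic.h actually checks.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical feature test: assume native 64-bit atomics only with a
   GCC-compatible compiler on a 64-bit (LP64) target. */
#if defined(__GNUC__) && defined(__LP64__)
#define SKETCH_HAVE_ATOMIC64 1
#endif

#ifndef SKETCH_HAVE_ATOMIC64
/* One global lock for the fallback path. This is only atomic with respect
   to other accesses that also go through this lock. */
static pthread_mutex_t sketch_add64_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

/* Add v to *a as one indivisible step and return the previous value
   (fetch-and-add semantics). */
int64_t sketch_atomic_add64(volatile int64_t *a, int64_t v)
{
#ifdef SKETCH_HAVE_ATOMIC64
  return __sync_fetch_and_add(a, v);   /* GCC builtin */
#else
  int64_t old;
  pthread_mutex_lock(&sketch_add64_lock);
  old = *a;
  *a = old + v;
  pthread_mutex_unlock(&sketch_add64_lock);
  return old;
#endif
}
```

The mutex path costs a lock round trip per addition, which is the performance penalty mentioned above, but it keeps the same interface on platforms without a 64-bit atomic instruction.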

 - Kristian.



Re: [Maria-discuss] Why isn't SO_SNDTIMEO used?

2010-03-17 Thread Kristian Nielsen
 On Wed, Mar 10, 2010 at 12:29 PM, Michael Widenius mo...@askmonty.org wrote:
 We also use the thr_alarm() functionality when one uses 'kill
 connection-id' in MySQL.  I don't know of any easy way to gracefully
 wake up a thread that is sleeping on SO_SNDTIMEO. Do you?

Well, I checked the code, and it seems to wake up the thread using
pthread_kill(thread, signal) for the 'kill connection-id' command. This should
work fine also when using SO_SNDTIMEO for timeouts on the socket.

Just send the signal to the thread blocking on the socket with SO_SNDTIMEO,
and the blocking socket call will return with EAGAIN or similar.

 - Kristian.
