Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-24 Thread Stewart Smith
Arjen Lentz ar...@openquery.com writes:
 Possibly a separate task, but have you guys considered supporting the
 Galera replication system for Drizzle?

It shouldn't be too much of a hassle... it's just (of course) a matter
of developer resources :)

 It may even be possible to have mixed Drizzle and MariaDB in a
 cluster.

It may be possible to do something like that... the devil is in the
details of course :)


-- 
Stewart Smith


___
Mailing list: https://launchpad.net/~drizzle-discuss
Post to : drizzle-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp


Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-22 Thread kuldeep porwal
Hello,

Andrew sir: Completely missed that. Thanks :) . The checksum (csum) feature was
introduced in libdrizzle last December, I think, so there are not many blog posts on it.

Stewart sir: If transaction.proto covers all possible DDL and DML changes
then we won't change it.


On Mon, Apr 22, 2013 at 11:26 AM, Andrew Hutchings
and...@linuxjedi.co.uk wrote:

 On 22/04/13 01:28, Stewart Smith wrote:

 kuldeep porwal 2591kuld...@gmail.com writes:

 If in future if we find some DDL inconsistency or any other issue that
 *may* require changing transaction.proto then we should modify it and keep
 that under different version. As this will obviously help in entirely
 independent working of our module and we don't affect Drizzle slave or
 applier at the same time.


 We should not need to modify transaction.proto at all as it can already
 be used to express all DDL and DML changes possible to apply to Drizzle.

  I wouldn't worry too much about it at this stage, we could attempt the SQL
  and just error out if it doesn't apply.
 Yeah great! I introduced Checksum and DDL heuristics just as a part of
 proposal. We have to create basic prototype first then we will keep on
 improving it.


 There shouldn't be any place to add in checksum, we can
 support/notsupport the MySQL binlog checksum for reading.


 And Libdrizzle 5.1 already supports that in its binlog API :)

 Kind Regards
 --
 Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/




-- 
Regards,
Kuldeep Porwal
IIIT Hyderabad
09550605256
http://web.iiit.ac.in/~kuldeep.porwal


Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-22 Thread Andrew Hutchings

On 22/04/13 08:09, kuldeep porwal wrote:

Hello,

Andrew sir: Completely missed that. Thanks :) . The checksum (csum) feature was
introduced in libdrizzle last December, I think, so there are not many blog posts on it.


It was around then, yes.  It was blogged about on my own blog.  If you 
are implementing a different way there are some things to note:


1. All events will have a 4-byte CRC32 on the end
2. The CRC32 is not rolling through the log, it starts from 0 for each event
3. The client needs to set a user variable (I forget what it is called 
now) to say it supports checksums, otherwise the connection is rejected
4. The client should assume that connections to 5.6.1 servers onwards 
have checksumming support
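
To make points 1, 2 and 4 concrete, here is a minimal Python sketch (it is not libdrizzle code). It assumes the 4-byte trailer is stored little-endian and that the checksum covers every byte of the event before the trailer; the helper names are made up for illustration. zlib.crc32 starts from 0 on every call, which matches the per-event, non-rolling behaviour described in point 2.

```python
import re
import struct
import zlib

CHECKSUM_LEN = 4  # point 1: a 4-byte CRC32 on the end of every event


def append_checksum(payload: bytes) -> bytes:
    """Return payload plus its CRC32 trailer (little-endian is an assumption)."""
    return payload + struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF)


def verify_event(raw: bytes) -> bool:
    """Check one event on its own; the CRC does not roll across events (point 2)."""
    if len(raw) < CHECKSUM_LEN:
        raise ValueError("event shorter than its checksum trailer")
    payload, trailer = raw[:-CHECKSUM_LEN], raw[-CHECKSUM_LEN:]
    (stored,) = struct.unpack("<I", trailer)
    return (zlib.crc32(payload) & 0xFFFFFFFF) == stored


def server_may_checksum(server_version: str) -> bool:
    """Point 4: assume checksum support from 5.6.1 onwards (hypothetical helper)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", server_version)
    if match is None:
        return False
    return tuple(int(part) for part in match.groups()) >= (5, 6, 1)
```

A reader built this way would call verify_event() on each raw event before handing it on, and skip the check entirely when server_may_checksum() is false.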


Note that I'm not happy about the way the above was implemented in MySQL 
(it was very easy to do in a backwards compatible way) but at least they 
finally have it now.


Kind Regards
--
Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/



Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread kuldeep porwal
Hello,
With respect to my last mail, I have the following modifications to my
proposal.

1. I have looked into *how drizzle replication works* and got the following
information:

For DML we have the GPB-based 'transaction.proto', which is compiled into
its class file. That class solves the low-level problems of serializing,
deserializing, and versioning formatted binary streams of data, so that
raw data can change its structure without having to re-implement all the
serialization and deserialization routines. Drizzle has a ReplicationService
which converts the row change event into a Transaction Message (Transaction
Context, Statement 1...n). This message is then read by the reader plugin, and
the applier plugin replicates it on the slave (subscriber).


*2. To replicate from MySQL to Drizzle (GSoC idea):*

This process is quite simple. We read the MySQL row-based binlog from the
publisher (master), then we *form* the new Transaction Message using
a slightly modified version of the 'transaction.proto' file. This step of
forming the Transaction Message will be CALLED in the *binlog_event of the
libdrizzle-redux binlog API*. Once we have the Transaction Message, the reader
plugin will push it to the publisher plugin on the master, and then the slave
machine (subscriber) will read and replicate it via the Transaction Applier.

The trickiest part is that we are using all the replication modules of
Drizzle, but to take the MySQL binlog and convert it into Drizzle's Transaction
Message we are building a NEW module.

*So at last we have reduced the problem* to a module which converts the
MySQL binlog into Drizzle's Transaction Message. Possible required changes:

1. As MySQL binlogs are row based, we need to change or introduce new
'Blueprints' for data in the proto file (this file will be a modified copy of
the current transaction.proto).
2. Also, the log can be big, so to provide error checking and safe
replication we can add a CHECKSUM.
3. We can use some kind of heuristics to remove DDL-based inconsistencies
between MySQL and Drizzle.

Please verify the proposal.


On Sat, Apr 20, 2013 at 7:42 PM, kuldeep porwal 2591kuld...@gmail.com wrote:

 Hi Sir,

  Start at ways to read the MySQL binlog, which may be part
 of libdrizzle-redux as well as the binlog api code that's been
 sitting around.

 1. Checked the libdrizzle-redux binlog API along with the binary logs for MySQL.
 2. I am now quite familiar with the new binlog API of libdrizzle and have also
 tested the code which gets a binlog file from MySQL (row-based form) and
 prints the events in it.
 3. From the GSoC project point of view:
  3.1 Although RBR is now more widely used and is safe too, we can in fact
 take either replication format, SBR or RBR, to replicate a MySQL database.
  3.2 I can think of two ways to do this:
 a) As I read about Drizzle slave replication, it does
 multi-master replication.
  (on master:) Transaction -> InnoDB -> (on slave:)
  Replication reader -> Replication applier
  Here we can take the MySQL binlogs, form from them logs like the
 InnoDB ones (mentioned above), and then the replication process will be
  the same as Drizzle slave replication.

  b) We can take the binlogs from MySQL and write a new replication
 reader and applier for them. This would be more like the MySQL way, with an
 IO thread and an SQL thread.


 About DDL part: Yes it may cause some problem but that can be solved
 easily by some heuristic and prototyping. I am thinking to work on this
 part later after results :-).

 Please suggest: am I going in the right direction or not? Also, what should I
 write in my GSoC proposal?


 On Thu, Apr 18, 2013 at 1:26 PM, Stewart Smith
 stew...@flamingspork.com wrote:

 kuldeep porwal 2591kuld...@gmail.com writes:
  Thanks for the idea. But row-based replication (RBR) was implemented in
  MySQL 5.1.5, so you cannot replicate using row-based replication from
 any
  MySQL 5.0 or later master to a slave older than MySQL 5.1.5. That
 shouldn't
  be the problem for us, is it?

 Requiring 5.1 or later is fine. Not everything would be able to be
 easily replicated either - after all, DDL could be problematic, as would
 data that is invalid (0th month/year etc).

  I will look into the depth for this and get back to you.

 Start at ways to read the MySQL binlog, which may be part of
 libdrizzle-redux as well as the binlog api code that's been sitting
 around.

 --
 Stewart Smith




 --
 Regards,
 Kuldeep Porwal
 IIIT Hyderabad
 09550605256
 http://web.iiit.ac.in/~kuldeep.porwal




-- 
Regards,
Kuldeep Porwal
IIIT Hyderabad
09550605256
http://web.iiit.ac.in/~kuldeep.porwal


Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread Stewart Smith
kuldeep porwal 2591kuld...@gmail.com writes:
 *2. To Replicate from MySQL to Drizzle (Gsoc Idea): *

 This process is quite simple. We read the MySQL row based binlog from the
 publisher (master), then we *form* the new Transaction Message using
 a slightly modified version of 'transaction.proto' file. This Event of
 forming Transaction Message will be CALLED in the * binlog_event of
 libdrizzle-redux binlog api *. Once we have the Transaction Message, reader
 plugin will push it to publisher plugin on master and then the slave
 machine (subscriber) will read and replicate it via Transaction
 Applier.

I think we should take a slightly different approach. Instead of
changing the Drizzle replication log format, we read the MySQL binlog
and immediately convert into Drizzle transaction messages... this means
that the process of connecting to mysqld and converting to drizzle
transaction message is entirely independent of the drizzle slave apply
code.
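
A rough Python sketch of that separation follows. Everything in it is a hypothetical stand-in: BinlogEvent is not the libdrizzle event type, and TransactionMessage and Statement are plain dataclasses rather than the classes generated from transaction.proto. The point is only that convert() knows nothing about how messages get applied, and the applier is handed in as an ordinary callable.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class BinlogEvent:
    """Hypothetical decoded MySQL row event; kind is insert/update/delete/commit."""
    kind: str
    table: str = ""
    row: tuple = ()


@dataclass
class Statement:
    """Stand-in for one statement inside a Drizzle transaction message."""
    kind: str
    table: str
    row: tuple


@dataclass
class TransactionMessage:
    """Stand-in for a Drizzle transaction message (statements 1...n)."""
    statements: List[Statement] = field(default_factory=list)


def convert(events: Iterable[BinlogEvent]) -> Iterator[TransactionMessage]:
    """Group row events into one message per commit; uncommitted tails are dropped."""
    current = TransactionMessage()
    for event in events:
        if event.kind == "commit":
            if current.statements:
                yield current
            current = TransactionMessage()
        else:
            current.statements.append(Statement(event.kind, event.table, event.row))


def replicate(events: Iterable[BinlogEvent],
              apply: Callable[[TransactionMessage], None]) -> int:
    """Feed converted messages to any applier; returns how many were applied."""
    count = 0
    for message in convert(events):
        apply(message)
        count += 1
    return count
```

Because the applier is just a callable, the same convert() could feed the existing slave apply code or a test harness that only collects messages in a list.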

 The Trickiest part is that we are using all the Replication modules of
 Drizzle but to take MySQL binlog and convert them into Drizzle's Transaction
 message we are building a NEW module.

yes, doing the conversion will be slightly tricky, we can start off easy
and not care about DDL or data types we don't support. i.e. replicate a
workload that would work against Drizzle anyway.

 1. As MySQL binlogs are row based, we need to change or introduce new
 'Blueprints' for data in proto file (this file will be a modified copy of
 current transaction.proto).

I don't think we need this, we just need to parse MySQL binlog and
convert into drizzle transaction.proto, we won't have to change it at all.

 2. Also the log can be big so to provide the error checking and safe
 replication we can add CHECKSUM.
 3. We can use some kind of Heuristics to remove DDL based inconsistencies
 between MySQL and Drizzle.

I wouldn't worry too much about it at this stage, we could attempt the
SQL and just error out if it doesn't apply.
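
As a sketch of that "attempt it and error out" policy, here is a small Python example. It uses an in-memory SQLite database purely as a stand-in for the Drizzle server (an assumption made so the example runs anywhere), and ApplyError and apply_statements are made-up names. There are no heuristics: the first statement the target rejects stops replication with a clear error.

```python
import sqlite3
from typing import Iterable


class ApplyError(Exception):
    """Raised when a replicated statement does not apply on the target."""


def apply_statements(conn: sqlite3.Connection, statements: Iterable[str]) -> int:
    """Try each statement in order; stop at the first one the target rejects."""
    applied = 0
    for sql in statements:
        try:
            conn.execute(sql)
        except sqlite3.Error as exc:
            raise ApplyError(
                f"statement {applied + 1} did not apply: {sql!r}: {exc}"
            ) from exc
        applied += 1
    return applied
```

With SQLite as the target, a MySQL-specific clause such as ENGINE=MyISAM on CREATE TABLE is rejected by the parser, which is the kind of failure this policy simply surfaces rather than works around.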


-- 
Stewart Smith




Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread kuldeep porwal
Hello Sir,

  we just need to parse MySQL binlog and convert into drizzle
transaction.proto, we won't have to change it at all.

Yeah, I was trying to convey the same thing but in a different manner. As I
mentioned:
 1. As MySQL binlogs are row based, we need to change or introduce new
 'Blueprints' for data in proto file (this file will be a modified copy of
 current transaction.proto).

If in future we find some DDL inconsistency or any other issue that
*may* require changing transaction.proto, then we should modify it and keep
the result under a different version. This will help keep our module working
entirely independently, and we won't affect the Drizzle slave or
applier at the same time.

 I wouldn't worry too much about it at this stage, we could attempt the SQL
and just error out if it doesn't apply.
Yeah great! I introduced Checksum and DDL heuristics just as a part of
the proposal. We have to create a basic prototype first, then we will keep on
improving it.

Apart from this if you have any other thing in mind that I should write in
my proposal then please share.


On Sun, Apr 21, 2013 at 9:48 PM, Stewart Smith stew...@flamingspork.com wrote:

 kuldeep porwal 2591kuld...@gmail.com writes:
  *2. To Replicate from MySQL to Drizzle (Gsoc Idea): *
 
  This process is quite simple. We read the MySQL row based binlog from the
  publisher (master), then we *form* the new Transaction Message using
  a slightly modified version of 'transaction.proto' file. This Event of
  forming Transaction Message will be CALLED in the * binlog_event of
  libdrizzle-redux binlog api *. Once we have the Transaction Message, reader
  plugin will push it to publisher plugin on master and then the slave
  machine (subscriber) will read and replicate it via Transaction
  Applier.

 I think we should take a slightly different approach. Instead of
 changing the Drizzle replication log format, we read the MySQL binlog
 and immediately convert into Drizzle transaction messages... this means
 that the process of connecting to mysqld and converting to drizzle
 transaction message is entirely independent of the drizzle slave apply
 code.

  The Trickiest part is that we are using all the Replication modules of
  Drizzle but to take MySQL binlog and convert them into Drizzle's Transaction
  message we are building a NEW module.

 yes, doing the conversion will be slightly tricky, we can start off easy
 and not care about DDL or data types we don't support. i.e. replicate a
 workload that would work against Drizzle anyway.

  1. As MySQL binlogs are row based, we need to change or introduce new
  'Blueprints' for data in proto file (this file will be a modified copy of
  current transaction.proto).

 I don't think we need this, we just need to parse MySQL binlog and
 convert into drizzle transaction.proto, we won't have to change it at all.

  2. Also the log can be big so to provide the error checking and safe
  replication we can add CHECKSUM.
  3. We can use some kind of Heuristics to remove DDL based inconsistencies
  between MySQL and Drizzle.

 I wouldn't worry too much about it at this stage, we could attempt the
 SQL and just error out if it doesn't apply.


 --
 Stewart Smith




-- 
Regards,
Kuldeep Porwal
IIIT Hyderabad
09550605256
http://web.iiit.ac.in/~kuldeep.porwal


Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread Stewart Smith
kuldeep porwal 2591kuld...@gmail.com writes:
 If in future if we find some DDL inconsistency or any other issue that
 *may* require changing transaction.proto then we should modify it and keep
 that under different version. As this will obviously help in entirely
 independent working of our module and we don't affect Drizzle slave or
 applier at the same time.

We should not need to modify transaction.proto at all as it can already
be used to express all DDL and DML changes possible to apply to Drizzle.

 I wouldn't worry too much about it at this stage, we could attempt the SQL
 and just error out if it doesn't apply.
 Yeah great! I introduced Checksum and DDL heuristics just as a part of
 proposal. We have to create basic prototype first then we will keep on
 improving it.

There shouldn't be any place to add in checksum, we can
support/notsupport the MySQL binlog checksum for reading.


-- 
Stewart Smith




Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread Arjen Lentz
Hi Stu, Kuldeep

Possibly a separate task, but have you guys considered supporting the Galera 
replication system for Drizzle?
It may even be possible to have mixed Drizzle and MariaDB in a cluster.

Async repl, even the best possible implementation, still leaves write-scaling 
and failover/resilience/redundancy issues.
Async repl has its place, but with Galera on the scene it's clear that for many 
(if not most) new cases, Galera may be the best architecture, with async repl 
having an auxiliary role for tasks such as geographically displaced slaves (not 
active-active DCs), in-office reporting and such.

In light of that, I'd see more sense in spending time on integrating the Galera 
system into Drizzle than spending new time on async repl jazz.
Just my 2c.

Cheers,
Arjen.


- Original Message -
 From: Stewart Smith stew...@flamingspork.com
 To: kuldeep porwal 2591kuld...@gmail.com
 Cc: drizzle-discuss@lists.launchpad.net
 Sent: Monday, 22 April, 2013 10:28:09 AM
 Subject: Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with 
 Google Protobuffers

 kuldeep porwal 2591kuld...@gmail.com writes:
  If in future if we find some DDL inconsistency or any other issue that
  *may* require changing transaction.proto then we should modify it and keep
  that under different version. As this will obviously help in entirely
  independent working of our module and we don't affect Drizzle slave or
  applier at the same time.
 
 We should not need to modify transaction.proto at all as it can already
 be used to express all DDL and DML changes possible to apply to Drizzle.
 
  I wouldn't worry too much about it at this stage, we could attempt the SQL
  and just error out if it doesn't apply.
  Yeah great! I introduced Checksum and DDL heuristics just as a part of
  proposal. We have to create basic prototype first then we will keep on
  improving it.
 
 There shouldn't be any place to add in checksum, we can
 support/notsupport the MySQL binlog checksum for reading.


-- 
Arjen Lentz, Exec.Director @ Open Query (http://openquery.com)
Australian peace of mind for your MySQL/MariaDB infrastructure.

Follow us at http://openquery.com/blog/  http://twitter.com/openquery




Re: [Drizzle-discuss] GSOC'13 MySQL and drizzle replication with Google Protobuffers

2013-04-21 Thread Andrew Hutchings

On 22/04/13 01:28, Stewart Smith wrote:

kuldeep porwal 2591kuld...@gmail.com writes:

If in future if we find some DDL inconsistency or any other issue that
*may* require changing transaction.proto then we should modify it and keep
that under different version. As this will obviously help in entirely
independent working of our module and we don't affect Drizzle slave or
applier at the same time.


We should not need to modify transaction.proto at all as it can already
be used to express all DDL and DML changes possible to apply to Drizzle.


I wouldn't worry too much about it at this stage, we could attempt the SQL
and just error out if it doesn't apply.
Yeah great! I introduced Checksum and DDL heuristics just as a part of
proposal. We have to create basic prototype first then we will keep on
improving it.


There shouldn't be any place to add in checksum, we can
support/notsupport the MySQL binlog checksum for reading.


And Libdrizzle 5.1 already supports that in its binlog API :)

Kind Regards
--
Andrew Hutchings - LinuxJedi - http://www.linuxjedi.co.uk/
