Re: [Maria-developers] Review of MDEV-4506, parallel replication, part 1

2013-09-23 Thread Kristian Nielsen
Michael Widenius mo...@askmonty.org writes:

 Kristian We _do_ need a full memory barrier here (memory barrier is implied in
 Kristian taking a mutex). Otherwise the compiler or CPU could re-order the
 Kristian setting of the wakeup_subsequent_commits_running flag with the reads
 Kristian and writes done in the list manipulations. This could cause the two
 Kristian threads to corrupt the list as they both manipulate it at the same
 Kristian time.

 This I would like to understand better.
 (It is always nice to know what the CPU/compiler is really doing.)

 In this case the code is looping over a list and just setting
 list->wakeup_subsequent_commits_running= false;

 I don't see how the compiler or CPU could reorder the code to do things
 in a different order (as we are not updating anything other than this
 flag in the loop).

 This is especially true as setting of
 cur->wakeup_subsequent_commits_running= true;
 is done within a mutex and that is the last write to this element of
 the list.

 So there is a barrier between where we set it and where we potentially
 clear it. As the 'clear' may now happen 'any time' (from other threads'
 point of view) I don't see why it needs to be protected.

Right.

I looked into this again. So we do need a memory barrier, I will try to
explain this more below. However, it turns out we already have this memory
barrier. Because in all relevant cases we call wait_for_commit::wakeup()
before we set the wakeup_subsequent_commits_running flag back to false. And
wakeup() takes a mutex, which implies a full memory barrier.

So I now removed the extra redundant locking (after adding comments explaining
why this is ok), and pushed that to 10.0-knielsen.

Let me see if I can explain why the memory barrier is needed. The potential
race (or one of them at least) is as follows:

Thread A is doing wakeup_subsequent_commits2(). Thread B is doing
unregister_wait_for_prior_commit2(). We assume B is on A's list of waiters.

A will do a read of B's next pointer, and a write of B's waiting_for_commit
flag. After that A will clear the wakeup_subsequent_commits_running flag.

B will read the wakeup_subsequent_commits_running flag, and if it is not set,
then it will remove itself from the list and later possibly put itself on the
list of another thread or whatever.

Without a memory barrier, the clearing of the flag might become visible to B
early. B could then remove itself from the list, and perhaps add itself to the
list of another thread. And then the write from A to B->waiting_for_commit
could become visible (which would cause a wrong spurious wakeup of B), or B's
writing of new next pointer could become visible to A's list traversal,
causing A to walk into the wrong list. (The latter problem sounds rather
unlikely, but at least theoretically it is allowed behaviour by compiler/CPU).

But due to the wakeup() call in A we do have a memory barrier between the list
manipulations and the clearing of the flag. And we have another memory barrier
in B (a full mutex lock actually) between checking the flag and manipulating
the list. This prevents the race.
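
The ordering requirement can be illustrated with C++11 atomics. This is only a sketch under assumptions: Waiter, Waitee, wakeup_all() and may_unlink() are hypothetical names, and the real code relies on the barrier implied by the mutex taken in wakeup(), not on std::atomic. The caller is assumed to have set the flag to true (under the waitee's mutex) before the wakeup starts.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified stand-ins for the structures discussed above.
struct Waiter {
  Waiter *next = nullptr;
  bool waiting_for_commit = true;
};

struct Waitee {
  Waiter *head = nullptr;
  std::atomic<bool> wakeup_running{false};
};

// Thread A: walk the list of waiters, then clear the flag.
// The release store keeps the list reads and writes above it from being
// reordered after the clearing of the flag.
int wakeup_all(Waitee &w) {
  int woken = 0;
  for (Waiter *cur = w.head; cur != nullptr;) {
    Waiter *next = cur->next;         // read of the waiter's next pointer
    cur->waiting_for_commit = false;  // write of the waiter's flag
    cur = next;
    ++woken;
  }
  w.head = nullptr;
  w.wakeup_running.store(false, std::memory_order_release);
  return woken;
}

// Thread B: may only touch the list if no wakeup is running.
// The acquire load keeps the list manipulation that follows it from being
// reordered before the check of the flag.
bool may_unlink(const Waitee &w) {
  return !w.wakeup_running.load(std::memory_order_acquire);
}
```

If B observes the flag as false through the acquire load, everything A did before the release store is guaranteed to be visible to B, which is the property the two mutex-implied barriers provide in the actual code.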

 Kristian What we need to ensure is that

 Kristian  - All the reads of the list in thread 1 see only writes from other
 Kristian    threads that happened _before_ the variable was reset.

 Kristian  - All changes to the list in thread 2 happen only _after_ the
 Kristian    variable was reset in thread 1.

 Kristian So a memory barrier is needed, but as you say, lock/unlock should
 Kristian not be needed.

 I still don't understand the current code fully.

 The issue is that unregister_wait_for_prior_commit2() can be called by
 thread 2 just before or just after thread 1 is resetting
 wakeup_subsequent_commits_running.  There are no other variables
 involved.

 This means that in unregister_wait_for_prior_commit2() if thread 2 is there
 just before list->wakeup_subsequent_commits_running is set to false, we will
 go into this code:

 if (loc_waitee->wakeup_subsequent_commits_running)
 {
   /*
     When a wakeup is running, we cannot safely remove ourselves from the
     list without corrupting it. Instead we can just wait, as wakeup is
     already in progress and will thus be immediate.

     See comments on wakeup_subsequent_commits2() for more details.
   */
   mysql_mutex_unlock(&loc_waitee->LOCK_wait_commit);
   while (waiting_for_commit)
     mysql_cond_wait(&COND_wait_commit, &LOCK_wait_commit);
 }

 I don't however see any code in :queue_for_group_commit() that will
 signal COND_wait_commit.  What code do we have that will wake up the above
 code ?

In queue_for_group_commit() we call waiter->wakeup(), which signals
COND_wait_commit.

 Looking at wait_for_commit::wakeup() that makes this:

   mysql_mutex_lock(&LOCK_wait_commit);
   waiting_for_commit= false;
   mysql_cond_signal(&COND_wait_commit);
   mysql_mutex_unlock(&LOCK_wait_commit);

 However, in
 wait_for_commit::register_wait_for_prior_commit()

 We set 

Re: [Maria-developers] Galera-10.0 still does not work with rollbacks

2013-09-23 Thread Jan Lindström

  
  
Hi,
  
  Full unedited log using --valgrind and --mysqld=--debug=+d at
  location: https://www.dropbox.com/s/dcufp7l1cch8dfs/mysql.err.gz
  
  In below lines I have:
  
  static int binlog_commit(handlerton *hton, THD *thd, bool all)
  {
    int error= 0;
    DBUG_ENTER("binlog_commit");
    binlog_cache_mngr *const cache_mngr=
      (binlog_cache_mngr*) thd_get_ha_data(thd, binlog_hton);
  #ifdef WITH_WSREP
    if (!cache_mngr) DBUG_RETURN(0);
  #endif /* WITH_WSREP */
  
      WSREP_ERROR("::jan_BINLOG_COMMIT:: TRX: Pending %p pos %llu, "
                  "pos_in_file %llu current_pos %llu request_pos %llu "
                  "write_pos %llu",
                  cache_mngr->trx_cache.pending(),
                  cache_mngr->trx_cache.cache_log.pos_in_file,
                  cache_mngr->trx_cache.cache_log.current_pos,
                  cache_mngr->trx_cache.cache_log.request_pos,
                  cache_mngr->trx_cache.cache_log.write_pos);
      WSREP_ERROR("::jan_COMMIT:: STMT: STMT: Pending %p pos %llu, "
                  "pos_in_file %llu current_pos %llu request_pos %llu "
                  "write_pos %llu",
                  cache_mngr->stmt_cache.pending(),
                  cache_mngr->stmt_cache.cache_log.pos_in_file,
                  cache_mngr->stmt_cache.cache_log.current_pos,
                  cache_mngr->stmt_cache.cache_log.request_pos,
                  cache_mngr->stmt_cache.cache_log.write_pos);
  
  
  And that macro is
  
  // MySQL logging functions don't seem to understand long long length modifier.
  // This is a workaround. It also prefixes all messages with "WSREP"
  #define WSREP_LOG(fun, ...)                                       \
      {                                                             \
      char msg[1024] = {'\0'};                                      \
      snprintf(msg, sizeof(msg) - 1, ## __VA_ARGS__);               \
      fun("WSREP: %s", msg);                                        \
      }

  #define WSREP_DEBUG(...)                                          \
      if (wsrep_debug) WSREP_LOG(sql_print_information, ##__VA_ARGS__)
  #define WSREP_INFO(...)  WSREP_LOG(sql_print_information, ##__VA_ARGS__)
  #define WSREP_WARN(...)  WSREP_LOG(sql_print_warning, ##__VA_ARGS__)
  #define WSREP_ERROR(...) WSREP_LOG(sql_print_error, ##__VA_ARGS__)
  
  Thus, it seems that trx_cache and stmt_cache are not initialized.
  Where should that happen?
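
  One thing that can be checked independently of the cache initialization: as quoted, each WSREP_ERROR format string contains six conversions (%p followed by five %llu) while only five arguments follow it. If the call really looks like that (the mail formatting may have dropped a line), snprintf() would read one variadic argument that was never passed, which could by itself make valgrind report an uninitialised value inside vfprintf. A small sketch for checking such calls; count_conversions(), kTrxFormat and kTrxArgs are hypothetical names, not part of wsrep or MariaDB:

```cpp
#include <cassert>
#include <cstddef>

// Counts printf-style conversions in a format string: every '%' that is
// not part of a literal "%%". A '*' width or precision, which consumes an
// extra argument, is not handled by this sketch.
std::size_t count_conversions(const char *fmt) {
  std::size_t n = 0;
  for (const char *p = fmt; *p != '\0'; ++p) {
    if (*p != '%') continue;
    if (p[1] == '%') { ++p; continue; }  // literal percent sign
    ++n;
  }
  return n;
}

// The TRX format string as quoted in the mail above.
const char *const kTrxFormat =
    "::jan_BINLOG_COMMIT:: TRX: Pending %p pos %llu, pos_in_file %llu "
    "current_pos %llu request_pos %llu write_pos %llu";

// Number of arguments passed after the format string in the quoted call.
const std::size_t kTrxArgs = 5;
```

  Comparing count_conversions(kTrxFormat) with kTrxArgs shows the mismatch; compilers can perform the same check when the logging function carries a printf-format attribute.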
  


  Jan Lindström jplin...@mariadb.org writes:


  
Correct log file attached.

  
  

  
==00:00:02:33.708 25289== Use of uninitialised value of size 8
==00:00:02:33.708 25289==    at 0x5EF865B: _itoa_word (_itoa.c:179)
==00:00:02:33.708 25289==    by 0x5EFCB91: vfprintf (vfprintf.c:1654)
==00:00:02:33.708 25289==    by 0x5F22654: vsnprintf (vsnprintf.c:119)
==00:00:02:33.708 25289==    by 0x5F03141: snprintf (snprintf.c:34)
==00:00:02:33.708 25289==    by 0x936453: binlog_commit(handlerton*, THD*, bool) (log.cc:2054)
==00:00:02:33.708 25289==    by 0x86B3DF: commit_one_phase_2(THD*, bool, THD_TRANS*, bool) (handler.cc:1516)
==00:00:02:33.708 25289==    by 0x86B13B: ha_commit_trans(THD*, bool) (handler.cc:1436)
==00:00:02:33.709 25289==    by 0x7A0DD9: trans_commit_stmt(THD*) (transaction.cc:363)
==00:00:02:33.709 25289==    by 0x673DEB: mysql_execute_command(THD*) (sql_parse.cc:5373)
==00:00:02:33.709 25289==    by 0x677D27: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6820)
==00:00:02:33.709 25289==    by 0x6774A8: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6651)
==00:00:02:33.709 25289==    by 0x6688A3: dispatch_command(enum_server_command, THD*, char*, unsigned int) (sql_parse.cc:1454)
==00:00:02:33.709 25289== 
==00:00:02:33.709 25289== Conditional jump or move depends on uninitialised value(s)
==00:00:02:33.709 25289==    at 0x5EF8665: _itoa_word (_itoa.c:179)
==00:00:02:33.709 25289==    by 0x5EFCB91: vfprintf (vfprintf.c:1654)
==00:00:02:33.709 25289==    by 0x5F22654: vsnprintf (vsnprintf.c:119)
==00:00:02:33.709 25289==    by 0x5F03141: snprintf (snprintf.c:34)
==00:00:02:33.709 25289==    by 0x936453: binlog_commit(handlerton*, THD*, bool) (log.cc:2054)
==00:00:02:33.709 25289==    by 0x86B3DF: commit_one_phase_2(THD*, bool, THD_TRANS*, bool) (handler.cc:1516)
==00:00:02:33.709 25289==    by 0x86B13B: ha_commit_trans(THD*, bool) (handler.cc:1436)
==00:00:02:33.709 25289==    by 0x7A0DD9: trans_commit_stmt(THD*) (transaction.cc:363)
==00:00:02:33.709 25289==    by 0x673DEB: mysql_execute_command(THD*) (sql_parse.cc:5373)
==00:00:02:33.709 25289==    by 0x677D27: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6820)
==00:00:02:33.709 25289==    by 0x6774A8: wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:6651)
==00:00:02:33.709 25289==    by 0x6688A3: dispatch_command(enum_server_command, THD*, char*, 

Re: [Maria-developers] Review of MDEV-4506, parallel replication, part 1

2013-09-23 Thread Kristian Nielsen
Michael Widenius mo...@askmonty.org writes:

 What should happen if you kill a replication thread is that
 replication should stop for that master.

 Kristian This needs more thought, I think ... certainly something looks not
 Kristian right.

 After looking at the full code, I think that the logical way things
 should work is:

 'stop' is to be used when you want to nicely take down replication.
 This means that the current commit groups should be given time to
 finish.

 thd->killed should mean that we should stop ASAP.
 - All uncommitted things should abort.

 This is needed in a 'panic shutdown' (like soon-out-of-power) or
 when trying to kill the replication thread when one notices that
 something went horribly wrong (like ALTER TABLE stopping replication).

Ok, so I thought more about this.

There are two kinds of threads involved. One is the normal SQL thread (this is
the only thread involved in non-parallel replication). There is a one-to-one
correspondence between an SQL thread and the associated master connection.

It seems to make sense that KILL CONNECTION on the SQL thread would stop the
replication. I do not know whether this is how it works or not in current
replication, but whatever it is, we should just leave the behaviour the same.

The other kind of thread is the parallel replication worker thread. We have a
pool of those (the size of the pool specified by --slave-parallel-threads).
These worker threads are not associated with any particular master
connection. They are assigned to execute an event group at a time for
whichever master connection is in need of a new thread.

So a KILL QUERY or KILL CONNECTION on a worker thread should abort the
currently executing event - and it will, I assume, using the existing code for
killing a running query operation. This will fail the execution of the query,
and thus eventually stop the associated SQL thread and master connection, once
the error is propagated out of the worker thread up to the parent SQL thread.

But KILL of a worker thread should _not_ cause the thread to exit, I
think. The pool of parallel threads is just a thread pool, and for simplicity
in the first version, it has a fixed size. A KILL will disconnect the thread
from the replication connection it was servicing, but the thread must remain
alive, ready to serve another connection if needed.

So rpt->stop is only about maintaining the thread pool for parallel
replication, not about anything to do with aborting any specific master
connection replication. In rpl_parallel_change_thread_count() we set rpt->stop
to ask all existing threads to exit, and afterwards spawn a new set of
threads. This can only happen when all replication SQL/IO threads are stopped
and no event execution is taking place.

So this code of mine is actually wrong, I think:

  while (!rpt->stop && !thd->killed)

It should just be while (!rpt->stop) { ... }.
And thd->killed should not be used at all to control thread
termination. Instead, it should be used (and is not currently, needs to be
fixed) in various places for event execution to allow aborting the currently
executing event. This is needed around rpt_handle_event() I think, and also
around queue_for_group_commit().
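
The intended split between the two flags can be sketched roughly as follows. This is a single-threaded simulation under assumptions, not the actual rpl_parallel.cc code: WorkerSketch, EventSketch, execute_event() and worker_loop() are hypothetical names, and in the real server an aborted event would also propagate an error up to the owning SQL thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical outcome of running one event in a pool worker.
enum class EventResult { Ok, Aborted };

// `stop` mirrors rpt->stop and `killed` mirrors thd->killed.
struct WorkerSketch {
  bool stop = false;
  bool killed = false;
};

// One queued event, plus what happens to arrive while it is being handled.
struct EventSketch {
  bool kill_arrives;  // a KILL hits the worker during this event
  bool stop_arrives;  // the pool asks the worker to exit after this event
};

// Executes one event unless a kill is pending. The kill flag is consumed
// here, so it aborts this event only and never ends the worker itself.
EventResult execute_event(WorkerSketch &w) {
  if (w.killed) {
    w.killed = false;
    return EventResult::Aborted;
  }
  return EventResult::Ok;
}

// Worker loop: only `stop` terminates it, never `killed`.
std::vector<EventResult> worker_loop(WorkerSketch &w,
                                     const std::vector<EventSketch> &queue) {
  std::vector<EventResult> results;
  std::size_t i = 0;
  while (!w.stop && i < queue.size()) {
    const EventSketch &ev = queue[i++];
    if (ev.kill_arrives) w.killed = true;
    results.push_back(execute_event(w));
    if (ev.stop_arrives) w.stop = true;
  }
  return results;
}
```

With a queue where the second event is killed and the third is followed by a stop request, the loop yields Ok, Aborted, Ok and then exits, leaving the fourth event unprocessed.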

For normal stop, this is handled mostly in the SQL thread. It waits with a
timeout for the current event group to finish normally, then does a hard kill
if needed. This needs to be extended to handle running things in a worker
thread, of course. The code is around sql_slave_killed(), I think.

Does that make sense?

 +  mysql_mutex_lock(&LOCK_wait_commit);
 +  waiting_for_commit= false;
 +  mysql_cond_signal(&COND_wait_commit);
 +  mysql_mutex_unlock(&LOCK_wait_commit);
 +}
 
 In this particular case it looks safe to move the cond signal out of
 the mutex. I don't see how anyone could miss the signal.

 Kristian Probably. Generally I prefer to write it as above, to not have to
 Kristian make sure that moving it out is safe.

 I agree that this is the way to do it in general.
 However for cases where we are waiting almost once per query, we need
 to make things faster as this extra wakeup will take notable resources!

Ok, no problem, I've changed it.

I should benchmark this, it keeps coming up. It's not clear to me which way
would be the fastest, seems it would depend on the implementation.
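
For reference, the two variants being compared look roughly like this when written with the C++ standard library instead of the mysql_mutex_* / mysql_cond_* wrappers. This is a sketch under assumptions: WaitSketch and run_once() are hypothetical names, and nothing here says which variant is faster.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for wait_for_commit.
struct WaitSketch {
  std::mutex lock;
  std::condition_variable cond;
  bool waiting_for_commit = true;

  // Variant 1: signal while still holding the mutex.
  void wakeup_signal_inside() {
    std::lock_guard<std::mutex> guard(lock);
    waiting_for_commit = false;
    cond.notify_one();
  }

  // Variant 2: change the flag under the mutex, signal after unlocking.
  void wakeup_signal_outside() {
    {
      std::lock_guard<std::mutex> guard(lock);
      waiting_for_commit = false;
    }
    cond.notify_one();
  }

  // The waiter re-checks the flag under the mutex, so it cannot miss the
  // wakeup in either variant, even if the signal comes before the wait.
  void wait() {
    std::unique_lock<std::mutex> guard(lock);
    cond.wait(guard, [this] { return !waiting_for_commit; });
  }
};

// Runs one waiter thread against the chosen wakeup variant and reports
// whether the waiter returned and the flag ended up cleared.
bool run_once(bool signal_outside) {
  WaitSketch w;
  bool seen = false;
  std::thread waiter([&] { w.wait(); seen = true; });
  if (signal_outside) w.wakeup_signal_outside();
  else w.wakeup_signal_inside();
  waiter.join();
  return seen && !w.waiting_for_commit;
}
```

One thing to watch with the second variant is object lifetime: once the flag is cleared and the mutex released, the waiter may proceed and must not destroy the condition variable while the signalling thread is still about to call notify_one().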


 Hi!
 
 Part 2 of review of parallel replication
 
 === added file 'sql/rpl_parallel.cc'
 
 cut
 
 +pthread_handler_t
 +handle_rpl_parallel_thread(void *arg)
 +{
 
 cut
 
 +  if (end_of_group)
 +  {
 +in_event_group= false;
 +
 
 Add comment:
 
 /*
   All events for this group have now been executed and logged (but not
   necessarily synced).
   Inform the other event groups that are waiting for this thread that
   we are done.
 */
 
 +rgi->commit_orderer.unregister_wait_for_prior_commit();
 +thd->wait_for_commit_ptr= NULL;
 +
 +/*
 +  Record that we have finished, so other event groups will no
 +  longer attempt to 

Re: [Maria-developers] [Roles] Final Status

2013-09-23 Thread Sergei Golubchik
Hi, Vicentiu!

On Sep 21, Vicentiu Ciorbaru wrote:
  Even though I didn't manage to implement everything yet, I want to
  finish this project and get it ready for an upstream push, outside
  of GSoC.
 
  1. I'd need either from you in email (but then cc: it to
  maria-developers) or in your blog (or both) a statement about the
  license. A simple one would be to say that your code is available
  to us under BSD license. Another option is our MariaDB Contributor
  Agreement: https://mariadb.com/kb/en/mca/
 
 I hereby declare that all my work done so far (and future) by me on
 the Roles project, is released under the BSD licence.
 
 I will also make a statement on my blog in my wrap up post a bit later.
 
 If there are any more clarifications needed, let me know!

You mean the New BSD License, right? The original BSD License is not
compatible with GPL.

http://en.wikipedia.org/wiki/BSD_licenses

Regards,
Sergei


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] [Spatial] On current implementation approach

2013-09-23 Thread Mateusz Loskot
Hi,

I'm going to ask a question about how the current Spatial Extensions are
implemented.
I have spent some time reading the source code in the current trunk
(spatial.h|cc, gcal*.h|cc, related Field and Item definitions, etc.),
so I have a rough understanding of the overall structure, how the
geometry data types are implemented and exposed to SQL, how the
spatial functions are defined and registered. I did not look into
the details of the implementation of the geospatial algorithms, but that's
too low level for the question I'm going to ask.

Initially, I was going to ask a very detailed question, listing all the
relevant code definitions and asking separately about each of them,
but that's not necessary at this stage, I think.

Instead, I'm going to simplify and ask about the bigger picture, more
about MariaDB extensions API:

1. Is it possible to implement MariaDB extensions like Spatial (custom
type + set of functions) without such a tight coupling with the
internal implementation of the type system (without messing Field
class with geometry types directly, etc.)?

2. Is it possible to implement Spatial using User-Defined Functions
(UDF) defined in a shared binary?

3. What is the reason behind using a Well-Known-Binary (WKB) stream of
bytes to transport geometry values into/from functions? Is it due to
limitations of the MariaDB type system, where String is the only universal
carrier for complex data? This concern is related to the necessity of
encoding/decoding WKB when chaining spatial function calls, and
possibilities to avoid it.

Best regards,
-- 
Mateusz  Loskot, http://mateusz.loskot.net
Participation in this whole process is a form of torture ~~ Szalony



Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Michael Widenius

Hi!

 Alexander == Alexander Barkov b...@mariadb.org writes:

Alexander Hi Monty,
Alexander On 09/17/2013 08:12 PM, Michael Widenius wrote:

cut

Alexander I added a reference to http://en.wikipedia.org/wiki/Base64,
Alexander and also added your comments in the proper places in the code.
 
 Thanks. That will make the code much easier to understand.
 
 I just read through the definition and noticed that some versions
 don't use '=' padding.
 
 Should we also accept input without '=' padding in our decoder?
 (I think it's best to always pad on encoding).

Alexander http://dev.mysql.com/doc/refman/5.6/en/string-functions.html#function_to-base64
Alexander says:

Alexander   Different base-64 encoding schemes exist. These are the encoding
Alexander   and decoding rules used by TO_BASE64() and FROM_BASE64():
Alexander   ...
Alexander   * Encoded output consists of groups of 4 printable characters.
Alexander   Each 3 bytes of the input data are encoded using 4 characters. If
Alexander   the last group is incomplete, it is padded with '=' characters to
Alexander   a length of 4.

Alexander So we always pad on encoding.

This I think is correct.

Alexander So do most of the modern pieces of software.

Alexander So does PHP:
Alexander http://php.net/manual/en/function.base64-encode.php

Alexander So does the Linux base64 program:

Alexander $ echo 123 |base64
Alexander MTIzCg==

Alexander This is why we always require pad characters on decoding.

This I don't necessarily agree with.

One point of this function is to allow reading of coded text from
other programs, not only from mysqld. As some other programs are not
using padding, why would we not allow them?

Alexander I don't think we should accept not-padded values.

I still don't understand why.
What is the disadvantage in that?

The only benefit I see of requiring padding on decoding is that we
have a 30% chance to find out if a message was 'not complete'.

The disadvantage is that we would have to somewhere document
that we don't support base64 coding from a set of programs.

What should we tell the users of these programs when/if they complain
that they can't use MariaDB to read their data?

cut

Regards,
Monty



Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Roberto Spadim
hi guys!
i used many base64 libs when i was developing for android, windows ce and
windows phone
some base64 decoders don't work without == or = at the end of the base64
encoded string
at that time i made a workaround, adding = at the end based on the length()
of the string

well direct to the point...
base64_decode isn't ONE function, it could be implemented with many
standards... check wikipedia for example:

1)Original Base64 for Privacy-Enhanced Mail (PEM) (RFC 1421, deprecated)
2)Base64 transfer encoding for MIME (RFC 2045)
3)Standard 'Base64' encoding for RFC 3548 or RFC 4648
4)'Radix-64' encoding for OpenPGP (RFC 4880)
5)Modified Base64 encoding for UTF-7 (RFC 1642, obsoleted)
6)Modified Base64 for filenames (non standard)
7)Base64 with URL and Filename Safe Alphabet (RFC 4648 'base64url' encoding)
8)Non-standard URL-safe Modification of Base64 used in YUI Library (Y64)
9)Modified Base64 for XML name tokens (Nmtoken)
10)Modified Base64 for XML identifiers (Name)
11)Modified Base64 for Program identifiers (variant 1, non standard)
12)Modified Base64 for Program identifiers (variant 2, non standard)
13)Modified Base64 for Regular expressions (non standard)

I think the most used standard is rfc2045 (at least php use it and i think
perl and others script languages too), for rfc2045 we have:

Variant: Base64 transfer encoding for MIME (RFC 2045)
Char for index 62: +
Char for index 63: /
pad char: = (mandatory)
Fixed encoded line-length: No (variable)
Maximum encoded line length: 76
Line separators: CR+LF
Characters outside alphabet: Accepted (discarded)
Line checksum: (none)

in other words... TO_BASE64 and FROM_BASE64 aren't functions with only
one parameter... i think we could allow two parameters, where the second
parameter is the standard, defaulting to RFC-2045, and we could accept: RFC1421,
RFC2045, (RFC3548/RFC4648), RFC4880, RFC1642, filenames, RFC4648, Y64,
NMTOKEN, name, variant1, variant2, regexp

what do you think?
well with this we can report errors, warnings and other messages to help
the user select the right one
i had a problem with wince5 and php, the problem was the char at position
62: one function was using - and the other was using +. well i think
these 13 variants of the base64 function could explain the problem to everyone,
right?
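
to make the idea a bit more concrete, a variant table could look something like this. it is only a sketch, the names Base64Variant, kVariants and find_variant() are made up and not from the patch. only two variants are filled in: RFC 2045 with the values listed above, and the RFC 4648 URL-safe alphabet (which uses - and _ and allows the padding to be left out).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical descriptor for one base64 variant.
struct Base64Variant {
  const char *name;
  char char62;         // character used for index 62
  char char63;         // character used for index 63
  bool pad_mandatory;  // whether '=' padding is required
};

// Two example entries; the remaining variants from the list would be
// added in the same way.
static const Base64Variant kVariants[] = {
    {"RFC2045", '+', '/', true},
    {"RFC4648URL", '-', '_', false},
};

// Returns the variant with the given name, or nullptr if it is unknown,
// so that a caller could raise an error or warning for bad input.
const Base64Variant *find_variant(const char *name) {
  for (const Base64Variant &v : kVariants)
    if (std::strcmp(v.name, name) == 0) return &v;
  return nullptr;
}
```

an unknown name gives nullptr, which is where the error or warning for the user would come from.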

thanks guys!


2013/9/23 Michael Widenius mo...@askmonty.org


 Hi!

  Alexander == Alexander Barkov b...@mariadb.org writes:

 cut

  Should we allow not '=' padding in our decoder too?
  (I think it's best to always pad on encoding).

 cut

 Alexander Just checked PostgreSQL. They also pad on encode,
 Alexander and do not accept non-properly padded values on decode:

 cut

 How about all the programs that don't pad with '='?
 There was a list of these in the wiki link you gave me ...

 Regards,
 Monty





-- 
Roberto Spadim
SPAEmpresarial


Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Roberto Spadim
thanks alexander! :D i will read and test it as soon as possible :)


2013/9/23 Alexander Barkov b...@mariadb.org

 Hi Roberto,

 the patch has been pushed to the 10.0 tree:

 lp:~maria-captains/maria/10.0


 See here for more details about the 10.0 tree.
 https://launchpad.net/maria/10.0


 On 09/17/2013 07:42 PM, Roberto Spadim wrote:

 nice =)
 after this patch is committed, could you report where i can get it? i will try
 to rewrite it for base32 if possible :) and send it to jira
 thanks!


 2013/9/17 Alexander Barkov b...@mariadb.org


 Hello Roberto,



 On 09/17/2013 06:38 PM, Roberto Spadim wrote:

 hi guys!
 one question, as a mariadb user...
 base64 will be exposed to sql layer? could i use TO_BASE64() and
 FROM_BASE64() at mysql client?
 there's a MDEV in JIRA about it, if you could attach it to this patch
 could be nice: MDEV-4387
 https://mariadb.atlassian.net/browse/MDEV-4387

 and with time, implement base32 and base16: MDEV-4800
 https://mariadb.atlassian.net/browse/MDEV-4800
 


 base64 is very useful, and many php users use it to send/receive
 files from database
 base32 from what i know is useful only for OTP and other password apps
 base16 is a hexadecimal function HEX() and UNHEX()


 Yes, TO_BASE64() and FROM_BASE64() will be exposed to the SQL level,
 so one can use for example things like this:

 INSERT INTO t1 VALUES (FROM_BASE64('base64-string'));


 or

 SELECT TO_BASE64(column) FROM t1;


 There are no plans to implement base32.


 thanks guys


 2013/9/17 Alexander Barkov b...@mariadb.org


  Hi Monty,

  thanks for review.

  I have addressed most of your suggestions. See the new
 version attached,
  and the detailed comments inline:



  On 09/12/2013 04:32 PM, Michael Widenius wrote:


  Hi!

  Here is the review for the code that we should put into
 10.0

  First the base64:

  === modified file 'mysql-test/t/func_str.test'
  --- mysql-test/t/func_str.test  2013-05-07 11:05:09 +
  +++ mysql-test/t/func_str.test  2013-08-28 13:14:24 +
  @@ -1555,3 +1555,118 @@ drop table t1,t2;
     --echo # End of 5.5 tests
     --echo #

  +
  +--echo #
  +--echo # Start of 5.6 tests
  +--echo #


  Shouldn't this be start of 10.0 tests ?
  (I know that this code is from MySQL 5.6, but still for us this
  is 10.0...)


  This code is (almost) copy-and-paste from MySQL-5.6.
  I think it's a good idea when looking inside a test file
  to be able to see which tests are coming from MySQL-5.6,
  and which tests are coming from MariaDB-10.0.

  I'd suggest to keep Start of 5.6 tests in this particular case,
  and also when merging the tests for the other merged MySQL-5.6 features.




  === modified file 'mysys/base64.c'
  --- mysys/base64.c  2011-06-30 15:46:53 +
  +++ mysys/base64.c  2013-03-09 06:22:59 +
  @@ -1,5 +1,4 @@
  -/* Copyright (c) 2003-2008 MySQL AB, 2009 Sun Microsystems, Inc.
  -   Use is subject to license terms.
  +/* Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.


  Removed 'all rights reserved'. You can't have that for GPL code.

  If there is any new code from us, please add a copyright message
  for the MariaDB foundation too!


  Removed 'all rights reserved' and added MariaDB, as there are some
  changes of our own.




     This program is free software; you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
  @@ -25,6 +24,28 @@ static char base64_table[] =
 ABCDEFGHIJ

 abcdefghijklmnopqrstuvwxyz
  

Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Alexander Barkov

Hi Monty,


On 09/23/2013 07:16 PM, Michael Widenius wrote:


Hi!


Alexander == Alexander Barkov b...@mariadb.org writes:


cut


Should we allow not '=' padding in our decoder too?
(I think it's best to always pad on encoding).


cut

Alexander Just checked PostgreSQL. They also pad on encode,
Alexander and do not accept non-properly padded values on decode:

cut

How about all the programs that don't pad with '='?
There was a list of these in the wiki link you gave me ...


It was a list of the various variants of base64 that have existed over
time:


http://en.wikipedia.org/wiki/Base64

I believe the modern base64 is associated with:
- Char for index 62 = +
- Char for index 63 = /
- Pad character = mandatory

Look at the table. All variants that have + and /
for Char for index 62 and Char for index 63
also have Pad character = mandatory.

The only exception is:
Modified Base64 encoding for UTF-7 (RFC 1642, obsoleted).
But it was obsoleted a long time ago!


For those who want the ancient modified Base64 for UTF-7,
there is a simple workaround:

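mysql> set @a:='aa';
mysql> select from_base64(rpad(@a,cast(length(@a)/4 as signed)*4,'='));

+----------------------------------------------------------+
| from_base64(rpad(@a,cast(length(@a)/4 as signed)*4,'=')) |
+----------------------------------------------------------+
| i                                                        |
+----------------------------------------------------------+
1 row in set (0.00 sec)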
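
The same padding step can be written with integer arithmetic. This is only a sketch of the idea behind the workaround, not code from the patch; pad_base64() is a hypothetical helper. It appends '=' until the length is a multiple of 4 and rejects lengths with a remainder of 1, since a single leftover character carries only 6 bits and cannot encode a whole byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends '=' until the length is a multiple of 4.
// Returns false (leaving the string unchanged) when length % 4 == 1,
// because no amount of padding makes such input valid.
bool pad_base64(std::string &s) {
  const std::size_t rem = s.size() % 4;
  if (rem == 1) return false;
  s.append((4 - rem) % 4, '=');
  return true;
}
```

Applied to 'aa' this gives 'aa==', the value that is passed to from_base64() in the workaround.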





Regards,
Monty





Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Roberto Spadim
Sorry if i'm posting too much, it's my last post for this ...

int
base64_decode_max_arg_length()
{
#if (SIZEOF_INT == 8)
  return 0x7FFFLL;
#else
  return 0x7FFF;
#endif
}

if we return LL (longlong) from an int function, will this cause
overflow/warnings at gcc? will the long long be returned with the right
value?

int
main(void)
{

base64.c is a binary program too? could we compile it as mariadb_base64 ?
this would help debugging, with something like ENABLE-BASE64-BINARY, or some
README file showing how to do this


well :) that's all :) at the weekend i will report if i find something strange or
bugs :)
bye


Re: [Maria-developers] [Roles] Final Status

2013-09-23 Thread Vicentiu Ciorbaru
Hi Sergei!

On Mon, Sep 23, 2013 at 5:23 PM, Sergei Golubchik s...@mariadb.org wrote:
 Hi, Vicentiu!

 On Sep 21, Vicentiu Ciorbaru wrote:
  Even though I didn't manage to implement everything yet, I want to
  finish this project and get it ready for an upstream push, outside
  of GSoC.

  1. I'd need either from you in email (but then cc: it to
  maria-developers) or in your blog (or both) a statement about the
  license. A simple one would be to say that your code is available
  to us under BSD license. Another option is our MariaDB Contributor
  Agreement: https://mariadb.com/kb/en/mca/

 I hereby declare that all my work done so far (and future) by me on
 the Roles project, is released under the BSD licence.

 I will also make a statement on my blog in my wrap up post a bit later.

 If there are any more clarifications needed, let me know!

 You mean the New BSD License, right? The original BSD License is not
 compatible with GPL.

 http://en.wikipedia.org/wiki/BSD_licenses

Yes! I've read about the BSD licence before sending the first email,
but I missed that nuance.

The 3-clause BSD licence is what I'm releasing my code under.

Regards,
Vicențiu



Re: [Maria-developers] Review of base64.diff and german2.diff

2013-09-23 Thread Alexander Barkov


On 09/23/2013 09:25 PM, Roberto Spadim wrote:

Sorry if i'm posting too much, it's my last post for this ...

int
base64_decode_max_arg_length()
{
#if (SIZEOF_INT == 8)
   return  0x7FFFLL;
#else
   return  0x7FFF;
#endif
}


Oops. Some leftovers from moving from size_t to int
(to avoid changes in the replication code which relies on int).

SIZEOF_INT can never actually be 8.
So perhaps the #if is not really needed.
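
Without the #if, the function could be reduced to something like the sketch below. It assumes that the intended limit is simply the largest value an int can hold; the function name carries a _sketch suffix to make clear it is a hypothetical replacement, not the code in the patch.

```cpp
#include <cassert>
#include <climits>

// Hypothetical replacement: the largest argument length representable in
// the int return type, with no preprocessor branch on SIZEOF_INT.
int base64_decode_max_arg_length_sketch() {
  return INT_MAX;
}
```

On the question about the LL suffix: returning a long long constant from a function declared int converts the value to int, so it is preserved only if it fits into an int.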




if we return LL (longlong) from an int function, will this cause
overflow/warnings at gcc? will the long long be returned with the right
value?

int
main(void)
{

is base64.c a binary program too? could we compile it as mariadb_base64?
this would help debugging; something like ENABLE-BASE64-BINARY, or a README
file showing how to do this


It is not a binary program.
I guess the original author of this file used this main() for test 
purposes. main() could probably be removed.






well :) that's all :) at the weekend i'll report if i find something strange
or any bugs :)
bye





[Maria-developers] The Usability of Free/Libre/Open Source Projects, A Review

2013-09-23 Thread Jan Lindström

  
  
In

http://www.ijcit.com/archives/volume2/issue5/Paper020519.pdf

for some reason MySQL/MariaDB is not mentioned.

R:
--
Jan Lindström
Principal Engineer
MariaDB | MaxScale | skype: jan_p_lindstrom
www.skysql.com
  

  



Re: [Maria-developers] Review of MDEV-4506, parallel replication, part 1

2013-09-23 Thread Michael Widenius

Hi!

 Kristian == Kristian Nielsen kniel...@knielsen-hq.org writes:

cut

 So there is a barrier between we set it and potentially clear it.
 As the 'clear' may now happen 'any time' (from other threads point of
 view) I don't see why it needs to be protected.

Kristian Right.

Kristian I looked into this again. So we do need a memory barrier, I will try to
Kristian explain this more below. However, it turns out we already have this memory
Kristian barrier. Because in all relevant cases we call wait_for_commit::wakeup()
Kristian before we set the wakeup_subsequent_commits_running flag back to false. And
Kristian wakeup() takes a mutex, which implies a full memory barrier.

Kristian So I now removed the extra redundant locking (after adding comments explaining
Kristian why this is ok), and pushed that to 10.0-knielsen.

Kristian Let me see if I can explain why the memory barrier is needed. The potential
Kristian race (or one of them at least) is as follows:

Kristian Thread A is doing wakeup_subsequent_commits2(). Thread B is doing
Kristian unregister_wait_for_prior_commit2(). We assume B is on A's list of waiters.

Thanks for spending time explaining this!

This one is clear.

What was not clear was how the code that only goes and resets
wakeup_subsequent_commits_running for all threads could cause a
problem.

However, things are good enough so we can leave the code according to
your latest version.

cut

Kristian Hope this helps,

yes, it helped a lot. Thanks!

Regards,
Monty



Re: [Maria-developers] Review of MDEV-4506, parallel replication, part 1

2013-09-23 Thread Michael Widenius

Hi!

 Kristian == Kristian Nielsen kniel...@knielsen-hq.org writes:

Kristian Michael Widenius mo...@askmonty.org writes:
 What should happen if you kill a replication thread is that
 replication should stop for that master.

cut

Kristian It seems to make sense that KILL CONNECTION on the SQL thread would stop the
Kristian replication. I do not know whether this is how it works or not in current
Kristian replication, but whatever it is, we should just leave the behaviour the same.

This is how things are now (as far as I know).

cut

Kristian But KILL of a worker thread should _not_ cause the thread to exit, I
Kristian think. The pool of parallel threads is just a thread pool, and for simplicity
Kristian in the first version, it has a fixed size. A KILL will disconnect the thread
Kristian from the replication connection it was servicing, but the thread must remain
Kristian alive, ready to serve another connection if needed.

Correct. The KILL command only kills the query or the connection (ie,
THD). The thread is always reused.

Kristian So the rpt->stop is only about maintaining the thread pool for parallel
Kristian replication, not about anything to do with aborting any specific master
Kristian connection replication. In rpl_parallel_change_thread_count() we set rpt->stop
Kristian to ask all existing threads to exit, and afterwards spawn a new set of
Kristian threads. This can only happen when all replication SQL/IO threads are stopped
Kristian and no event execution is taking place.

Yes. The question is when rpt->stop should take effect.

My suggestion is that the replication threads should only examine
rpt->stop after commits.  This way we can ensure that when all threads
are stopped, everything is committed to a certain point and we never
have to do a rollback.
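A minimal sketch of that suggestion, under stated assumptions: Event, Worker and run() are hypothetical names made up for illustration, and "commit" is reduced to a counter. The only point is where the stop flag is looked at:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from the patch.
struct Event { bool end_of_group; };

struct Worker {
  std::atomic<bool> stop{false};
  int applied = 0;    // events executed so far
  int committed = 0;  // event groups committed so far

  // Applies events in order and looks at `stop` only right after a commit,
  // so the worker never exits with a half-applied event group.
  // Returns the index of the first event that was not applied.
  std::size_t run(const std::vector<Event> &events)
  {
    std::size_t i = 0;
    while (i < events.size()) {
      ++applied;
      bool group_done = events[i].end_of_group;
      ++i;
      if (group_done) {
        ++committed;
        if (stop.load())
          break;  // safe point: nothing to roll back
      }
    }
    return i;
  }
};
```

Even if stop is already set when run() starts, the worker still finishes the group it is in before leaving, which is the property wanted here.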

Kristian So this code of mine is actually wrong, I think:

Kristian   while (!rpt->stop && !thd->killed)

Kristian It should just be while (!rpt->stop) { ... }.
Kristian And thd->killed should not be used at all to control thread
Kristian termination. Instead, it should be used (and is not currently, needs to be
Kristian fixed) in various places for event execution to allow aborting currently
Kristian executing event. This is needed around rpt_handle_event() I think, and also
Kristian around queue_for_group_commit().

Yes.  I will be working on this today & tomorrow.

Note that every loop where we wait needs to be stoppable, either with
'stop' or with KILL.
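One way to read "every loop where we wait needs to be stoppable" is that each wait predicate also covers the stop and kill flags. A minimal sketch, assuming standard C++ primitives instead of the server's mutex and condition wrappers; WaitPoint, WaitResult and all member names are hypothetical:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical names throughout; this is not code from the patch.
enum class WaitResult { Ready, Stopped, Killed };

struct WaitPoint {
  std::mutex lock;
  std::condition_variable cond;
  bool ready = false;   // the condition we are really waiting for
  bool stop = false;    // the thread pool is being shut down
  bool killed = false;  // KILL was issued against this THD

  // Blocks until the awaited condition holds, or until stop/KILL is seen.
  WaitResult wait()
  {
    std::unique_lock<std::mutex> guard(lock);
    cond.wait(guard, [this] { return ready || stop || killed; });
    if (killed) return WaitResult::Killed;
    if (stop) return WaitResult::Stopped;
    return WaitResult::Ready;
  }

  // Whoever sets one of the flags must also signal, or the waiter sleeps on.
  void kill()
  {
    {
      std::lock_guard<std::mutex> guard(lock);
      killed = true;
    }
    cond.notify_all();
  }
};
```

The caller then decides what to do with the result: abort the current event on Killed, leave the loop on Stopped.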

Kristian For normal stop, this is handled mostly in the SQL thread. It waits with a
Kristian timeout for the current event group to finish normally, then does a hard kill
Kristian if needed. This needs to be extended to handle running things in a worker
Kristian thread, of course. The code is around sql_slave_killed(), I think.

Kristian Does that make sense?

Makes sense. I will know more tomorrow when I have dug into the code a
bit more.

 +  mysql_mutex_lock(&LOCK_wait_commit);
 +  waiting_for_commit= false;
 +  mysql_cond_signal(&COND_wait_commit);
 +  mysql_mutex_unlock(&LOCK_wait_commit);
 +}
 
 In this particular case it looks safe to move the cond signal out of
 the mutex. I don't see how anyone could miss the signal.
 
Kristian Probably. Generally I prefer to write it as above, to not have to make sure
Kristian that moving it out is safe.
 
 I agree that this is the way to do it in general.
 However for cases where we are waiting almost once per query, we need
 to make things faster as this extra wakeup will take notable resources!

Kristian Ok, no problem, I've changed it.

Kristian I should benchmark this, it keeps coming up. It's not clear to me which way
Kristian would be the fastest, seems it would depend on the implementation.

I did a benchmark of this a long time ago (+10 years) on Linux and
then using the signal after unlock was notably faster.

The gain is two thread switches + one thread wakeup per cond_signal.
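The two variants under discussion can be put side by side. A minimal sketch using std::mutex and std::condition_variable rather than the server's mysql_mutex_t / mysql_cond_t wrappers; CommitWaiter and its member functions are hypothetical names, only waiting_for_commit is taken from the quoted patch:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the waiter in the patch.
struct CommitWaiter {
  std::mutex lock;
  std::condition_variable cond;
  bool waiting_for_commit = true;

  // Variant quoted in the review: signal while still holding the mutex.
  void wakeup_signal_inside()
  {
    std::lock_guard<std::mutex> guard(lock);
    waiting_for_commit = false;
    cond.notify_one();
  }

  // Variant suggested in the review: change the flag under the mutex,
  // release it, then signal.
  void wakeup_signal_outside()
  {
    {
      std::lock_guard<std::mutex> guard(lock);
      waiting_for_commit = false;
    }
    cond.notify_one();
  }

  // The waiter re-checks the flag under the mutex, so a signal sent before
  // it starts waiting is not lost: the flag is already cleared by then.
  void wait_for_wakeup()
  {
    std::unique_lock<std::mutex> guard(lock);
    cond.wait(guard, [this] { return !waiting_for_commit; });
  }
};
```

One thing to check before moving the signal out: the waiter must not be able to destroy the object between the unlock and the signal, otherwise the signal touches freed memory. Which variant is faster is implementation dependent, as noted above, so this sketch says nothing about performance.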



 Hi!
 
 Part 2 of review of parallel replication
 
 === added file 'sql/rpl_parallel.cc'
 
 cut
 
 +pthread_handler_t
 +handle_rpl_parallel_thread(void *arg)
 +{
 
 cut
 
 +  if (end_of_group)
 +  {
 +in_event_group= false;
 +
 
 Add comment:
 
 /*
 All events for this group have now been executed and logged (but not
 necessarily synced).
 Inform the other event groups that are waiting for this thread that
 we are done.
 */
 
 +rgi->commit_orderer.unregister_wait_for_prior_commit();
 +thd->wait_for_commit_ptr= NULL;
 +
 +/*
 +  Record that we have finished, so other event groups will no
 +  longer attempt to wait for us to commit.
 +
 +  We can race here with the next transactions, but that is fine, as
 +  long as we check that we do not decrease last_committed_sub_id. If
 +  this commit is done, then any prior commits will also have been
 +  done and also no longer need waiting for.
 +*/
 +mysql_mutex_lock(&entry->LOCK_parallel_entry);
 +if 
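
The comment in the patch above describes a "never decrease" update of last_committed_sub_id under a mutex. A minimal sketch of that idea only, not of the code in the patch: ParallelEntry and mark_committed() are hypothetical names, and standard C++ primitives stand in for the server's mutex and condition wrappers:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the per-domain entry in the patch.
struct ParallelEntry {
  std::mutex lock;
  std::condition_variable cond;
  std::uint64_t last_committed_sub_id = 0;

  // Records that the event group with this sub_id has committed.
  // Commits may report out of order; the stored value only moves forward.
  void mark_committed(std::uint64_t sub_id)
  {
    bool advanced = false;
    {
      std::lock_guard<std::mutex> guard(lock);
      if (sub_id > last_committed_sub_id) {
        last_committed_sub_id = sub_id;
        advanced = true;
      }
    }
    if (advanced)
      cond.notify_all();  // wake groups waiting for a prior commit
  }
};
```

A late report of an older sub_id is simply ignored, which matches the reasoning that once a commit is done, all prior commits are done as well.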

Re: [Maria-developers] [Spatial] On current implementation approach

2013-09-23 Thread Mateusz Loskot
On 23 September 2013 22:10, Alexey Botchkov holyf...@askmonty.org wrote:

 1. Is it possible to implement MariaDB extensions like Spatial (custom
 type + set of functions) without such a tight coupling with the
 internal implementation of the type system (without messing Field
 class with geometry types directly, etc.)?


 Yes, it is possible. The core algorithms are separated from the Field
 structure and any other database internals.
 They are placed in sql/gcalc_slicescan.cc and sql/gcalc_tools.cc files.

Yes, but my question is not really about location of computational geometry
bits, but about the data management: SQL data type for geometry objects,
input/output routines.

Due to my lack of experience with MariaDB/MySQL UDFs, I simply assumed that if:
1. Field is the only place that defines the GEOMETRY type (and there is no
CREATE TYPE support)
2. UDF prototypes use GEOMETRY to declare input/output parameters
then I couldn't understand how it would be possible to remove the geometry
definitions from Field and other internal definitions.

But I've just found this project [1] with extra spatial UDFs, so I
think I now understand that the UDF protocol for I/O arguments does not
require an explicit GEOMETRY type, making it possible to move the Spatial
Extensions completely out of the built-ins (trunk/sql/ files).

[1] https://github.com/krandalf75/MySQL-Spatial-UDF

 2. Is it possible to implement Spatial using User-Defined Functions
 (UDF) defined in shared binary?


 The spatial functions/operations can be implemented with UDF, but
 that makes query optimization and using Spatial keys problematic.

So, for a real use case, the idea I brainstormed above would not make sense,
unless there is a workaround for the problems you mention.

 3. What is the reason behind using Well-Known-Binary (WKB) stream of
 bytes to transport geometry values into/from functions? Is it due to
 limitations of MariaDB type system where String is  the only universal
 carrier for complex data? This concern is related to necessity of
 encoding/decoding WKB when chaining spatial function calls, and
 possibilities to avoid it.


 The reason was mostly historical. It was sufficient for the first
 implementations of the Geometry field types and somewhat convenient as
 we don't need to perform conversions
 when we need to import/export features in their WKB representation.
 But yes, that format is inefficient and difficult to handle properly. I plan
 to get rid of it internally - only support importing-exporting it.

I roughly understand, but how do you plan to pass geometry data around,
in what format?

AFAIU, it is not possible to pass user-defined types into/from SQL functions,
so geometries would have to be passed as String objects anyway, wouldn't they?
IOW, there are only 3 types available (integer, real, string), so String is
the only one usable to pass geometry objects around, regardless of the actual
encoding format: WKB, WKT, any other binary stream...

It means that if I want to pass a geometry to a my_foo UDF:

MSUDF_API char*
my_foo(UDF_INIT *initid, UDF_ARGS *args, char *buf,
       unsigned long *length, char *is_null, char *error);

the only option available is to turn the geometry into some kind of stream of
bytes and pass it as one of the args items.
So some kind of serialising/deserialising is in fact unavoidable.

Is my understanding correct?
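
To make the serialising/deserialising step concrete, here is a minimal sketch of a round trip for a single WKB Point (byte-order flag, geometry type 1, two doubles). Point, encode_wkb_point() and decode_wkb_point() are hypothetical helpers, not part of any UDF API. The sketch assumes a little-endian host and a buffer that holds plain WKB; any prefix the server may put in front of the WKB in its internal representation is not handled:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Point { double x; double y; };

// Size of a WKB Point: 1 byte order flag + 4 byte type + two 8 byte doubles.
const unsigned long kWkbPointSize = 1 + 4 + 8 + 8;

// Serialises p into buf (at least kWkbPointSize bytes); returns bytes written.
unsigned long encode_wkb_point(const Point &p, unsigned char *buf)
{
  buf[0] = 1;              // 1 = little-endian (NDR)
  std::uint32_t type = 1;  // 1 = Point
  std::memcpy(buf + 1, &type, sizeof type);
  std::memcpy(buf + 5, &p.x, sizeof p.x);
  std::memcpy(buf + 13, &p.y, sizeof p.y);
  return kWkbPointSize;
}

// Deserialises a little-endian WKB Point; returns false on anything else.
bool decode_wkb_point(const unsigned char *buf, unsigned long length, Point *out)
{
  if (buf == nullptr || length < kWkbPointSize) return false;
  if (buf[0] != 1) return false;  // big-endian input is not handled here
  std::uint32_t type;
  std::memcpy(&type, buf + 1, sizeof type);
  if (type != 1) return false;    // not a Point
  std::memcpy(&out->x, buf + 5, sizeof out->x);
  std::memcpy(&out->y, buf + 13, sizeof out->y);
  return true;
}
```

Chaining two spatial functions written this way means one decode and one encode per call, which is the overhead question 3 is about.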

Best regards,
-- 
Mateusz  Loskot, http://mateusz.loskot.net
Participation in this whole process is a form of torture ~~ Szalony



Re: [Maria-developers] [Spatial] On current implementation approach

2013-09-23 Thread Roberto Spadim
Hi Mateusz, i'm a user and a hobby developer... i will answer with what i
know


2013/9/23 Mateusz Loskot mate...@loskot.net

 Hi,

 I'm going to ask question about how the current Spatial Extensions are
 implemented.
 I have spent some time reading the source code in the current trunk
 (spatial.h|cc, gcal*.h|cc, related Field and Item definitions, etc.),
 so I have a rough understanding of the overall structure, how the
 geometry data types are implemented and exposed to SQL, how the
 spatial functions are defined and registered. I did not look into
 details of implementation of geospatial algorithms, but that's too low
 level for the question I'm going to ask.

 Initially, I was going to ask very detailed question, listing all the
 relevant code definitions and asking separately about each of them,
 but that's not necessary at this stage, I think.

 Instead, I'm going to simplify and ask about the bigger picture, more
 about MariaDB extensions API:

 1. Is it possible to implement MariaDB extensions like Spatial (custom
 type + set of functions) without such a tight coupling with the
 internal implementation of the type system (without messing Field
 class with geometry types directly, etc.)?


i think this is something interesting, check this idea:
https://mariadb.atlassian.net/browse/MDEV-4912
well, this MDEV is still far off (it's an idea for mariadb 10.1, and we are
at 10.0.5 today...)
forget this MDEV for now... you can extend mariadb by hand too :)




 2. Is it possible to implement Spatial using User-Defined Functions
 (UDF) defined in shared binary?

yes, the field type will probably be a WKB value in a STRING/BLOB (char)
type; this needs checking, i have never used a UDF for spatial data




 3. What is the reason behind using Well-Known-Binary (WKB) stream of
 bytes to transport geometry values into/from functions? Is it due to
 limitations of MariaDB type system where String is  the only universal
 carrier for complex data? This concern is related to necessity of
 encoding/decoding WKB when chaining spatial function calls, and
 possibilities to avoid it.


internally (in memory / on disk) you have a string (for the spatial type it's a
WKB format string), and at run/compile/design time you have a class (many
functions) that handles this type (a string in this case)

note that at the cpu we have only primitive types (java works in a similar
way too), in other words in memory we have
string/uint/boolean/structs/etc, there's no 'geometric' type at the hardware
level...

if you have worked with microcontrollers, the linux kernel source or the gcc
source, you will know how C++ is 'converted' to
assembler/machine language, and you will understand how a C++ object is
implemented at the low level...

in short: a class in any computer language is protected memory +
protected functions that handle this memory

about why WKB is used rather than another type? well i don't know if we have
hardware (cpu) optimizations here, but at least we have a small memory footprint
(and probably lower cpu consumption). some languages/cpus have optimizations
for, say, a math function using int*int; check SIMD on
wikipedia (http://en.wikipedia.org/wiki/SIMD), or maybe you remember the
pentium with MMX, a hardware optimization for some specific
operations; check SSE, SSE2, etc... the goal is faster execution. it's
not a limit in the mysql/mariadb source code but an optimization; knowing how
C/C++ works at the low level is the best way to optimize memory and cpu



 Best regards,
 --
 Mateusz  Loskot, http://mateusz.loskot.net
 Participation in this whole process is a form of torture ~~ Szalony




well i don't know if everything i wrote is ok hehe, but it's a high level
guideline :)


-- 
Roberto Spadim


Re: [Maria-developers] [Spatial] On current implementation approach

2013-09-23 Thread Roberto Spadim
hi again :)
i'm not a mariadb team developer, so please consider me a user/udf
developer =]

2013/9/23 Mateusz Loskot mate...@loskot.net

 On 23 September 2013 22:10, Alexey Botchkov holyf...@askmonty.org wrote:
 
  1. Is it possible to implement MariaDB extensions like Spatial (custom
  type + set of functions) without such a tight coupling with the
  internal implementation of the type system (without messing Field
  class with geometry types directly, etc.)?
 
 
  Yes, it is possible. The core algorithms are separated from the Field
  structure and any other database internals.
  They are placed in sql/gcalc_slicescan.cc and sql/gcalc_tools.cc files.

 Yes, but my question is not really about location of computational geometry
 bits, but about the data management: SQL data type for geometry objects,
 input/output routines.

 Due to my lack of experience with MariaDB/MySQL UDF, I simply assumed that
 if:
 1. Field is the only place that defines GEOMETRY type (and there is no
 CREATE TYPE support)


create type will probably be a 10.1 feature:
https://mariadb.atlassian.net/browse/MDEV-4912
and maybe you will not have spatial key optimization in the first version
of this feature

in my opinion, if you start a new gis udf today, you should use WKB
+ a second lib (geos is very good) to handle spatial data
geos can read WKB with a fast unserialize: GEOSGeomFromWKB_buf



 2. UDF prototypes will use of GEOMETRY in their prototypes to declare
 input/output parameters
 then I couldn't understand how it is possible to remove geometry
 definitions from Field
 and other internal definitions.

 But, I've just found this project [1] with extra spatial UDFs, so I
 think I understand the UDF
 protocol regarding I/O arguments would not require explicit GEOMETRY type


yes, you don't have a GEOMETRY type for arg_type[] in a udf
check the example in the mysql-spatial-udf git project you found:

my_bool msudf_within_init(UDF_INIT *initid, UDF_ARGS *args, char *message)
...

    args->arg_type[0] = STRING_RESULT;
...


long long msudf_within(UDF_INIT *initid, UDF_ARGS *args, char *is_null,
                       char *error)
...

geom1 = msudf_getGeometry((unsigned char *)args->args[0], args->lengths[0]);


set arg_type to STRING_RESULT, and use a cast (unsigned char *) to handle
raw geometry data
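
The shape of such a UDF, as a minimal sketch. Assumptions: the small type definitions are trimmed stand-ins for what the server's mysql.h declares (a real UDF includes mysql.h and would not define them), and my_geom_bytes is a hypothetical function that just reports how many bytes it was handed:

```cpp
#include <cassert>
#include <cstring>

// Trimmed stand-ins for types declared in mysql.h, so the sketch compiles
// on its own. A real UDF includes mysql.h instead of defining these.
typedef char my_bool;
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
struct UDF_INIT { my_bool maybe_null; };
struct UDF_ARGS {
  unsigned int arg_count;
  Item_result *arg_type;
  char **args;
  unsigned long *lengths;
};

// Hypothetical UDF my_geom_bytes(g): init step, run once per statement.
extern "C" my_bool my_geom_bytes_init(UDF_INIT *initid, UDF_ARGS *args,
                                      char *message)
{
  if (args->arg_count != 1) {
    std::strcpy(message, "my_geom_bytes() takes exactly one argument");
    return 1;  // non-zero reports an error
  }
  args->arg_type[0] = STRING_RESULT;  // ask for the argument as raw bytes
  initid->maybe_null = 1;
  return 0;
}

// Row step: the geometry arrives as a byte buffer plus a length.
extern "C" long long my_geom_bytes(UDF_INIT *, UDF_ARGS *args,
                                   char *is_null, char *)
{
  if (args->args[0] == nullptr) {  // SQL NULL is passed as a null pointer
    *is_null = 1;
    return 0;
  }
  // A real spatial UDF would cast args->args[0] to (unsigned char *) here
  // and hand it, with the length, to its geometry parser.
  return static_cast<long long>(args->lengths[0]);
}
```

The buffer is not guaranteed to be zero terminated, so the length from args->lengths[] has to travel with the pointer.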


making it possible to move Spatial Extensions completely out of
 built-ins (trunk/sql/ files).

 [1] https://github.com/krandalf75/MySQL-Spatial-UDF



mariadb 10.0 has plans for OpenGIS:
https://mariadb.com/kb/en/plans-for-10x/#opengis-compliance
but i didn't find a JIRA report about it, or another worklog or something
similar (must check if it's in the launchpad bug tracker or another launchpad
branch)
and i don't know if mariadb will use GEOS... but from what i know, geos is
the best opengis lib today, why not use it in mariadb?! =)




  2. Is it possible to implement Spatial using User-Defined Functions
  (UDF) defined in shared binary?
 
 
  The spatial functions/operations can be implemented with UDF, but
  that makes query optimization and using Spatial keys problematic.

 So, for real use case, the idea I brainstormed above would not make sense.
 Unless, there is workaround for those problems you mean.


well i don't know what problematic means at a high/low level, but i think
it's something like this at the sql layer:

WHERE udf_function(x)
in theory this udf_function() could be optimized with rtree index Y...
but it will do a table scan... the optimizer doesn't know how to use an index
with udf functions yet :(
note that some internal functions don't have optimizations either, like:
SUBSTRING(indexed_field,1,4)='abcd' could be rewritten as (indexed_field LIKE
'abcd%' OR indexed_field='abcd')


a workaround for the index has to be done on the application side, it could be
something like:
WHERE udf_function(x) and other_builtin_function_that_use_index()
with this other_builtin_function_that_use_index function (an envelope
function for example), you could use the spatial index and optimize the
query... it's not the best solution on the server side, but it's the only one i
can think of as a udf developer :)
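
The effect of that workaround can be sketched outside SQL: a cheap envelope test in front of an exact test. Everything here is a toy assumption: rows are points, the exact test is a distance check standing in for the UDF, and the envelope test is evaluated in a plain loop, whereas in the server it would be the part a spatial index can answer:

```cpp
#include <cassert>
#include <vector>

struct Pt { double x, y; };
struct SearchResult { int matches; int exact_calls; };

// Cheap bounding-box test; the index-friendly half of the WHERE clause.
bool in_envelope(Pt p, Pt c, double r)
{
  return p.x >= c.x - r && p.x <= c.x + r &&
         p.y >= c.y - r && p.y <= c.y + r;
}

// Exact test, standing in for the UDF the optimizer cannot use an index for.
bool within_distance(Pt p, Pt c, double r)
{
  double dx = p.x - c.x, dy = p.y - c.y;
  return dx * dx + dy * dy <= r * r;
}

// Counts matching rows and how often the exact test had to run.
SearchResult search(const std::vector<Pt> &rows, Pt c, double r, bool prefilter)
{
  SearchResult res = {0, 0};
  for (const Pt &p : rows) {
    if (prefilter && !in_envelope(p, c, r))
      continue;  // rejected without calling the expensive function
    ++res.exact_calls;
    if (within_distance(p, c, r))
      ++res.matches;
  }
  return res;
}
```

Both variants return the same matches; the prefilter only reduces how many rows reach the exact function, which is all the workaround promises.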

well if you know how to code on the mysql/mariadb server side... you can patch
the optimizer, but i think it's hard work, the optimizer is still black magic
to me =]



  3. What is the reason behind using Well-Known-Binary (WKB) stream of
  bytes to transport geometry values into/from functions? Is it due to
  limitations of MariaDB type system where String is  the only universal
  carrier for complex data? This concern is related to necessity of
  encoding/decoding WKB when chaining spatial function calls, and
  possibilities to avoid it.
 
 
  The reason was mostly historical. It was sufficient for the first
  implementations of the Geometry field types and somewhat convenient as
  we don't need to perform conversions
  when we need to import/export features in their WKB representation.
  But yes, that format is inefficient and difficult to handle properly. I
 plan
  to get