Re: [Labs-l] Accessing the databases from labs - A comparison with the toolserver

Platonides Fri, 12 Jul 2013 12:23:35 -0700

On 12/07/13 20:24, Marc A. Pelletier wrote:

On 07/12/2013 01:59 PM, Platonides wrote:

These connections are cached, so if I connected to fiwiki
and then to eowiki, the same db object would be returned.


I don't think that any putative gain of performance or resources this
would give is worth the added complexity;

In a simple benchmark, reconnecting for each db is about 4 times slower, but that may not be representative.


php script.php $(grep 192.168.99.3 /etc/hosts | cut -c 14-)

<?php
array_shift($argv);
$m = null;
foreach ($argv as $arg) {
        if (!$m) # <-- Comment this line
                $m = mysql_connect($arg, 'username', 'password');
        if (!$m) die(1);
        mysql_select_db($arg, $m);
}


> but if you insisted on doing
> that caching, you should do it by the actual cluster IP, not the name
> you used since only the former is guaranteed to be valid in all cases.
>
> (Well, strictly speaking, only the [host,port] tuple is, but all the
> ports will always remain the same since we do portmapping)

Except that we could have several IPs per cluster (as TS does).
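
To make the suggestion above concrete, here is a minimal sketch of a cache keyed by the resolved [ip, port] pair instead of the name. The ConnectionCache class and both callables are hypothetical, made up for illustration; they are not part of any labs or toolserver API. The resolver and the connector are injected so the sketch runs without a database. Note that if a cluster has several IPs, this would still open one connection per IP.

```php
<?php
// Sketch only: ConnectionCache is a hypothetical helper, not an existing API.
class ConnectionCache
{
    private $connections = array();
    private $resolve; // callable: host name -> IP string
    private $connect; // callable: (ip, port) -> connection, or false on failure

    public function __construct($resolve, $connect)
    {
        $this->resolve = $resolve;
        $this->connect = $connect;
    }

    // Return the cached connection for the address $host resolves to,
    // opening a new one only when that [ip, port] pair is not cached yet.
    public function get($host, $port = 3306)
    {
        $ip = call_user_func($this->resolve, $host);
        $key = $ip . ':' . $port;
        if (!array_key_exists($key, $this->connections)) {
            $conn = call_user_func($this->connect, $ip, $port);
            if ($conn === false) {
                return false; // do not cache failed attempts
            }
            $this->connections[$key] = $conn;
        }
        return $this->connections[$key];
    }

    // Number of distinct connections currently held.
    public function count()
    {
        return count($this->connections);
    }
}
```

With the script above, the resolver could be 'gethostbyname' and the connector a small closure around mysql_connect("$ip:$port", 'username', 'password'); two names that resolve to the same address would then share one connection.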

In this particular case, it's not avoidable (for user databases).

Why?


We use a double underscore as the guaranteed cannot-occur-in-a-username
mark.  But beyond that, the usernames are different because we don't
handle credentials the same way.  Since the allowable databases are
derived from the username, then the database names are necessarily
different (picking between "u_foo" and "u_p12345g12345" is no harder
than having to pick between "u_foo" and "p12345g12345__foo" --
hardcoding the database name breaks either way).

Except for remembering when you are typing the name manually, but the key is: why 12345 and not 'foo'?
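
For scripts, one way to avoid hardcoding the name is to derive it from the credential username. A minimal sketch, assuming the labs scheme is "<username>__<suffix>" as in the "p12345g12345__foo" example above; the function userDatabaseName is hypothetical, not an existing helper.

```php
<?php
// Sketch only: userDatabaseName is a hypothetical helper.
// Assumes user databases are named "<username>__<suffix>", with the
// double underscore never occurring inside a username.
function userDatabaseName($username, $suffix)
{
    if (strpos($username, '__') !== false) {
        throw new InvalidArgumentException('username must not contain "__"');
    }
    return $username . '__' . $suffix;
}
```

Called as userDatabaseName('p12345g12345', 'foo'), this builds the name from whatever username the credentials file provides, so only the suffix is fixed in the tool.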

There is also a grid engine on TS.


Yes, but while its use has recently been made mandatory, I think,
most tools do not in fact use it and need to be adapted.

It has been available (and highly encouraged) for a long time, but I can't comment about the amount of sge vs plain cron used on TS.

It's not possible to specify a
cluster as requisite in labs, BTW.


That would be because, by design, any execution node has access to every
cluster.  Having this be a requested resource would be akin to being
able to request that your job be provided with a CPU to execute it.  :-)

-- Marc

It's very nice to have an environment where the database servers are always available, jobs are not affected by maria maintenance, and dbs are so ubiquitous that not even sge needs to know what is used :)


_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
