Hi Anthony,
1. I understand that using Sequoia more or less assumes a shared-nothing
disk architecture. But is it possible to use it with a SAN instead of RAIDb?
If you use a SAN it means that your database supports shared-disk
clustering (like Oracle RAC) and already deals with the proper locking
and replication mechanism.
Can you even specify no RAID level?
If your database already handles replication (e.g. using a SAN), then
you can use the ParallelDB load balancer, which just forwards requests to
your database cluster. In this case, Sequoia brings you load balancing,
transparent failover and, optionally, request caching.
If so, does Sequoia then have
nothing to do with this and all the reading/writing would just fall back
to MySQL's standards and locking along with whatever filesystem I'd be
using (probably GFS)? Would this even be possible or plausible?
Just to make sure that this is clear, Sequoia just relies on the
capabilities of the underlying databases. So if you try to use a SAN
with a database that has no support for it, Sequoia will not help in any
way.
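
To make the ParallelDB idea above more concrete, here is a minimal sketch in Python of a pass-through balancer: each request goes to exactly one backend, writes are never replicated by the balancer, and a failed backend is skipped. This is not Sequoia code; PassThroughBalancer, BackendDown and the callable backends are hypothetical names used only for illustration.

```python
class BackendDown(Exception):
    """Raised by a (hypothetical) backend that cannot serve a request."""


class PassThroughBalancer:
    """Forwards each request to one backend; replication is left to the database."""

    def __init__(self, backends):
        # Each backend is modeled as a callable taking the SQL string.
        self.backends = list(backends)
        self._next = 0  # round-robin position

    def execute(self, sql):
        count = len(self.backends)
        # Try every backend at most once, starting at the round-robin position.
        for offset in range(count):
            index = (self._next + offset) % count
            try:
                result = self.backends[index](sql)
            except BackendDown:
                continue  # failover: silently move on to the next backend
            self._next = (index + 1) % count
            return result
        raise BackendDown("no backend available")
```

Note that nothing here keeps the backends consistent with each other. That job stays with the database cluster, which is why this approach only makes sense when the database itself supports the shared-disk setup.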
2. In trying to understand the RAIDed architecture in regards to
Sequoia, I'm not sure I understand something about the b-2 level. Does
Sequoia determine which tables go on which servers? Or would I determine
that schema? What happens when an extra database server is introduced
into the mix? Does the schema re-partition the data automatically? Does
this need to be done manually?
In RAIDb-2, Sequoia will fetch the schema that is available on each
replica and proceed from there. If you want to force only specific
tables on each node, this can be specified manually in the virtual
database configuration file. The same thing applies when a new table
needs to be created: you can specify policies to create that table on a
specific set of nodes, or to dynamically pick the least loaded nodes
from a pool.
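
As an illustration of the "least loaded nodes from a pool" policy, here is a small Python sketch. It is not Sequoia's implementation or configuration syntax; the function name, the load metric (a plain number per node) and the node names are assumptions made for the example.

```python
import heapq


def pick_nodes_for_new_table(load_by_node, pool, copies):
    """Return the `copies` least loaded nodes from `pool`.

    load_by_node maps a node name to a load figure (hypothetical metric,
    e.g. pending requests). Ties are broken by node name so the result
    is deterministic.
    """
    candidates = [node for node in pool if node in load_by_node]
    if copies < 1 or copies > len(candidates):
        raise ValueError("cannot place %d copies on %d nodes"
                         % (copies, len(candidates)))
    return heapq.nsmallest(copies, candidates,
                           key=lambda node: (load_by_node[node], node))
```

For example, with loads {"db1": 5, "db2": 1, "db3": 3} and two copies requested, the table would be created on db2 and db3.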
3. For nested raid (b-0-1/b-1-0), is this a common config, and if so,
can someone comment on how well this works in the real world, in regards
to performance and ease of recovery?
No, this is not a common config, and RAIDb-0 is rarely used because it
does not support distributed joins. If table A is on node 1 and table B
on node 2 and you try something like SELECT * FROM A,B you will get an
error that no node has both tables and the query cannot be executed.
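
The join restriction can be sketched as a routing check: a query may only run on a node that hosts every table it references. The Python below is a simplified model, not Sequoia code; nodes_for_query, NoNodeHasAllTables and the table-to-node map are hypothetical.

```python
class NoNodeHasAllTables(Exception):
    """No single node hosts every table the query needs."""


def nodes_for_query(tables_by_node, query_tables):
    """Return the sorted list of nodes that host all tables in query_tables."""
    needed = set(query_tables)
    eligible = sorted(node for node, tables in tables_by_node.items()
                      if needed <= set(tables))
    if not eligible:
        raise NoNodeHasAllTables(
            "no node has all of: " + ", ".join(sorted(needed)))
    return eligible
```

With A only on node1 and B only on node2, a query over both A and B has no eligible node and is rejected. If some node held both tables (as partial replication in RAIDb-2 allows), the query would be routed there.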
4. In regards to the controller, where is this recommended to be placed?
On the app servers (Tomcat), or the database servers?
There is no single answer to that. It depends on how many app servers or
DBs you have. Sometimes you can colocate everything (Tomcat/Sequoia/DB)
on the same machine, and sometimes you even use dedicated machines for
the Sequoia controllers. So all combinations are possible.
Thanks for your interest in Sequoia,
Emmanuel
--
Emmanuel Cecchet - Research scientist
EPFL - LABOS/DSLAB - IN.N 317
Phone: +41-21-693-7558
_______________________________________________
Sequoia mailing list
https://forge.continuent.org/mailman/listinfo/sequoia