We ran into a nasty problem with sharding a while back. We came up with an 
effective workaround, but we don't fully understand the problem. Perhaps 
someone here can provide more insight.

We shard requests to our production MySQL databases using 
sqlalchemy.ext.horizontal_shard.ShardedSession. Our query_chooser uses  a 
consistent hashing scheme on our account_ids to compute shard_ids. In 
almost all cases, we know the account_id before performing a query, 
therefore a query goes directly to the correct shard and only to that 
shard. In the other cases, we're typically trying to find the account_id(s) 
given some other key.

The nasty problem arose thus. We made a cross-shard query, which apparently 
opened a database connection to more than one of the sharded databases. The 
query returned an account_id and our associated SQLAlchemy Account object. 
We then added data to the account and committed it — all on the same 
scoped_session. Result: the thread (process?) blocks on the connections to 
the other databases.

The workaround is to call commit() on the session after performing the 
cross-shard query, and then call scoped_session() to make a fresh session 
before adding data to the account.

Anyone understand what the underlying problem was? Is it in SQLAlchemy, 
MySQL-python, or some lower level?

The pattern that we're moving towards with these cross-shard queries is to 
perform them in a distinct session using a contextmanager which commits the 
session. Account_ids and other keys are returned from the contextmanager, 
but SQLAlchemy objects are not. One (or more) of our SQLAlchemy Account 
objects are then created in the main session. I'm worried about the >1 
case, since those accounts may be distributed across several shards.

Is this safe?

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sqlalchemy/-/VH_WL7tlmv4J.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to