We ran into a nasty problem with sharding a while back. We came up with an effective workaround, but we don't fully understand the problem. Perhaps someone here can provide more insight.
We shard requests to our production MySQL databases using sqlalchemy.ext.horizontal_shard.ShardedSession. Our query_chooser uses a consistent hashing scheme on our account_ids to compute shard_ids. In almost all cases, we know the account_id before performing a query, therefore a query goes directly to the correct shard and only to that shard. In the other cases, we're typically trying to find the account_id(s) given some other key. The nasty problem arose thus. We made a cross-shard query, which apparently opened a database connection to more than one of the sharded databases. The query returned an account_id and our associated SQLAlchemy Account object. We then added data to the account and committed it — all on the same scoped_session. Result: the thread (process?) blocks on the connections to the other databases. The workaround is to call commit() on the session after performing the cross-shard query, and then call scoped_session() to make a fresh session before adding data to the account. Anyone understand what the underlying problem was? Is it in SQLAlchemy, MySQL-python, or some lower level? The pattern that we're moving towards with these cross-shard queries is to perform them in a distinct session using a contextmanager which commits the session. Account_ids and other keys are returned from the contextmanager, but SQLAlchemy objects are not. One (or more) of our SQLAlchemy Account objects are then created in the main session. I'm worried about the >1 case, since those accounts may be distributed across several shards. Is this safe? -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/VH_WL7tlmv4J. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
