RE: [WebSimpleDB] Allowing schema operations anywhere
My apologies for my late reply, I've been out for a while. -Original Message- From: Nikunj R. Mehta [mailto:nikunj.me...@oracle.com] Sent: Friday, December 11, 2009 10:47 AM To: public-webapps@w3.org WG Cc: Pablo Castro Subject: Re: [WebSimpleDB] Allowing schema operations anywhere I have gone ahead and updated the spec to allow option B (only). Please take a look. Option B makes sense, as without it there is a class of algorithms that cannot be implemented or it would be quite difficult to do so (e.g. a sort type of construct a query language might want to support wouldn't be possible without a backing index). This certainly means versioning becomes the responsibility of the app/library and not the user agent. This makes sense to me, given that not all schema changes are really version changes (e.g. creation of a spill-to-disk table shouldn't bump up the database version). Thanks -pablo Nikunj On Dec 8, 2009, at 10:14 AM, Nikunj R. Mehta wrote: Hi Pablo, Sorry for the long delay in responding to your comments. Hopefully, we can continue the discussion now. Schema changes interact with the locking model of the database. As I see it, here are several ways in which the API could be designed and the consequences of doing so: A. Allow schema changes inside a metadata transaction which can only be performed at connection time B. Allow schema changes inside a data transaction, which can be performed any time a connection is open C. Allow schema changes inside a metadata transaction, which can be performed any time a connection is open Option A's disadvantages are that metadata manipulation cannot be combined with data changes. Moreover, version numbers are no longer issued by the application but rather by a user agent. Option A's advantages are that resource acquisition is simplified and deadlocks can be avoided considering that a connection acquires and releases the metadata resource in a consistent sequence. Another upside is that version number maintenance is automated. Option B's main disadvantage is that there is no real notion of version that can be managed by the user agent. Another is that deadlocks could occur because there is no a priori declaration of intent about metadata modification. This could be remedied by including the database itself in the list of objects that are intended to be modified in the transaction. Option B's advantages are closer interleaving of and atomic metadata changes with data changes, and application controlled version numbers used for the database. Option C's disadvantage is that data and metadata changes cannot be interleaved atomically. Option C's advantages are that deadlocks can be avoided and version number management can be performed by an application. Overall, I think version management and metadata changes are exclusive in some sense. IOW, if we want Option B and Option C, then we have to remove the connection time version check. Hope that helps. Please feel free to add if I missed anything. Nikunj On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote: We are finding a number of reasons for wanting to create tables on the fly, and without bumping up the database version. A few examples: - Packaged components that create side tables to maintain its own state - Query processors often need to spill to disk during query execution. For example, sorting large sets requires storing temporary sets of rows on disk to be merged later. So we're thinking it would be better to have these methods directly in the DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), instead of their current location in the Upgrade interface. For the common case where several schema changes need to be done atomically, developers can simply wrap the calls in a transaction, and they would do for regular data manipulation. We would need an extra method to bump up the version explicitly, as that would no longer be in the upgrade callback. Does this seem reasonable? Regards, -pablo Nikunj http://o-micron.blogspot.com Nikunj http://o-micron.blogspot.com
Re: [WebSimpleDB] Allowing schema operations anywhere
I have gone ahead and updated the spec to allow option B (only). Please take a look. Nikunj On Dec 8, 2009, at 10:14 AM, Nikunj R. Mehta wrote: Hi Pablo, Sorry for the long delay in responding to your comments. Hopefully, we can continue the discussion now. Schema changes interact with the locking model of the database. As I see it, here are several ways in which the API could be designed and the consequences of doing so: A. Allow schema changes inside a metadata transaction which can only be performed at connection time B. Allow schema changes inside a data transaction, which can be performed any time a connection is open C. Allow schema changes inside a metadata transaction, which can be performed any time a connection is open Option A's disadvantages are that metadata manipulation cannot be combined with data changes. Moreover, version numbers are no longer issued by the application but rather by a user agent. Option A's advantages are that resource acquisition is simplified and deadlocks can be avoided considering that a connection acquires and releases the metadata resource in a consistent sequence. Another upside is that version number maintenance is automated. Option B's main disadvantage is that there is no real notion of version that can be managed by the user agent. Another is that deadlocks could occur because there is no a priori declaration of intent about metadata modification. This could be remedied by including the database itself in the list of objects that are intended to be modified in the transaction. Option B's advantages are closer interleaving of and atomic metadata changes with data changes, and application controlled version numbers used for the database. Option C's disadvantage is that data and metadata changes cannot be interleaved atomically. Option C's advantages are that deadlocks can be avoided and version number management can be performed by an application. Overall, I think version management and metadata changes are exclusive in some sense. IOW, if we want Option B and Option C, then we have to remove the connection time version check. Hope that helps. Please feel free to add if I missed anything. Nikunj On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote: We are finding a number of reasons for wanting to create tables on the fly, and without bumping up the database version. A few examples: - Packaged components that create side tables to maintain its own state - Query processors often need to spill to disk during query execution. For example, sorting large sets requires storing temporary sets of rows on disk to be merged later. So we're thinking it would be better to have these methods directly in the DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), instead of their current location in the Upgrade interface. For the common case where several schema changes need to be done atomically, developers can simply wrap the calls in a transaction, and they would do for regular data manipulation. We would need an extra method to bump up the version explicitly, as that would no longer be in the upgrade callback. Does this seem reasonable? Regards, -pablo Nikunj http://o-micron.blogspot.com Nikunj http://o-micron.blogspot.com
Re: [WebSimpleDB] Allowing schema operations anywhere
Hi Pablo, Sorry for the long delay in responding to your comments. Hopefully, we can continue the discussion now. Schema changes interact with the locking model of the database. As I see it, here are several ways in which the API could be designed and the consequences of doing so: A. Allow schema changes inside a metadata transaction which can only be performed at connection time B. Allow schema changes inside a data transaction, which can be performed any time a connection is open C. Allow schema changes inside a metadata transaction, which can be performed any time a connection is open Option A's disadvantages are that metadata manipulation cannot be combined with data changes. Moreover, version numbers are no longer issued by the application but rather by a user agent. Option A's advantages are that resource acquisition is simplified and deadlocks can be avoided considering that a connection acquires and releases the metadata resource in a consistent sequence. Another upside is that version number maintenance is automated. Option B's main disadvantage is that there is no real notion of version that can be managed by the user agent. Another is that deadlocks could occur because there is no a priori declaration of intent about metadata modification. This could be remedied by including the database itself in the list of objects that are intended to be modified in the transaction. Option B's advantages are closer interleaving of and atomic metadata changes with data changes, and application controlled version numbers used for the database. Option C's disadvantage is that data and metadata changes cannot be interleaved atomically. Option C's advantages are that deadlocks can be avoided and version number management can be performed by an application. Overall, I think version management and metadata changes are exclusive in some sense. IOW, if we want Option B and Option C, then we have to remove the connection time version check. Hope that helps. Please feel free to add if I missed anything. Nikunj On Nov 22, 2009, at 3:14 PM, Pablo Castro wrote: We are finding a number of reasons for wanting to create tables on the fly, and without bumping up the database version. A few examples: - Packaged components that create side tables to maintain its own state - Query processors often need to spill to disk during query execution. For example, sorting large sets requires storing temporary sets of rows on disk to be merged later. So we're thinking it would be better to have these methods directly in the DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), instead of their current location in the Upgrade interface. For the common case where several schema changes need to be done atomically, developers can simply wrap the calls in a transaction, and they would do for regular data manipulation. We would need an extra method to bump up the version explicitly, as that would no longer be in the upgrade callback. Does this seem reasonable? Regards, -pablo Nikunj http://o-micron.blogspot.com
Re: [WebSimpleDB] Allowing schema operations anywhere
On 11/22/09 3:14 PM, Pablo Castro wrote: Does this seem reasonable? This seems reasonable to me. The use case makes sense, and is something I've seen many times in real life. Cheers, Shawn
[WebSimpleDB] Allowing schema operations anywhere
We are finding a number of reasons for wanting to create tables on the fly, and without bumping up the database version. A few examples: - Packaged components that create side tables to maintain its own state - Query processors often need to spill to disk during query execution. For example, sorting large sets requires storing temporary sets of rows on disk to be merged later. So we're thinking it would be better to have these methods directly in the DatabaseSync/DatabaseAsync objects (with proper corresponding patterns), instead of their current location in the Upgrade interface. For the common case where several schema changes need to be done atomically, developers can simply wrap the calls in a transaction, and they would do for regular data manipulation. We would need an extra method to bump up the version explicitly, as that would no longer be in the upgrade callback. Does this seem reasonable? Regards, -pablo