Hi everyone,

Having developed for Android, I have some thoughts about the content providers API which I think might be worth sharing; I believe there is room for some improvements to be made to the content providers API before it is stabilized and would be interested to hear if there are other developers who feel the same way.

What's wrong with content providers, you ask? Well, as the API docs put it, content providers are meant to "encapsulate data and provide it to applications through the single ContentResolver interface. A content provider is only required if you need to share data between multiple applications".

So, content providers are meant to allow data sharing between applications, and as such they must perform some encapsulation over the data. The question is - to what degree are the content providers meant to abstract over the data representation & storage mechanism? This is important because it deeply affects the way the content providers API is designed, and thus anyone writing or using content providers. And it doesn't seem to me that the API, as it's currently designed, makes a clear choice on this issue.

On one hand, while most of the content providers rely on an SQLite DB for their data storage, the API does seem to be designed in an attempt to abstract over the SQL database. This has various implications, such as:

1. Not allowing natural usage of join queries, even in cases where it might make sense. The process performing the query is at the mercy of the implementor of the content provider - did she choose to allow for a specific join case or not? And of course, the implementor can only support joins in an ad-hoc way, e.g. by supplying content addresses which represent specific cases of joins, or by detecting columns in the 'projection' argument which belong to different tables, adding the respective tables and adjusting the SQL query accordingly. Naturally, not every reasonable usage can be accommodated this way. (Personally, I've bumped into such limitations when trying to perform some non-trivial queries on Android's built-in contacts database.)
2. Compiled SQL statements can't be used, reducing efficiency for repeated queries.
3. Some useful SQL constructs (e.g. SELECT DISTINCT) aren't accessible.
4. Only string placeholders are allowed.
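
To make the list above a bit more concrete, here is a rough sketch in plain Java of what a typical SQLite-backed provider ends up doing with the query parameters. It is not Android code: the class `QuerySketch`, the method `buildSelect` and the table name `people` are made up for illustration, and it assumes a provider that maps one content address to one table.

```java
// Hypothetical stand-in for the SQL assembly an SQLite-backed provider
// performs inside query(). Names are invented; this is not an Android API.
final class QuerySketch {

    // The table is chosen by the provider (from the content address);
    // the remaining parameters come from the caller.
    static String buildSelect(String table, String[] projection,
                              String selection, String sortOrder) {
        StringBuilder sql = new StringBuilder("SELECT ");
        if (projection == null || projection.length == 0) {
            sql.append("*");
        } else {
            sql.append(String.join(", ", projection));
        }
        sql.append(" FROM ").append(table);
        if (selection != null && !selection.isEmpty()) {
            // The caller's WHERE clause is passed through as-is.
            sql.append(" WHERE ").append(selection);
        }
        if (sortOrder != null && !sortOrder.isEmpty()) {
            sql.append(" ORDER BY ").append(sortOrder);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildSelect("people",
                new String[] {"name", "number"}, "name LIKE ?", "name ASC"));
    }
}
```

In this sketch the caller has no parameter through which to ask for DISTINCT or for a second table; anything of that sort has to be special-cased by the provider. The string is also rebuilt on every call, and the values bound to the '?' placeholders are passed as strings.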

Direct access to the SQLite database would naturally make all these limitations void (IPC issues aside, for a moment). In general, putting the content provider in the accessing process's way to the data hurts both sides: the accessing process loses flexibility and efficiency, and the content provider needs to perform various manipulations and parsing actions in order to generate a proper SQL query to hand to the DB. And because the column and table names are, by convention, visible as constants, there is actually not much abstraction going on in most cases. Of course, we do gain inherent data sharing & lookup (courtesy of the ContentResolver), and we do want a content provider to validate all modifying operations, but the current API is simply a bit limiting.

On the other hand, there seems to be an implicit assumption that all data sources are indeed backed by an SQL database (specifically, by SQLite). E.g., say we're implementing a non-SQLite-backed content provider. What shall we do with the 'selection' parameter of the query() method, which is effectively a WHERE clause? As far as I can see, our options are mainly:

1. Implement the WHERE parsing & filtering logic ourselves - this would require some work just to reinvent the wheel.
2. Ignore the WHERE clause and instead perhaps give some REST-like URIs for the specific specialized queries we predict usage for - as is done, e.g., in apps-for-android. This is fine, but obviously creates a less consistent & pleasant experience for developers using our content provider.
3. Apply the query (including the WHERE clause) to a temporary memory-backed SQLite DB built for this purpose. This would allow flexible queries, but would probably be inefficient overkill on a mobile phone. Also, our data model might not fit the relational model so well.
4. Specify a different, simpler format for the 'selection' parameter. Easy to do, but inconsistent with the API and less nice for a developer using our content provider.
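
As an illustration of option 4 (and of how quickly it drifts toward option 1), here is a minimal sketch in plain Java of a simplified 'selection' format for a provider that keeps its rows in memory. Everything in it is an assumption made for the example: the class name `SimpleSelectionSketch`, rows modelled as string maps, and a format restricted to "column = ?" terms joined by AND.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter for a non-SQLite-backed provider. It accepts only
// selections of the form "col1 = ? AND col2 = ?" and rejects anything else.
final class SimpleSelectionSketch {

    private static final Pattern TERM =
            Pattern.compile("\\s*(\\w+)\\s*=\\s*\\?\\s*");

    static List<Map<String, String>> filter(List<Map<String, String>> rows,
                                            String selection,
                                            String[] selectionArgs) {
        if (selection == null || selection.trim().isEmpty()) {
            return new ArrayList<>(rows); // no filter: return every row
        }
        String[] terms = selection.split("(?i)\\s+AND\\s+");
        if (selectionArgs == null || selectionArgs.length != terms.length) {
            throw new IllegalArgumentException("one argument is required per '?'");
        }
        // Extract the column name from each "column = ?" term.
        String[] columns = new String[terms.length];
        for (int i = 0; i < terms.length; i++) {
            Matcher m = TERM.matcher(terms[i]);
            if (!m.matches()) {
                throw new IllegalArgumentException("unsupported term: " + terms[i]);
            }
            columns[i] = m.group(1);
        }
        // Keep the rows whose values equal the arguments for all columns.
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> row : rows) {
            boolean keep = true;
            for (int i = 0; i < columns.length && keep; i++) {
                keep = Objects.equals(selectionArgs[i], row.get(columns[i]));
            }
            if (keep) {
                result.add(row);
            }
        }
        return result;
    }
}
```

A caller who is used to full WHERE clauses would find that something like "author LIKE ?" is rejected here, which is exactly the inconsistency mentioned in option 4.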

The rest of the content provider API isn't particularly adequate for non-DB-backed content providers either; e.g. the whole notion of 'columns' and 'projection' isn't necessarily relevant for a document-centric data store. I think there are many cases where we wouldn't want a content provider to rely on an SQLite DB - it might rely on a remote data source, an RDF triple store, some other storage mechanism which is optimized for some specific type of data, etc.

Now, if we're building an SQLite-backed content provider, we can just hand the WHERE clause as-is to SQLite, but then a user-supplied parameter would rely on a specific table structure, negating much of the abstraction's power (if we change our data schema in the future, we'll have to perform some preprocessing before handing the query to SQLite). Additionally, our data schema is pretty much exposed anyway, as we will (by convention and for convenient usage) define constants for the columns, and we're probably just taking the user's other query parameters (projection, selection arguments & sort order) and handing them more-or-less verbatim to SQLite as well. What then do we gain by limiting the developer using our content provider to the API of the ContentResolver's query() method, except data sharing & lookup? We can provide the missing features mentioned above and still maintain the same abstraction level.

So, to summarize my view - the content providers API isn't designed in a way that makes a clear choice about the abstraction level. Truly SQL-bound content providers lose efficiency & convenience while exposing quite a bit of their data schema, and non-SQL-bound content providers must bend to fit the required API, which might result in an inconsistent usage experience for a developer accessing that content provider.

I think this can be remedied in several ways, including:

A. Offer a basic, generic content provider mechanism, where a specific content provider can implement one (or possibly more than one) of a few pre-defined interfaces - e.g. the two basic and most important interfaces would probably be one that is SQL-centric and one that is document-centric. In future versions of the platform, additional interfaces may be added - e.g. a triple-store-centric one. This option introduces the developer to additional content provider interfaces to work against, but I think it's better than having a single content provider interface that is limited and behaves differently for different content providers.
B. Enrich the content provider API in a way that allows efficient usage of SQLite, but allows non-SQLite-backed content providers to just ignore the extra methods / parameters (I don't think this is elegant, but it will work).
C. Some other creative way that someone here would suggest :-)

I hope there's some Google Android team member reading this... Does anyone have thoughts about this?

Itamar Rogel, Briox
http://www.briox.com

