Hi All,

There was a thread recently about test data for Drizzle, and while there are lots of sample data options, I was thinking about what data could actually serve the Drizzle community with valuable information. I'd like to propose we create a simple model to record client/server/instance data and volumes of Drizzle and MySQL-compatible environments.

The reason for considering this is twofold.

- First, it's extremely easy information to generate and automate; machine-generated content is far easier to scale than user-generated content.
- Second, it can provide some interesting output for Drizzle stats, e.g. what versions are used, what volume of data, some status variable usage, etc.

Let me start by saying I'm not advocating that you store your MySQL/Drizzle status variables in tables for general monitoring in a production environment.

*Logical Data Model*

A quick, high-level analysis:

Client (Id, EmailMD5, Token) - We enable an anonymous approach so people will never actually know the clients
Instance (InstanceId, ClientId, product, version, OS, serverAttributes, geoAttributes)
Status (InstanceId, Date/Time, name, value)
Variables (InstanceId, Date/Time, name, value)
Attributes (InstanceId, Date/Time, name, value) - A generic bucket for other important figures, including installed storage engines/plugins, number of schemas/tables/procs/functions/triggers, etc.
Volume (InstanceId, Date/Time, schemas, tables, total_volume, largest_table, etc.) - Some general and optional metrics of db size

There is obviously much more that can be considered, such as Server for multiple-Instance environments, historical instance changes such as version upgrades/downgrades, etc. (initially it would be more of a dumb match). The first goal is not to be perfect but to be part of continual improvement.

*Data Acquisition*

From Drizzle and MySQL 5.1 we can obtain the data via SQL statements. Before MySQL 5.1, we can obtain it via mysqladmin and load scripts. I'd like to see how we can use Gearman in some interesting way as a collection agent.

*Example SQL*

- Product/Version Counts (for graphing)
- Distribution of server uptimes
- Building summary reporting tables

*Your Input*

While I expect the design of version 1 of the tables will take only a few hours, I'd like to know if people would consider this an interesting example to pursue.

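As a rough illustration of the logical data model and the first two example queries listed above, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module purely so it runs anywhere; the column types, the snake_case names, the sample rows (including the Drizzle version string) and the helper function names are all assumptions made for illustration, not a proposed final design.

```python
import hashlib
import sqlite3

# Hypothetical schema: tables follow the logical model in this message,
# but column types and snake_case names are assumptions.
SCHEMA = """
CREATE TABLE client (
    id        INTEGER PRIMARY KEY,
    email_md5 TEXT NOT NULL,   -- only a hash is stored, to keep clients anonymous
    token     TEXT NOT NULL
);
CREATE TABLE instance (
    instance_id INTEGER PRIMARY KEY,
    client_id   INTEGER NOT NULL REFERENCES client(id),
    product     TEXT NOT NULL,
    version     TEXT NOT NULL,
    os          TEXT
);
CREATE TABLE status (
    instance_id INTEGER NOT NULL REFERENCES instance(instance_id),
    recorded_at TEXT NOT NULL,
    name        TEXT NOT NULL,
    value       TEXT
);
"""


def build_sample_db():
    """Create an in-memory database holding a few invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    email_md5 = hashlib.md5(b"user@example.com").hexdigest()
    conn.execute("INSERT INTO client VALUES (1, ?, 'tok-1')", (email_md5,))
    conn.executemany(
        "INSERT INTO instance VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "MySQL", "5.1.30", "Linux"),
            (2, 1, "MySQL", "5.1.30", "Linux"),
            (3, 1, "Drizzle", "0.1", "Linux"),  # invented version string
        ],
    )
    conn.executemany(
        "INSERT INTO status VALUES (?, ?, ?, ?)",
        [
            (1, "2009-01-01 00:00:00", "Uptime", "3600"),
            (2, "2009-01-01 00:00:00", "Uptime", "90000"),
            (3, "2009-01-01 00:00:00", "Uptime", "100000"),
        ],
    )
    conn.commit()
    return conn


def product_version_counts(conn):
    """Number of instances per product/version, e.g. as input for graphing."""
    return conn.execute(
        "SELECT product, version, COUNT(*) FROM instance "
        "GROUP BY product, version ORDER BY product, version"
    ).fetchall()


def uptime_distribution_days(conn):
    """Instances bucketed by whole days of uptime.

    Assumes a single Uptime sample per instance; with repeated samples
    you would first pick the latest row per instance.
    """
    return conn.execute(
        "SELECT CAST(value AS INTEGER) / 86400 AS days, COUNT(*) "
        "FROM status WHERE name = 'Uptime' GROUP BY days ORDER BY days"
    ).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    print(product_version_counts(db))
    print(uptime_distribution_days(db))
```

The Variables, Attributes and Volume tables would follow the same shape as Status, so they are left out here to keep the sketch short.
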
There is also opportunity for others to contribute data acquisition SQL/scripts, example output, even a UI. Given the very simple model, we can also consider what sharding of data would suit a more cloud-based solution.

Several years ago I actually started on a related product, called DBCollation.org. My goal was to build statistics about MySQL instances worldwide, so we could produce some interesting statistics/graphs etc. of MySQL usage. I think it would be great to see some actual stats of Drizzle systems on Drizzle.org. Granted, initially it may be lame in numbers/volumes and perhaps needs to be more private/internal, but it enables participation.

Regards
Ronald

