[Maria-developers] WL#143 New (by Sergei): full-text search engine plugin
---
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...: full-text search engine plugin
CREATION DATE..: Thu, 09 Sep 2010, 07:38
SUPERVISOR.: Sergei
IMPLEMENTOR:
COPIES TO..:
CATEGORY...: Server-RawIdeaBin
TASK ID: 143 (https://askmonty.org/worklog/?tid=143)
VERSION: WorkLog-4.0
STATUS.: Un-Assigned
PRIORITY...: 80
WORKED HOURS...: 0
ESTIMATE...: 320 (hours remain)
ORIG. ESTIMATE.: 320

PROGRESS NOTES:

DESCRIPTION:
A new plugin type: full-text search engine. Ideally, it will allow adding full-text search to any table, independently of the storage engine, providing fully integrated SQL syntax while still allowing a choice between different underlying FTS implementations.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
---
WorkLog (v4.0.0)
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help : https://help.launchpad.net/ListHelp
[Maria-developers] WL#144 New (by Sergei): query rewrite api
---
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...: query rewrite api
CREATION DATE..: Thu, 09 Sep 2010, 08:14
SUPERVISOR.:
IMPLEMENTOR:
COPIES TO..:
CATEGORY...: Server-RawIdeaBin
TASK ID: 144 (https://askmonty.org/worklog/?tid=144)
VERSION: WorkLog-4.0
STATUS.: Un-Assigned
PRIORITY...: 60
WORKED HOURS...: 0
ESTIMATE...: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

DESCRIPTION:
An API for query rewrites. It may be a special rewrite plugin, part of the storage engine API, or some other API. Preferably it should not force the plugin to parse or work with the SQL string; instead it may provide a DOM-like representation of the query and let the plugin manipulate the tree nodes.

Perhaps this needs the Abstract Query Tree task to be done first.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
---
WorkLog (v4.0.0)
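To make the "DOM-like representation" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Node` class, the node kinds, and the `redirect_table` rewrite are hypothetical and do not correspond to any existing MariaDB structure or API. It only shows the shape of a rewrite that edits tree nodes and never touches SQL text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One node of a hypothetical query tree (not a MariaDB structure)."""
    kind: str                      # e.g. "select", "table", "column"
    value: str = ""                # identifier or literal text, if any
    children: List["Node"] = field(default_factory=list)


def walk(node: Node, visit: Callable[[Node], None]) -> None:
    """Call visit() on node and then on all of its descendants, depth first."""
    visit(node)
    for child in node.children:
        walk(child, visit)


def redirect_table(tree: Node, old: str, new: str) -> int:
    """Example rewrite: point every reference to table `old` at `new`.

    Returns the number of nodes changed. The rewrite never sees SQL text.
    """
    changed = 0

    def visit(node: Node) -> None:
        nonlocal changed
        if node.kind == "table" and node.value == old:
            node.value = new
            changed += 1

    walk(tree, visit)
    return changed


# Hand-built tree standing in for: SELECT id FROM orders
query = Node("select", children=[
    Node("column", "id"),
    Node("table", "orders"),
])
redirect_table(query, "orders", "orders_2010")
```

In a real server the tree would come from the parser (hence the dependency on the Abstract Query Tree task); here it is built by hand.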
[Maria-developers] WL#145 New (by Sergei): user defined data types
---
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...: user defined data types
CREATION DATE..: Thu, 09 Sep 2010, 08:46
SUPERVISOR.:
IMPLEMENTOR:
COPIES TO..:
CATEGORY...: Server-RawIdeaBin
TASK ID: 145 (https://askmonty.org/worklog/?tid=145)
VERSION: WorkLog-4.0
STATUS.: Un-Assigned
PRIORITY...: 40
WORKED HOURS...: 0
ESTIMATE...: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

DESCRIPTION:
Often we get requests for new data types, like timestamps with microsecond precision or ipv4/ipv6. This could be solved with user defined types, implemented via plugins or special SQL syntax, whatever is appropriate.

ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
---
WorkLog (v4.0.0)
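As an illustration of what a plugin-provided type might need to supply, here is a small Python sketch for the ipv4 case. The `UserType` descriptor and its two callbacks are hypothetical names invented for this example; the task does not specify any interface. The only real library used is Python's standard `ipaddress` module.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UserType:
    """Hypothetical descriptor a type plugin might register with the server."""
    name: str
    to_storage: Callable[[str], bytes]    # text literal -> stored bytes
    from_storage: Callable[[bytes], str]  # stored bytes -> display text


def ipv4_to_storage(text: str) -> bytes:
    # IPv4Address raises a ValueError subclass on malformed input,
    # which is where a server would reject the literal.
    return ipaddress.IPv4Address(text).packed


def ipv4_from_storage(raw: bytes) -> str:
    # A 4-byte value is interpreted as a packed big-endian address.
    return str(ipaddress.IPv4Address(raw))


IPV4 = UserType("ipv4", ipv4_to_storage, ipv4_from_storage)
```

Because the packed form is big-endian, plain byte comparison of stored values orders addresses numerically, so an index would not need a type-specific comparator in this particular case.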
[Maria-developers] Phone home
Hi.

So, Phone Home or MySQL feedback daemon or better name wanted feature.

It is something that can be installed together with MariaDB. It will gather various statistics about how MariaDB is used and will send this information anonymously to mariadb.org. Not unlike the Uptimes Project or Debian Popularity Contest.

The complete specs will be here: http://askmonty.org/worklog/Server-Sprint/?tid=12

There are basically four questions I'm thinking about.

1. Should that be a MariaDB plugin or a separate executable ?

I tend to prefer a separate executable. There is no need to keep it in memory constantly - a cron job will do. Being separate, its bugs won't affect the server. Being separate, one instance can monitor many MariaDB servers. It can be upgraded separately - and it's not tied to the server release schedule.

The drawback - it won't be able to grab MariaDB internals easily, which means it may not report some data that are worth reporting. But to solve this we can add an I_S table that provides this information. This way there's no hidden data to report, everything is available from SQL. Which is good :)

2. How to send the data.

We'll use HTTP. It seems to be the most universally working transport. That's what other projects are using too - Uptimes Project uses UDP or HTTP, Debian Popularity Contest - SMTP or HTTP. We *may* want to add SMTP later, if needed.

3. Auditing.

How can we prove to paranoid users that we only send what we say we send, and no potentially private information?

Possible solution: HTTP sending should support a proxy (to work behind firewalls), so one can install a logging proxy and record all the data sent. On the other hand, we'd like to use SSL too. We can support, besides direct HTTP, a wget mode where the data are sent by invoking wget (which supports proxies, SSL and --post-file), and one could easily replace wget with a simple script that logs all the data.

4. What to report.

That's the most interesting part :) Note that not everything from the list below is collected in MariaDB now, but I describe the ideal case: what would be useful to know to steer MariaDB development in the right direction. The principle I used was not "let's grab as much as we can" but "on a need-to-know basis". For example, we may need to decide whether to optimize huge IN (...) lists or GIS first. Knowing which is used more often would help to make a correct decision.

* hardware: CPU, RAM
* OS (linux distribution, kernel)
* mariadb version, memory usage
* parts of config (e.g. buffer sizes)
* list of installed plugins
* number of databases, max/avg number of tables in a database, max/avg db/table size
* uptime
* something that indicates the load, e.g. average qps
* how much a particular feature is used:
  * Com_ counters from SHOW STATUS
  * plugin usage counters
  * per feature, like GIS, replication, etc.
  * per query parts, like ORDER BY, subquery in the FROM, IN subquery ...
* how useful is query cache (hit ratio?)

What else ?

Regards,
Sergei
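The two sending modes and the need-to-know principle described above can be sketched as follows, in Python. This is an illustration under stated assumptions, not the design from the spec: the endpoint URL, the JSON report format, and the whitelisted key names are all hypothetical (the thread only says the data goes to mariadb.org over HTTP). The wget options used (`--quiet`, `--output-document`, `--post-file`) are real wget options.

```python
import json
import subprocess
import urllib.request
from typing import Dict, List, Optional

# Hypothetical collection endpoint; not specified anywhere in the thread.
REPORT_URL = "http://mariadb.org/feedback"

# Need-to-know whitelist (hypothetical key names): nothing else is reported.
ALLOWED_KEYS = ("version", "uptime", "questions", "com_select", "qcache_hits")


def build_report(raw: Dict[str, str]) -> bytes:
    """Keep whitelisted keys only and add a derived average qps."""
    report: Dict[str, object] = {k: raw[k] for k in ALLOWED_KEYS if k in raw}
    uptime = int(raw.get("uptime", 0))
    if uptime > 0 and "questions" in raw:
        report["avg_qps"] = round(int(raw["questions"]) / uptime, 2)
    return json.dumps(report, sort_keys=True).encode("utf-8")


def send_http(body: bytes, proxy: Optional[str] = None) -> int:
    """Direct mode: POST the report, optionally through a (logging) proxy."""
    handlers = [urllib.request.ProxyHandler({"http": proxy})] if proxy else []
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(
        REPORT_URL, data=body, headers={"Content-Type": "application/json"})
    with opener.open(request, timeout=30) as response:
        return response.status


def wget_command(report_path: str, wget: str = "wget") -> List[str]:
    """wget mode: argv that posts a report file.

    `wget` can be replaced by the path of a wrapper script that logs the file.
    """
    return [wget, "--quiet", "--output-document=-",
            "--post-file=" + report_path, REPORT_URL]


def send_wget(report_path: str, wget: str = "wget") -> int:
    """Run the wget command and return its exit status."""
    return subprocess.run(wget_command(report_path, wget)).returncode
```

Keeping the whitelist in one place means an auditor can compare the logged report against a short, fixed list of keys.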
Re: [Maria-developers] Phone home
> So, Phone Home or MySQL feedback daemon or better name wanted feature.

Maybe call it Butler ??? Just a thought...

> Not unlike the Uptimes Project or Debian Popularity Contest.

Opt-in only with an easy disable option after opting in... correct?

> The complete specs will be here: http://askmonty.org/worklog/Server-Sprint/?tid=12

I imagine the following ...

  (optionally by user) geographic location
  (optionally by user) user information / company name
  (optionally by user) Monty Program Ab customer support contract id

won't be shown to everyone, correct? So maybe a filtered public versus unfiltered private view?

> 1. Should that be a MariaDB plugin or a separate executable ?

A separate executable would probably be the best for the reasons you highlight in your first paragraph. The drawbacks are probably covered by the fact that 1) if a user is having that awful of a time, they are probably able to step through the executing code or 2) the user probably has a support contract with a company that can step through the code and debug the problem.

Granted more in depth statistics would be useful, but maybe it would make sense to have a separate project to create a loadable module that would be more invasive. This tool seems to be oriented towards usage and usage related data, not necessarily troubleshooting/fixing.

> 2. How to send the data.

I imagine if the code is generated with this in mind it should be easy to switch out the transport (read transmission method) layer at a later time. Unless the person coding it really ties the data formatting and submission process to the protocol.

> 3. Auditing.

I think the proxy idea, as well as the wget mode are great ideas. If the user isn't paranoid and doesn't want to sniff traffic one could also provide a log of all activities and a separate log for all messages.

> 4. What to report.
> hardware: CPU, RAM

maybe disk speeds? and type? (SATA vs SAS vs IDE)

> OS (linux distribution, kernel)

any libraries?

> number of databases, max/avg number of tables in a database,

the slightly insane might also run multiple instances on a single machine, so what about checking for other installations?

Just a few thoughts, hopefully they're not distracting or useless.

-Adam
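The point about keeping data formatting separate from the transmission method, together with the suggestion of a separate log for all messages, could look roughly like this in Python. All names here (`Transport`, `MemoryTransport`, `LoggingTransport`, `submit`) are hypothetical and chosen only for the sketch; a real HTTP or SMTP transport would implement the same one-method interface.

```python
from typing import List, Protocol


class Transport(Protocol):
    """Anything that can deliver one report: HTTP, SMTP, ... (hypothetical)."""
    def send(self, body: bytes) -> None: ...


class MemoryTransport:
    """Stand-in transport that just remembers what it was asked to send."""
    def __init__(self) -> None:
        self.sent: List[bytes] = []

    def send(self, body: bytes) -> None:
        self.sent.append(body)


class LoggingTransport:
    """Wraps another transport and records every message before sending it."""
    def __init__(self, inner: Transport, log: List[str]) -> None:
        self.inner = inner
        self.log = log

    def send(self, body: bytes) -> None:
        self.log.append(body.decode("utf-8", errors="replace"))
        self.inner.send(body)


def submit(report: str, transport: Transport) -> None:
    """Formatting happens here; delivery is left entirely to the transport."""
    transport.send(report.encode("utf-8"))


message_log: List[str] = []
backend = MemoryTransport()
submit("uptime=100", LoggingTransport(backend, message_log))
```

Since `submit` only depends on the `send` method, switching the transport later means writing one new class and changing nothing in the reporting code.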