Re: [Maria-developers] Phone home

2010-09-10 Thread Sergei Golubchik
Hi, Adam!

On Sep 09, Adam M. Dutko wrote:
  So, the "Phone Home" or "MySQL feedback daemon" (better name wanted)
  feature.
 
 Maybe call it Butler  ??? Just a thought...

:)
Why?
 
  Not unlike the Uptimes Project or Debian Popularity Contest.

 Opt-in only with an easy disable option after opting in... correct?

Of course.

Sorry, I didn't make it clear enough - the first email was only about
open questions, the unclear points in this task. Whether it should be
opt-in is not one of them :)
 
  The complete specs will be here:
  http://askmonty.org/worklog/Server-Sprint/?tid=12
 
 I imagine the following ...
 
   (optionally by user) geographic location
   (optionally by user) user information / company name
   (optionally by user) Monty Program Ab customer support contract id
 
 won't be shown to everyone, correct?  So maybe a filtered public
 versus unfiltered private view?

Of course.
 
  1. Should that be a MariaDB plugin or a separate executable ?
 
 A separate executable would probably be the best for the reasons you
 highlight in your first paragraph.  The drawbacks are probably covered by
 the fact that 1) if a user is having that awful of a time, they are probably
 able to step through the executing code or 2) the user probably has a
 support contract with a company that can step through the code and debug the
 problem.  Granted more in depth statistics would be useful, but maybe it
 would make sense to have a separate project to create a loadable module that
 would be more invasive.  This tool seems to be oriented towards usage and
 usage related data, not necessarily troubleshooting/fixing.

Right.
 
  2. How to send the data.
 
 I imagine if the code is generated with this in mind it should be easy
 to switch out the transport (read: transmission method) layer at a
 later time.  Unless the person coding it really ties the data
 formatting and submission process to the protocol.

Right.
 
 3. Auditing.
 
 I think the proxy idea, as well as the wget mode are great ideas.
 If the user isn't paranoid and doesn't want to sniff traffic one
 could also provide a log of all activities and a separate log for all
 messages.

Yes. I was trying to find something convincing for paranoid users (like
me :). Normal users can just look in the log.
 
  4. What to report.
 
   hardware: CPU, RAM
 
 maybe disk speeds? and type?  (SATA vs SAS vs IDE)

Good idea. Indeed, it's important.
And to know if it's SSD or not.
 
   OS (linux distribution, kernel)
 
 any libraries?

I don't know.
As you said it's not to troubleshoot, it's to steer development.

I don't know if we may want to optimize for a specific version of a
specific library. And if yes - for what library?
 
   number of databases, max/avg number of tables in a database,
 
 the slightly insane might also run multiple instances on a single
 machine, so what about checking for other installations?

Right.
 
 Just a few thoughts, hopefully they're not distracting or useless.

Not at all!
Thanks for sharing them.

Regards,
Sergei

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Phone home

2010-09-10 Thread Adam M. Dutko

 :)
 Why?


When I think of a Butler I think of someone who monitors various aspects of
a household/estate, stashes that information and uses it to improve service.


 Good idea. Indeed, it's important.
 And to know if it's SSD or not.


Last night I was also thinking about network configuration.  It might be
good to know if people are using the database over the network more often
than as a standalone server with bind-address 127.0.0.1.  It might also be good to know
the distribution of NIC speeds (10/100/1000/1) as it might help when
determining where to focus development efforts.  That is, if a ton of people
are using 10Mbps (unlikely) maybe it might be useful to look at improving
compression or other data related parts?

I don't know if we may want to optimize for a specific version of a
 specific library. And if yes - for what library?


I imagine the MariaDB version will determine what libraries people have
installed because of various dependencies, but it might be useful to collect
that information as well, or whether they're running custom C libraries
versus stock ones, etc., because this might point out problem areas for
high-end users.  I'm not familiar enough with the code base to know which
ones MariaDB might want to monitor, I just thought it might be useful to
think on it some more...


[Maria-developers] Phone home

2010-09-09 Thread Sergei Golubchik
Hi.

So, the "Phone Home" or "MySQL feedback daemon" (better name wanted)
feature.
It is something that can be installed together with MariaDB; it will
gather various statistics about how MariaDB is used and send this
information anonymously to mariadb.org.

Not unlike the Uptimes Project or Debian Popularity Contest.

The complete specs will be here:
http://askmonty.org/worklog/Server-Sprint/?tid=12

There are basically four questions I'm thinking about.

1. Should that be a MariaDB plugin or a separate executable ?

I tend to prefer a separate executable. There is no need to keep it in
memory constantly - a cron job will do. Being separate, its bugs won't
affect the server. Being separate, one instance can monitor many MariaDB
servers. It can be upgraded separately - and it's not tied to the server
release schedule.

The drawback - it won't be able to grab MariaDB internals easily, which
means it may not report some data that are worth reporting. But to solve
this we can add an I_S (INFORMATION_SCHEMA) table that provides this
information. This way there's no hidden data to report, everything is
available from SQL. Which is good :)

2. How to send the data.

We'll use HTTP. Seems to be the most universally working transport.
That's what other projects are using too - Uptimes Project uses UDP or
HTTP, Debian Popularity Contest - SMTP or HTTP.

We *may* want to add SMTP later, if needed.
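
A minimal sketch of keeping the report format separate from the
transport, so that HTTP could later be swapped for SMTP or anything
else. The JSON encoding, the function names and any endpoint URL are
assumptions made for illustration; none of them come from the spec.

```python
import json
import urllib.request


def format_report(report):
    """Serialize the report once, independent of how it is sent."""
    return json.dumps(report, sort_keys=True).encode("utf-8")


def http_transport(url):
    """Return a sender that POSTs the payload to url (hypothetical endpoint)."""
    def send(payload):
        req = urllib.request.Request(
            url,
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status
    return send


def collecting_transport(sink):
    """Return a sender that only appends the payload to a list (audit/tests)."""
    def send(payload):
        sink.append(payload)
        return len(payload)
    return send


def submit(report, send):
    """Format the report and hand the bytes to whichever transport was chosen."""
    return send(format_report(report))
```

Adding SMTP later would then mean writing one more function that
returns a send(payload) callable; format_report() and submit() would
stay untouched.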

3. Auditing.

How can we prove to paranoid users that we only send what we say we
send, and no potentially private information?

Possible solution:

  HTTP sending should support a proxy (to work behind firewalls), so one
  can install a logging proxy and record all the data sent. On the other
  hand, we'd like to use SSL too, which would hide the data from such a
  proxy.

  We can support, besides direct http, a wget mode where the data are
  sent by invoking wget (which supports proxies, SSL and --post-file)
  and one could easily replace wget with a simple script that logs
  all the data.
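
A rough sketch of such a wget replacement. It assumes the feedback tool
invokes wget with --post-file=FILE plus a URL; the function name and the
log file name are made up for the example, and nothing is sent anywhere.

```python
import sys
from pathlib import Path


def log_post(argv, log_path):
    """Record what a wget-style command line would have sent; send nothing."""
    post_file = None
    url = None
    for arg in argv:
        if arg.startswith("--post-file="):
            post_file = arg.split("=", 1)[1]
        elif not arg.startswith("-"):
            # The first non-option argument is taken to be the target URL.
            url = url or arg
    if post_file is None:
        raise ValueError("no --post-file=... argument given")
    payload = Path(post_file).read_text(encoding="utf-8", errors="replace")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"POST {url}\n{payload}\n---\n")
    return payload


if __name__ == "__main__" and len(sys.argv) > 1:
    # Installed in place of wget, this appends every report to a local log.
    log_post(sys.argv[1:], "feedback-audit.log")
```

A user could point the feedback tool at this script instead of the real
wget, read the log, and only then decide whether to let reports through.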

4. What to report.

That's the most interesting part :)

Note that not everything below is collected in MariaDB now; I describe
the ideal case: what would be useful to know to steer MariaDB
development in the right direction.

The principle I used was not "let's grab as much as we can" but "on a
need-to-know basis". For example, we may need to decide whether to
optimize huge IN (...) lists or GIS first. Knowing what is used more
often would help to make a correct decision.

 hardware: CPU, RAM
 OS (linux distribution, kernel)
 mariadb version, memory usage
 parts of config (e.g. buffer sizes)
 list of installed plugins
 number of databases, max/avg number of tables in a database,
 max/avg db/table size
 uptime
 something that indicates the load, e.g. average qps
 how much a particular feature is used:
   Com_ counters from SHOW STATUS
   plugin usage counters
   per feature, like GIS, replication, etc.
   per query parts, like ORDER BY, subquery in the FROM, IN subquery ...
   how useful is query cache (hit ratio?)
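
To make a few of these items concrete, a small sketch of how some of the
numbers could be derived from data the server already exposes. It
assumes the caller has fetched SHOW GLOBAL STATUS into a dict and
(schema, table) pairs from information_schema.TABLES; the function name
and the report keys are hypothetical. The hit ratio is computed as
Qcache_hits / (Qcache_hits + Com_select), on the understanding that
Com_select is not incremented for queries served from the cache.

```python
from collections import Counter


def summarize(status, tables):
    """Build a small usage report.

    status: dict of SHOW GLOBAL STATUS names to values (strings or ints).
    tables: iterable of (schema_name, table_name) pairs.
    """
    uptime = int(status.get("Uptime", 0))
    questions = int(status.get("Questions", 0))
    hits = int(status.get("Qcache_hits", 0))
    selects = int(status.get("Com_select", 0))

    # Number of tables per database, keyed by schema name.
    per_db = Counter(schema for schema, _table in tables)
    counts = list(per_db.values())

    return {
        "uptime_seconds": uptime,
        "avg_qps": questions / uptime if uptime else 0.0,
        "qcache_hit_ratio": hits / (hits + selects) if hits + selects else 0.0,
        "databases": len(per_db),
        "max_tables_per_db": max(counts, default=0),
        "avg_tables_per_db": sum(counts) / len(counts) if counts else 0.0,
        # All Com_ counters are passed through unchanged as integers.
        "com_counters": {
            k: int(v) for k, v in status.items() if k.startswith("Com_")
        },
    }
```

Only aggregates leave the function: database and table names are counted
but never included in the report.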

What else ?

Regards,
Sergei



Re: [Maria-developers] Phone home

2010-09-09 Thread Adam M. Dutko
 So, the "Phone Home" or "MySQL feedback daemon" (better name wanted)
 feature.


Maybe call it Butler  ??? Just a thought...

Not unlike the Uptimes Project or Debian Popularity Contest.


Opt-in only with an easy disable option after opting in... correct?


 The complete specs will be here:
 http://askmonty.org/worklog/Server-Sprint/?tid=12


I imagine the following ...

  (optionally by user) geographic location
  (optionally by user) user information / company name
  (optionally by user) Monty Program Ab customer support contract id

won't be shown to everyone, correct?  So maybe a filtered public versus
unfiltered private view?


 1. Should that be a MariaDB plugin or a separate executable ?


A separate executable would probably be the best for the reasons you
highlight in your first paragraph.  The drawbacks are probably covered by
the fact that 1) if a user is having that awful of a time, they are probably
able to step through the executing code or 2) the user probably has a
support contract with a company that can step through the code and debug the
problem.  Granted more in depth statistics would be useful, but maybe it
would make sense to have a separate project to create a loadable module that
would be more invasive.  This tool seems to be oriented towards usage and
usage related data, not necessarily troubleshooting/fixing.


 2. How to send the data.


I imagine if the code is generated with this in mind it should be easy to
switch out the transport (read: transmission method) layer at a later
time.  Unless the person coding it really ties the data formatting and
submission process to the protocol.

3. Auditing.


I think the proxy idea, as well as the wget mode are great ideas.  If the
user isn't paranoid and doesn't want to sniff traffic one could also
provide a log of all activities and a separate log for all messages.


 4. What to report.

  hardware: CPU, RAM


maybe disk speeds? and type?  (SATA vs SAS vs IDE)


  OS (linux distribution, kernel)


any libraries?


  number of databases, max/avg number of tables in a database,


the slightly insane might also run multiple instances on a single machine,
so what about checking for other installations?



Just a few thoughts, hopefully they're not distracting or useless.

-Adam