I'm trying to create a rather (what I consider to be) complex analysis query.
It should use data from three tables that contain measurements and write the
results back to a fourth table.
There are three measurement tables: metrica, metricb, and metricc. There's one
table which contains a summary of the results of the analysis: zone
Here are the table descriptions:
+----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+----------------+
| zone | int(10) unsigned | NO | PRI | NULL | auto_increment |
| zonename | varchar(45) | YES | | NULL | |
| metrica | double | NO | | 0 | |
| metricb | double | NO | | 0 | |
| metricc | double | NO | | 0 | |
+----------+------------------+------+-----+---------+----------------+
All of the metric tables follow the same format. (maybe I should group them
into one table?)
+-----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+----------------+
| metric_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| zone | int(10) unsigned | NO | PRI | NULL | |
| metric | int(11) | NO | | 0 | |
+-----------+------------------+------+-----+---------+----------------+
There is a "test zone" with zone = 0. I want to compare this zone to all the
others based on the three metrics and store the similarity factor to the metric
column for that zone in the zone table.
To produce the similarity factor for a given metric in an arbitrary zone, n, I
want to count all metric_id's where the metric is posititve in both zone n and
zone 0 or negative in both. I'll take this sum and divide it by the count of
all metric_id's with non-zero metrics for zone 0. I want to store the result of
this calculation to zone n's entry in the zone table.
Ideally, I'd like to do this all in a large query.
I would imagine that the query would look something like this, but my count
syntax is completely wrong and I know that I'm missing some join and grouping
syntax:
INSERT INTO zone SELECT COUNT(
SELECT a.metric_id FROM metrica AS a JOIN metrica AS b ON metric_id WHERE
a.zone=0 AND b.zone<>0 GROUP BY metric_id HAVING (a.metric>0 AND b.metric>0) OR
(a.metric<0 AND b.metric<0)
) / COUNT(SELECT a.metric_id FROM metrica WHERE metric<>0),
{{repeats similar syntax for remaining columns}} ON DUPLICATE KEY UPDATE
metrica=VALUES(metrica), etc...
How would I go about implementing this query so that it actually works?
Thanks,
Brendan