> > > CREATE TABLE breeding (
> > > rec SMALLINT(4) UNSIGNED AUTO_INCREMENT,
> > > sire SMALLINT(4) UNSIGNED,
> > > dam SMALLINT(4) UNSIGNED,
> > > paired DATE,
> > > split DATE,
> > > num_offspring TINYINT(1) ZEROFILL,
> > > PRIMARY KEY (rec)
> > > );
> > >
> > > First, we define if a pair was breeding for a given date by seeing if that
> > > date lies between the paired date and the split date. In mysql:
> > > SELECT FROM breeding WHERE TO_DAYS(paired) <= TO_DAYS('$given_date') AND
> > > TO_DAYS('$given_date') <= TO_DAYS(split);
=This seems part of the problem (as well as part of the solution). Procedural logic
suggests that we should
iterate through the days of the month, taking note of the facts, and then perform the
statistical calculation at
the end. Relational logic doesn't work like this, it requires us to process through
the data in the table(s).
This table is not organised by date, indeed it notes periods of time, which are
notoriously problematic to sort
into a 'date order'! However all is not lost...
> > There is no need to use TO_DAYS() because paired and split are both
> > dates so you can operate on them directly. I recommend that you take a
> > look at the manual's section on Time and Date Functions.
> You are right. I works without the TO_DAYS(). But I did get them from the
> manual :-)
=thanks for the note. It's difficult to know whether to risk insulting someone by
=ironically I think you're going to end up right back there, but other functions of
interest may include Month,
MonthName, Extract, Date_Sub, and even Timestamp.
=general advice would be to define the dates in PHP (with MySQL in mind), but keep all
the date processing in
MySQL (if at all possible).
> > > Where it gets messy is this has to be put into a loop, between the given
> > > start and end dates (year and month only). For each day in the loop, I
> > > have to do a query and count the number returned to php. Then average that
> > > figure for the month.
> > So if I understand correctly, you are thinking in terms of a 'result
> > table' which would (ideally) have the days 1-28/29/30/31 in the
> > left-hand column, and the number of pairs who were put together on
> > that day (paired <= day/date <= split). Then at the bottom you want to
> > sum the right-hand column and divide it by the number of days in the
> > month to give some sort of average mating possibilities per day. Does
> > that sound right?
> Basically for statistical reasons I need average pairs per month, just a
> single number. But the months/years to be queried on would be input via a
> form. I was thinking of generating the data on-the-fly, but as you suggest
> above and from something Frank Flynn suggested, I would be better off
> having a separate table with monthly results. They could all be
> pre-calculated except the current month.
=I noted Frank's suggestion. Obviously no one wants to duplicate 'the calendar'
unnecessarily, but by having a
calendar table, it does open up the possibility of processing through that table (from
start-date to end-date)
and using relational logic to extract (date) 'related' rows from the breeding tbl. In
many ways, when I have
finished, you will recognise that there is some attraction to that approach!
=You may have missed one of the points that I was making: what is the math behind this
calculation? I'm going to
set up some sample data/an approach, but only talk about a 'week' rather than a
'month' for brevity/ease of
comprehension... So if we have breeding pairs A through E, the breeding tbl
(representative) looks like:
sire dam paired split
Am Af Mon Fri
Bm Bf Tue Thu
Cm Cf Tue Fri
Dm Df history Sun
Em Ef Sat future
where m=male, f=female, and we're only interested in the period Mon-Fri, so rows D and
E are included only to
show exclusion (yes, that last clause does make logical sense...I think - they're
included only so that they can
=Our first thought is to think in terms of say "Monday", and to scan the breeding tbl
to count how many pairs
were breeding that day, and make a note of the count; then move to "Tuesday" and do
the same; finally summing
=Thus 'temporal logic' (working one day at a time) says:
Mon = 1 pair
Tue = 3 pairs
Wed = 3 pairs
Thu = 3 pairs
Fri = 2 pairs
Sum = 12
nDays = 5
Avg = 2.5
=Now, crucial question: have I understood the logic of the computation correctly? I
shall blythly carry on
=Ok, so now we're in a quandry because constructing a SQL statement to look at Monday,
the Tuesday, ... is not
possible - enter Frank (trumpet calls offstage, left).
=However there's more than one way to skin a cat, and according to my (admittedly not
fantastic) math ability,
we could arrive at the same numbers by coming at things the other way around! Let's
count the number of days for
which a pair were breeding:
A = 5
B = 3
C = 4
D = ?
E = ?
Sum = 12
nDays = 5
Avg = 2.5
Proceed directly to Go!
=Let's ignore the ? rows for a minute (they're not relevant/inside the Mon-Fri period
Using SUM() from SQL, getting the number of days in the month from Frank or PHP, and a
function we/you will
define later to count days in the period, this is what the SQL statistical calculation
might look like:
SUM( nrDays( split - paired + 1 ) ) / nDays
=but wait a minute, there's no reference to the month (from which we have already used
nDays)!!! Quite right,
and here's where we start to build a headache...
=The split/paired dates may be completely within the required period, completely
outside of it, or may straddle
the period. I've assumed the first, and asked you to ignore the second, so now let's
go straight to the 'great
kahuna' - the only days we are interested in counting are those within the period.
Please let me introduce:
SOM = start of month date, ie 1st of month.
EOM = end of month date (hah! what's this? 28/29/30/31st of month!?)
=If SOM < paired, then we only count from the beginning of the breeding period, ie
paired. However if the
breeding period began prior to SOM then we only count from the first of the month. In
IF( paired < SOM , paired, SOM )
=The relevant period is from the start of the month, or from the beginning of the
pairing, whichever came later
(oh boy, it sounds like a tax calculation form doesn't it?) The same (logic) applies
to the relationship between
EOM and split. Which means that the statistical calculation now looks like:
IF( split > EOM , split, EOM ) -
IF( paired < SOM , paired, SOM ) + 1
) / nDays
=Now let's step back to those three 'period' considerations above. This complicated
SUM() takes care of
situations one and three, but what about situation two (breeding period is irrelevant
to the calculation)? You
had already solved that - SQL uses the WHERE clause to include/exclude relevant rows:
WHERE paired <= start_date AND end_date <= split
=BTW in the case of the period being a calendar month, and further to my comments
about the last day of the
month being anything from the 28th through the 31st, it might be easier to calculate
the first of the next month
and change the last condition to be:
end_date(plus one, ie first of month+1) < split
=If you've followed all that, you're doing well (and exercising considerable
patience). I don't have your data
(which I'll ask you to send privately - NOT to the list) if we have to play around
with this problem much more).
So the time has come for you to 'give it a go, Joe', and make sure that my theory can
actually be applied to
your practical situation. You will find it easier to work in native SQL or use a
'management package', and
substitute suitable values for my 'convenient' variables. Does it work?
=BUT this only works for a single nominated month! I got the impression that you
wanted more than that.
Unfortunately (and this applies to Frank's suggestion too) your data is not actually
'organised' by dates: it is
in fact two dates representing a period. Logically breeding periods and months are not
identical. In order to
repeat an SQL statistical function on groups of data, eg Jan, then Feb; the two groups
need to be mutually
exclusive (AFAIK), eg return a count of all the males, and then all the females, in
the table. Breeding pairs
for Jan, and breeding pairs for Feb, are not mutually exclusive if (using my example
periods from above) Mon =
28Jan and Fri = 1Feb!
> > > Normally start and end dates would be input via a form.
> > I don't understand, at first you talked of "$given_date and now we
> > have two dates! Are you inputting a month - asking for the average for
> > January for example. Or are you putting in two dates representing a
> > period of arbitrary length and asking for an average of that, eg 42
> > days? Please clarify.
> Input is two dates, granularity one month. For example start Jan 2000,
> finish Dec 2001. Then for each month in that period, I want the avarage
> for each month. For example:
> 2000 January 56.2
> 2000 February 34.4
> 2001 December 48.6
=Accordingly you might need to use PHP to break a wider calculation into individual
months, or to store
historical monthly data (if it is called for frequently enough to be worth the
hassle/storage space - or the
volume of data is significant (you mention other stats, below)) cf the response time
to repeatedly call this SQL
query. [perhaps a wiser mind than mine might come up with a solution to this!?]
> There will be a whole lot of other statistics as well, such as number of
> eggs laid, number hatched, number of surviving chicks etc. But I can
> calculate _them_ quite easily.
=I've also left you with the task of counting nrDays (the number of relevant days
within the breeding period),
and working our nDays (the number of days in the month) for the 'average' calculation.
These will depend upon
your preferences. We're back to the TO_DAYS() idea or even using Timestamp data types.
Also don't forget what I
said about the 'add one month and subtract one day' idea above which might fit with
the DATE_SUB() - different
minds prefer to approach this sort of logic in different ways, so I'll leave that to
your preference - you
already think my mind is warped, without me giving you yet more good reason to confirm!
> > Trouble is it's after midnight here. Apologies. Please would you
> > respond to the questions above, and if no one else steps in meantime,
> > I'll endeavor to return with a more lucid analysis in the morning...
> Sorry to keep you awake :-)
=I had replied during the morning yesterday (Z-time), as today. It wasn't the "keeping
awake", no one was going
to achieve that, it was me trying to get the logical cogs into the right order -
without stripping too many
gears in the process... Re-reading, I see that I had the ambition of being "lucid". I
trust that at least this
=Let me know how you get on,
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]