Garry, > > > CREATE TABLE breeding ( > > > rec SMALLINT(4) UNSIGNED AUTO_INCREMENT, > > > sire SMALLINT(4) UNSIGNED, > > > dam SMALLINT(4) UNSIGNED, > > > paired DATE, > > > split DATE, > > > num_offspring TINYINT(1) ZEROFILL, > > > PRIMARY KEY (rec) > > > ); > > > > > > First, we define if a pair was breeding for a given date by seeing if that > > > date lies between the paired date and the split date. In mysql: > > > SELECT FROM breeding WHERE TO_DAYS(paired) <= TO_DAYS('$given_date') AND > > > TO_DAYS('$given_date') <= TO_DAYS(split);

=This seems part of the problem (as well as part of the solution). Procedural logic suggests that we should iterate through the days of the month, taking note of the facts, and then perform the statistical calculation at the end. Relational logic doesn't work like this, it requires us to process through the data in the table(s). This table is not organised by date, indeed it notes periods of time, which are notoriously problematic to sort into a 'date order'! However all is not lost... > > There is no need to use TO_DAYS() because paired and split are both > > dates so you can operate on them directly. I recommend that you take a > > look at the manual's section on Time and Date Functions. > You are right. I works without the TO_DAYS(). But I did get them from the > manual :-) =thanks for the note. It's difficult to know whether to risk insulting someone by saying "RTFM"! =ironically I think you're going to end up right back there, but other functions of interest may include Month, MonthName, Extract, Date_Sub, and even Timestamp. =general advice would be to define the dates in PHP (with MySQL in mind), but keep all the date processing in MySQL (if at all possible). > > > Where it gets messy is this has to be put into a loop, between the given > > > start and end dates (year and month only). For each day in the loop, I > > > have to do a query and count the number returned to php. Then average that > > > figure for the month. > > So if I understand correctly, you are thinking in terms of a 'result > > table' which would (ideally) have the days 1-28/29/30/31 in the > > left-hand column, and the number of pairs who were put together on > > that day (paired <= day/date <= split). Then at the bottom you want to > > sum the right-hand column and divide it by the number of days in the > > month to give some sort of average mating possibilities per day. Does > > that sound right? > Basically for statistical reasons I need average pairs per month, just a > single number. But the months/years to be queried on would be input via a > form. I was thinking of generating the data on-the-fly, but as you suggest > above and from something Frank Flynn suggested, I would be better off > having a separate table with monthly results. They could all be > pre-calculated except the current month. =I noted Frank's suggestion. Obviously no one wants to duplicate 'the calendar' unnecessarily, but by having a calendar table, it does open up the possibility of processing through that table (from start-date to end-date) and using relational logic to extract (date) 'related' rows from the breeding tbl. In many ways, when I have finished, you will recognise that there is some attraction to that approach! =You may have missed one of the points that I was making: what is the math behind this calculation? I'm going to set up some sample data/an approach, but only talk about a 'week' rather than a 'month' for brevity/ease of comprehension... So if we have breeding pairs A through E, the breeding tbl (representative) looks like: sire dam paired split Am Af Mon Fri Bm Bf Tue Thu Cm Cf Tue Fri Dm Df history Sun Em Ef Sat future where m=male, f=female, and we're only interested in the period Mon-Fri, so rows D and E are included only to show exclusion (yes, that last clause does make logical sense...I think - they're included only so that they can be excluded!) =Our first thought is to think in terms of say "Monday", and to scan the breeding tbl to count how many pairs were breeding that day, and make a note of the count; then move to "Tuesday" and do the same; finally summing and averaging. =Thus 'temporal logic' (working one day at a time) says: Mon = 1 pair Tue = 3 pairs Wed = 3 pairs Thu = 3 pairs Fri = 2 pairs Sum = 12 nDays = 5 Avg = 2.5 =Now, crucial question: have I understood the logic of the computation correctly? I shall blythly carry on regardless... =Ok, so now we're in a quandry because constructing a SQL statement to look at Monday, the Tuesday, ... is not possible - enter Frank (trumpet calls offstage, left). =However there's more than one way to skin a cat, and according to my (admittedly not fantastic) math ability, we could arrive at the same numbers by coming at things the other way around! Let's count the number of days for which a pair were breeding: A = 5 B = 3 C = 4 D = ? E = ? Sum = 12 nDays = 5 Avg = 2.5 Proceed directly to Go! =Let's ignore the ? rows for a minute (they're not relevant/inside the Mon-Fri period under consideration). Using SUM() from SQL, getting the number of days in the month from Frank or PHP, and a function we/you will define later to count days in the period, this is what the SQL statistical calculation might look like: SUM( nrDays( split - paired + 1 ) ) / nDays =but wait a minute, there's no reference to the month (from which we have already used nDays)!!! Quite right, and here's where we start to build a headache... =The split/paired dates may be completely within the required period, completely outside of it, or may straddle the period. I've assumed the first, and asked you to ignore the second, so now let's go straight to the 'great kahuna' - the only days we are interested in counting are those within the period. Please let me introduce: SOM = start of month date, ie 1st of month. EOM = end of month date (hah! what's this? 28/29/30/31st of month!?) =If SOM < paired, then we only count from the beginning of the breeding period, ie paired. However if the breeding period began prior to SOM then we only count from the first of the month. In SQL: IF( paired < SOM , paired, SOM ) =The relevant period is from the start of the month, or from the beginning of the pairing, whichever came later (oh boy, it sounds like a tax calculation form doesn't it?) The same (logic) applies to the relationship between EOM and split. Which means that the statistical calculation now looks like: SUM( nrDays( IF( split > EOM , split, EOM ) - IF( paired < SOM , paired, SOM ) + 1 ) ) / nDays =Now let's step back to those three 'period' considerations above. This complicated SUM() takes care of situations one and three, but what about situation two (breeding period is irrelevant to the calculation)? You had already solved that - SQL uses the WHERE clause to include/exclude relevant rows: WHERE paired <= start_date AND end_date <= split =BTW in the case of the period being a calendar month, and further to my comments about the last day of the month being anything from the 28th through the 31st, it might be easier to calculate the first of the next month and change the last condition to be: end_date(plus one, ie first of month+1) < split =If you've followed all that, you're doing well (and exercising considerable patience). I don't have your data (which I'll ask you to send privately - NOT to the list) if we have to play around with this problem much more). So the time has come for you to 'give it a go, Joe', and make sure that my theory can actually be applied to your practical situation. You will find it easier to work in native SQL or use a 'management package', and substitute suitable values for my 'convenient' variables. Does it work? =BUT this only works for a single nominated month! I got the impression that you wanted more than that. Unfortunately (and this applies to Frank's suggestion too) your data is not actually 'organised' by dates: it is in fact two dates representing a period. Logically breeding periods and months are not identical. In order to repeat an SQL statistical function on groups of data, eg Jan, then Feb; the two groups need to be mutually exclusive (AFAIK), eg return a count of all the males, and then all the females, in the table. Breeding pairs for Jan, and breeding pairs for Feb, are not mutually exclusive if (using my example periods from above) Mon = 28Jan and Fri = 1Feb! > > > Normally start and end dates would be input via a form. > > I don't understand, at first you talked of "$given_date and now we > > have two dates! Are you inputting a month - asking for the average for > > January for example. Or are you putting in two dates representing a > > period of arbitrary length and asking for an average of that, eg 42 > > days? Please clarify. > Input is two dates, granularity one month. For example start Jan 2000, > finish Dec 2001. Then for each month in that period, I want the avarage > for each month. For example: > 2000 January 56.2 > 2000 February 34.4 > .... > 2001 December 48.6 =Accordingly you might need to use PHP to break a wider calculation into individual months, or to store historical monthly data (if it is called for frequently enough to be worth the hassle/storage space - or the volume of data is significant (you mention other stats, below)) cf the response time to repeatedly call this SQL query. [perhaps a wiser mind than mine might come up with a solution to this!?] > There will be a whole lot of other statistics as well, such as number of > eggs laid, number hatched, number of surviving chicks etc. But I can > calculate _them_ quite easily. =I've also left you with the task of counting nrDays (the number of relevant days within the breeding period), and working our nDays (the number of days in the month) for the 'average' calculation. These will depend upon your preferences. We're back to the TO_DAYS() idea or even using Timestamp data types. Also don't forget what I said about the 'add one month and subtract one day' idea above which might fit with the DATE_SUB() - different minds prefer to approach this sort of logic in different ways, so I'll leave that to your preference - you already think my mind is warped, without me giving you yet more good reason to confirm! > > Trouble is it's after midnight here. Apologies. Please would you > > respond to the questions above, and if no one else steps in meantime, > > I'll endeavor to return with a more lucid analysis in the morning... > Sorry to keep you awake :-) =I had replied during the morning yesterday (Z-time), as today. It wasn't the "keeping awake", no one was going to achieve that, it was me trying to get the logical cogs into the right order - without stripping too many gears in the process... Re-reading, I see that I had the ambition of being "lucid". I trust that at least this is comprehensible... =Let me know how you get on, =dn -- PHP Database Mailing List (http://www.php.net/) To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] To contact the list administrators, e-mail: [EMAIL PROTECTED]