On Tue, Sep 18, 2012 at 4:01 PM, Rick James <rja...@yahoo-inc.com> wrote:
> SELECT mn.*
>        mx.*
>> >    FROM ( SELECT @min = MIN(bottom),
>> >                  @max = MAX(bottom), ... )
>> >    JOIN data_cst mn  ON bottom = @min
>> >    JOIN data_cst mx  ON bottom = @max
>
> That is, the FROM finds the value for the desired row.
> The JOIN then gets to the rest of the fields.
> Caveat:  If two rows have bottom = @min, it returns both.


Thanks a lot Rick! This was super helpful. I've got it working for 3
of the 4 cases - min(bottom), max(bottom), and max(date_time). But I
can't figure out how to work in the last case - where bottom is
closest to avg(bottom). In an individual query I can get it with an
order by, like this:

SELECT rollup.Target, rollup.EP, rollup.`Avg Bottom`, data_cst.id,
data_cst.bottom
FROM data_cst, rollup, data_target
WHERE data_target.name = rollup.Target
AND data_cst.ep = rollup.EP
AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND '2010-03-04 23:59:59'
AND data_target.id = data_cst.target_name_id
GROUP BY rollup.Target, rollup.EP
ORDER BY ABS(data_cst.bottom - rollup.`Avg Bottom`);

Any way to work that into another join?

Thanks!

>
>> -----Original Message-----
>> From: Larry Martell [mailto:larry.mart...@gmail.com]
>> Sent: Tuesday, September 18, 2012 12:54 PM
>> To: Rick James
>> Cc: peter.braw...@earthlink.net; mysql@lists.mysql.com
>> Subject: Re: getting certain rows from a group by
>>
>> On Tue, Sep 18, 2012 at 2:05 PM, Rick James <rja...@yahoo-inc.com>
>> wrote:
>> > For single query...
>> >
>> > Plan A:
>> > Repeat the SELECT as once for each (MIN, AVG, etc) as a subquery in
>> >    WHERE bottom = ( SELECT MIN(bottom) ... )
>> >
>> > Plan B:
>> >    FROM ( SELECT @min = MIN(bottom),
>> >                  @max = MAX(bottom), ... )
>> >    JOIN data_cst mn  ON bottom = @min
>> >    JOIN data_cst mx  ON bottom = @max
>>
>>
>> Can you elaborate on this? I don't see how this will give me what I
>> need. I'm not looking for the min or max bottom - I already have that
>> - I'm looking for the row from each group that has the min and max
>> bottom.
>>
>> > Plan C:
>> > Get rid of 3rd party packages that eventually get in your way instead
>> of 'helping'.
>>
>> It's not django that is getting in the way. django is a fantastic
>> framework for web development. It's just changing requirements. The
>> original developer (who no longer works here) created a few different
>> classes, and then based on the original requirements, chose to use the
>> one that supports a single query. That worked then, but may not work
>> now. There are other classes that allow multiple queries, but then you
>> have to parse the data and stuff it into the context that will be
>> picked up by the browser yourself. The single query class does that for
>> you. I may yet go that way, but I'm trying to avoid a big rewrite if I
>> can.
>>
>> >> -----Original Message-----
>> >> From: Larry Martell [mailto:larry.mart...@gmail.com]
>> >> Sent: Tuesday, September 18, 2012 4:57 AM
>> >> To: peter.braw...@earthlink.net
>> >> Cc: mysql@lists.mysql.com
>> >> Subject: Re: getting certain rows from a group by
>> >>
>> >> On Tue, Sep 18, 2012 at 7:52 AM, Peter Brawley
>> >> <peter.braw...@earthlink.net> wrote:
>> >> > On 2012-09-18 5:53 AM, Larry Martell wrote:
>> >> >>
>> >> >> I have this query:
>> >> >>
>> >> >> SELECT data_target.name as Target,
>> >> >>         q1.ep as EP,
>> >> >>         COUNT(*) as Wafers,
>> >> >>         Lots,
>> >> >>         SUM(numonep)/(COUNT(*)+SUM(CASE WHEN reruns > 0 THEN
>> >> >> reruns ELSE 0 END)) as 'Sites/Wafer',
>> >> >>         MAX(LastRun) as "Last Run",
>> >> >>         SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END) as
>> Rerun,
>> >> >>         COUNT(*)+SUM(CASE WHEN reruns > 0 THEN reruns ELSE 0 END)
>> >> >> as
>> >> Runs,
>> >> >>         avgbottom as "Avg Bottom",
>> >> >>         3*stdbottom as "3 Sig",
>> >> >>         maxbottom as Max,
>> >> >>         minbottom as Min,
>> >> >>         SUM(numonep) as Count,
>> >> >>         SUM(numonep) - SUM(numbottoms) as NAs,
>> >> >>         100-((SUM(numonep) - SUM(numbottoms))/SUM(numonep)*100)
>> as
>> >> "%
>> >> >> Success",
>> >> >>         3*stdbottom/avgbottom as "3Sig/Avg",
>> >> >>         AVG(avgbottom) as 'Wafer Avg',
>> >> >>         AVG(Wafer3Sigma) as 'Wafer 3 Sigma',
>> >> >>         AVG(Ranges) as 'Avg Range',
>> >> >>         3*STD(Ranges) as '3Sig of Ranges',
>> >> >>         MAX(Ranges) as 'Max Range',
>> >> >>         MIN(Ranges) as 'Min Range',
>> >> >>         (SUM(numonep) - SUM(numbottoms))/COUNT(*) as 'NAs/Wafer'
>> >> >> FROM (SELECT target_name_id,
>> >> >>               ep,
>> >> >>               wafer_id,
>> >> >>               COUNT(bottom) as numbottoms,
>> >> >>               AVG(bottom) as avgbottom,
>> >> >>               STD(bottom) as stdbottom,
>> >> >>               MAX(bottom) as maxbottom,
>> >> >>               MIN(bottom) as minbottom,
>> >> >>               MAX(date_time) as "LastRun",
>> >> >>               COUNT(*) as numonep,
>> >> >>               COUNT(DISTINCT target_name_id, ep, lot_id,
>> >> >> data_file_id)-1 as reruns,
>> >> >>               COUNT(DISTINCT(lot_id)) as Lots,
>> >> >>               3*STD(bottom) as Wafer3Sigma,
>> >> >>               MAX(bottom) - MIN(bottom) as Ranges
>> >> >>        FROM data_cst
>> >> >>        WHERE target_name_id IN (775, 776, 777, 778, 779, 780, 45,
>> >> 44,
>> >> >> 116, 117, 118, 119, 120, 121)
>> >> >>        AND data_cst.date_time BETWEEN '2010-03-04 00:00:00' AND
>> >> >> '2010-03-04 23:59:59'
>> >> >>        GROUP BY target_name_id, ep, wafer_id
>> >> >>        HAVING count(*) < 999) q1,
>> >> >>       data_target
>> >> >> WHERE data_target.id = target_name_id GROUP BY q1.target_name_id,
>> >> >> q1.ep;
>> >> >>
>> >> >>
>> >> >> This works fine. But now I need to get a certain column
>> >> >> (image_measurer_id) with each row returned that corresponds to
>> the
>> >> >> row from the group that has bottom = Min(bottom), bottom =
>> >> >> Max(bottom), bottom closest to Avg(bottom), and bottom from the
>> >> >> row where date_time = Max(date_time).
>> >> >>
>> >> >> Is this even possible from one query?
>> >> >
>> >> >
>> >> > Might be, but what's the importance of doing it as one query?
>> >>
>> >> Because it's part of a django based web app, and the class that this
>> >> is part of only supports having a single query. To use multiple
>> >> queries will require a fairly major rewrite of the server side of
>> that app.
>> >>
>> >> > I'd start by
>> >> > saving this result to a temp table and developing the new query.
>> >> > When that's running, see if you can to optimise a query built by
>> >> > replacing the reference to the temp table with the original query.
>> >>
>> >> Thanks, I'll look into this.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql

Reply via email to