Yes, I understand that. But, unless you have a version of MySQL that supports subqueries (V4.1 or later), it's a two step process. The first query I gave you will identify all of the batches that are duplicates by their batch number:
select Batch, count(*) as count from QA group by Batch having count >= 2; The second query I gave you will tell you the other things you want to know about the batches: SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID FROM QA WHERE Batch in ('439584414', '123456', '999444'); Does it make sense now? Rhino ----- Original Message ----- From: "Scott Hamm" <[EMAIL PROTECTED]> To: "Rhino" <[EMAIL PROTECTED]> Cc: "Mysql" <mysql@lists.mysql.com> Sent: Tuesday, April 05, 2005 1:02 PM Subject: Re: Duplicated records > Well, to be more specific, I would like to list *ALL* duplicated Batch > rather than to count them. > > SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID > FROM QA > WHERE Batch in (duplicates); > > On Apr 5, 2005 12:43 PM, Rhino <[EMAIL PROTECTED]> wrote: > > Oh, that's an entirely different problem than the one I understood from your > > original question. > > > > Your new explanation is clearer but I'm still not sure I follow all of it. > > The formatting/wrapping of the example makes things a bit fuzzy too. Am I > > right in assuming that on the first row of the result, the OperatorID is > > 2661, the QAID is 5334 and the NTID is JulieAnt? If so, is JulieAnt the > > reviewer? Also are the QAID and the NTID different ways of saying the same > > thing, such as the fact that JulieAnt's employee number is 5334, or is it > > just a coincidence that they are the same on both rows? > > > > I'm assuming that a Batch has two different operators if one shift starts > > the batch but the batch isn't finished by the time the shift ends so a > > second operator finishes the Batch. Will you ever have cases where a Batch > > takes so long that it takes more than two shifts to finish it? For instance, > > would you ever see a case where the company gets an order for a billion > > grapple grommets and it takes 9 shifts to finish it? I'm trying to figure > > out if you also need to worry about batches that are split over more than > > two shifts. > > > > Does the Batch value uniquely identify a particular Batch or is it the > > combination of Batch and KeyDate? > > > > Ok, assuming for the moment that Batch alone uniquely identifies a > > particular Batch, and that you care about Batches that are split across two > > *or more* shifts, you need a query like this to find those Batches: > > > > select Batch, count(*) as count > > from QA > > group by Batch > > having count >= 2; > > > > That will return the Batch numbers alone. > > > > Then, you simply do a second Select to get the other properties of the Batch > > that you care about. That query would look very much like the one in your > > example except that you would do it like so: > > > > SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID > > FROM QA > > WHERE Batch in ('439584414', '123456', '999444'); > > > > In other words, you'd put a list consisting of all of the values returned > > from the one that got the Batch numbers into the IN clause. > > > > Now, that could be a little tedious if there were a lot of 'split' batches > > and you may also run into an issue if the number of of values in the IN > > clause exceeded MySQL's limit, whatever that is. (Or maybe MySQL has a limit > > on the maximum *length* of the IN clause; if it does, that could bite you > > too.) To get around that, you could just repeat the select with the IN > > clause for subsets of the results from the first query or even just do your > > original query once for each of the values returned by the counting query. > > > > A much better alternative would be to use a subquery but that is only an > > option if you are running MySQL 4.1 or later. That would let you combine > > both queries together into a single big query. Are you running V4.1 or > > later? If you are, I could take a stab at the combined query.... > > > > Rhino > > > > > > ----- Original Message ----- > > From: "Scott Hamm" <[EMAIL PROTECTED]> > > To: "Rhino" <[EMAIL PROTECTED]> > > Cc: "Mysql" <mysql@lists.mysql.com> > > Sent: Tuesday, April 05, 2005 11:14 AM > > Subject: Re: Duplicated records > > > > > Sorry for the confusion. I don't mean a duplicated records, but to > > > find duplicated Batch where two different operators worked on a single > > > batch (one started off, then another one to finish the batch) and a > > > single reviewer to review a batch. I need a list that lists duplicated > > > Batches with different operators that worked on that batch. > > > > > > For example: (for clarification) > > > SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID > > > FROM QA > > > WHERE Batch='439584414'; > > > > > > ID Batch KeyDate OperatorID > > > QAID NTID > > > 90577 439584414 2004-10-03 00:00:00 2661 5334 JulieAnt > > > 90575 439584414 2004-10-03 00:00:00 5657 5334 JulieAnt > > > > > > > > > On Apr 5, 2005 10:54 AM, Rhino <[EMAIL PROTECTED]> wrote: > > > > > > > > ----- Original Message ----- > > > > From: "Scott Hamm" <[EMAIL PROTECTED]> > > > > To: "Mysql" <mysql@lists.mysql.com> > > > > Sent: Tuesday, April 05, 2005 10:05 AM > > > > Subject: Duplicated records > > > > > > > > > Here is my novice question: > > > > > I need to find duplicates in "Batch" category when I issued this > > query: > > > > > > > > > > SELECT * FROM QA WHERE Batch=439584414; > > > > > > > > > > Result: > > > > > 90577 > > > > > 1 2661 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03 > > > > > 00:00:00 0 90575 > > > > > 1 5657 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03 > > > > 00:00:00 0 > > > > > > > > > > How do I issue a query that finds duplicated Batch number? > > > > > > > > > First of all, I think you made a mistake when putting this data into > > your > > > > note, a mistake which is compounded by the wrapping that the email > > program > > > > did. I *think* you meant to display two rows with the '90577' value at > > the > > > > end of the second row, not before the first row. This makes the example > > a > > > > bit confusing and hard to follow. > > > > > > > > The two rows you provide in your example are not duplicates of each > > other. > > > > Even assuming that the '90577' actually belongs at the end of the second > > row > > > > rather than the start of the first row, the two rows are different: the > > > > first row ends in '90575', not '90577' and the second value in each row > > is > > > > also different: '2661' and '5657'. Therefore, you're not really trying > > to > > > > find duplicate records because a duplicate row would be one that is > > > > identical in every column, not just identical in several columns. I hope > > > > that doesn't sound like I am splitting hairs; maybe I am. ;-) > > > > > > > > Anyway, the standard method for finding duplicates of rows goes like > > this > > > > and can be adapted for your situation. Let's say that you have a table > > > > called Foo with columns foo1, foo2, and foo3 and you want to find all of > > the > > > > duplicate rows in the table, i.e. any row whose foo1, foo2, and foo3 > > value > > > > is identical to the foo1, foo2, and foo3 value of another table. You > > would > > > > write this query: > > > > > > > > select foo1, foo2, foo3, count(*) as num > > > > from Foo1 > > > > group by foo1, foo2, foo3 > > > > having num > 1 > > > > order by num; > > > > > > > > Your result might look like this: > > > > > > > > foo1 foo2 foo3 num > > > > Daffy D Duck 1 > > > > Bugs B Bunny 1 > > > > Elmer A Fudd 3 > > > > > > > > This result means that Daffy D Duck and Bugs B Bunny each occur once in > > the > > > > table but Elmer A Fudd occurs three times. > > > > > > > > In the case you mentioned in your email, where the rows aren't complete > > > > duplicates of one another, just write the query so that you omit the > > columns > > > > which are allowed to be different. In your case, that would appear to be > > all > > > > but the second and last columns. > > > > > > > > However, I think if you get to the point of having duplicate records in > > a > > > > table when you don't want them, finding them is like shutting the barn > > door > > > > after the horse is already gone. I think you need to reconsider your > > design, > > > > specifically your primary key, and change that primary key to *PREVENT* > > the > > > > duplicate keys in the first place. > > > > > > > > For instance, in the case of the Foo1 table, I can prevent duplicate > > rows by > > > > choosing a good primary key; in this case I would choose the combination > > of > > > > all three columns in the table; that would ensure that I never stored > > more > > > > than 1 person named Elmer A Fudd. [Actually, that's not a great example! > > It > > > > might be perfectly valid to have two people named Elmer A Fudd - or John > > A > > > > Smith - so I probably need to add additional columns to my table to > > ensure > > > > that the rows are unique; something like Social Security Number which is > > > > (supposed to be) unique would be ideal for this purpose.] > > > > > > > > Rhino > > > > > > > > -- > > > > No virus found in this outgoing message. > > > > Checked by AVG Anti-Virus. > > > > Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005 > > > > > > > > > > > > > > > > > -- > > > Power to people, Linux is here. > > > > > > -- > > > MySQL General Mailing List > > > For list archives: http://lists.mysql.com/mysql > > > To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED] > > > > > > > > > -- > > > No virus found in this incoming message. > > > Checked by AVG Anti-Virus. > > > Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005 > > > > > > > > > > -- > > No virus found in this outgoing message. > > Checked by AVG Anti-Virus. > > Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005 > > > > > > > -- > Power to people, Linux is here. > > > -- > No virus found in this incoming message. > Checked by AVG Anti-Virus. > Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005 > -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005 -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]