New topic: SQL query
<http://forums.realsoftware.com/viewtopic.php?t=31390> Page 1 of 1 [ 11 posts ] Previous topic | Next topic Author Message brisance Post subject: SQL queryPosted: Thu Dec 03, 2009 1:11 am Joined: Tue Oct 06, 2009 2:38 am Posts: 25 Need some help writing a SQL query. There's a table with columns of employee id, hours worked and date of entry. I wish to get the top 5 employees that worked the most hours between 2 dates. e.g. let's say Employee A worked a total of 50 hours between a 40 day period, and I wish to return all rows that make up these 50 hours for Employee A and the other 4 employees who worked the most hours. In other words, the first entry should be the employee (group by employee id) who worked the most hours followed by a bunch of rows in descending order (order by hours worked). Repeat this for the other 4 employees. How would I go about this? Top dglass Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 1:51 am Joined: Fri Sep 30, 2005 9:29 am Posts: 675 Location: California Code:SELECT employeeID, hoursWorked, dateOfEntry FROM employeeActivity WHERE employeeID IN (SELECT employeeID, SUM(hoursWorked) FROM employeeActivity WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate ORDER BY SUM(hoursWorked) DESC LIMIT 5) ORDER BY employeeID ASC, hoursWorked DESC Just off the top of my head, so it may not be exactly right. However, if this is for SQLite/REALSQLDatabase then you'll have to break it into two queries (one to get the top 5 workers (employeeID, most likely), and the second to get the hoursWorked records) because SQLite doesn't accept LIMIT in a compound query unless it is in the last SELECT. Not sure if other DB engines have similar restrictions on the use of LIMIT. Top brisance Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 4:45 am Joined: Tue Oct 06, 2009 2:38 am Posts: 25 Hi there, Thanks for replying. The subquery can only return one column... at least on MS SQL Server (what I'm stuck with). Top dglass Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 11:48 am Joined: Fri Sep 30, 2005 9:29 am Posts: 675 Location: California brisance wrote:Hi there, Thanks for replying. The subquery can only return one column... at least on MS SQL Server (what I'm stuck with). In that case you are probably going to have to do it in two steps; the first to get the top five workers, and the second to get their hours. Although I suppose you could try something like the following as your subquery: SELECT employeeID FROM employeeActivity WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate ORDER BY SUM(hoursWorked) DESC LIMIT 5 Top brisance Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 9:01 pm Joined: Tue Oct 06, 2009 2:38 am Posts: 25 This will only return the top 5 records, not the top 5 employees. This is what I need to return: Code:Employee ID Hours worked Date 0023 14 <some_date> 0023 12 <some_date> 0023 11 <some_date> 0023 10 <some_date> 0023 9 <some_date> 0023 8.5 <some_date> 0034 17 <some_date> 0034 5 <some_date> .and 3 other employee IDs So far I got the following pseudocode going. 1. First, compute SUM(hours_worked) for Employee ID between dates. (HAVING Date BETWEEN xx AND yy) 2. Based on SUM(hours_worked), return the top 5 employees. 3. Now that we have the top 5 employees, return only the records between these dates, and group them by Employee ID, order by Hours worked. Notice that row 7 has higher hours worked than row 1, but is lower in the list because the SUM(hours_worked) is less than Employee 0023.This is the part I'm having trouble with. dglass wrote:In that case you are probably going to have to do it in two steps; the first to get the top five workers, and the second to get their hours. Although I suppose you could try something like the following as your subquery: SELECT employeeID FROM employeeActivity WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate ORDER BY SUM(hoursWorked) DESC LIMIT 5 Top dglass Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 9:34 pm Joined: Fri Sep 30, 2005 9:29 am Posts: 675 Location: California I don't have your dataset, so I can't really test any of this SQL.... SELECT employeeID, SUM(hoursWorked) FROM employeeActivity WHERE dateWorked >= StartDate AND dateWorked <= StopDate GROUP BY employeeID ORDER BY SUM(hoursWorked) DESC LIMIT 5 That should get you five records (employeeID and a total hours) that have the highest hoursWorked total for the time period specified. Then.... SELECT employeeID, hoursWorked, dateWorked FROM employeeActivity WHERE employeeID IN (aVariableHoldingEmployeeIDs) AND (dateWorked >= StartDate AND dateWorked <= StopDate) ORDER BY employeeID ASC, hoursWorked DESC should get you the records you want in the format you want them. I would have thought, but again, can't test, that if you changed the first query to return only the employeeIDs, you could use it as a subquery rather than having to get the IDs, put them in a variable, and use that in the second query. Top dglass Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 9:55 pm Joined: Fri Sep 30, 2005 9:29 am Posts: 675 Location: California I created a small test DB (SQLite) and ran the following SQL through it: SELECT employeeID, hoursWorked, dateWorked FROM employeeActivity WHERE employeeID IN ( SELECT employeeID FROM employeeActivity WHERE dateWorked >= '2009-01-01' AND dateWorked <= '2009-01-04' GROUP BY employeeID ORDER BY SUM(hoursWorked) DESC LIMIT 2 ) AND (dateWorked >= '2009-01-01' AND dateWorked <= '2009-01-04') ORDER BY employeeID ASC, hoursWorked DESC it returned the data expected, in the format expected Note the subquery only returns a single field (employeeID) so it should meet the MSSQL requirement. I also limited the subquery to 2 records a) to verify it was returning what was expected, and b) because I didn't put a lot of data into my test DB. I think the only thing it doesn't do, which may be one of your requirements, is order the final output in employeeID order BY total hours worked, i.e. 004 worked more total hours than 002, so 004's entries should be listed first. It just puts them in employeeID order, so 002 is before 004. Top brisance Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 11:40 pm Joined: Tue Oct 06, 2009 2:38 am Posts: 25 dglass, thank you for taking time to reply. Greatly appreciated. Your SQL is eerily similar to what I have , and you're right: I need to output it in the specified format. Top brisance Post subject: Re: SQL queryPosted: Thu Dec 03, 2009 11:45 pm Joined: Tue Oct 06, 2009 2:38 am Posts: 25 One of the problems is that there is no primary key in the table, and I cannot execute any ALTER statements. Should I just retrieve the top 5, pack them off into their own SELECT statements and then UNION the results? Am I on the right path? Top dglass Post subject: Re: SQL queryPosted: Fri Dec 04, 2009 12:13 am Joined: Fri Sep 30, 2005 9:29 am Posts: 675 Location: California brisance wrote:Should I just retrieve the top 5, pack them off into their own SELECT statements and then UNION the results? Am I on the right path? Probably. It sounds pretty ugly, but since you have to have an order that isn't 'holdable' across the different SELECTs it's probably the only option. Top brisance Post subject: Re: SQL queryPosted: Fri Dec 04, 2009 12:56 am Joined: Tue Oct 06, 2009 2:38 am Posts: 25 Yeah that's what I feared. Thought it might be something obvious I missed like "ROLLUP" or "CUBE" cos I'm terrible at SQL. Top Display posts from previous: All posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost timeSubject AscendingDescending Page 1 of 1 [ 11 posts ] -- Over 1500 classes with 29000 functions in one REALbasic plug-in collection. The Monkeybread Software Realbasic Plugin v9.3. http://www.monkeybreadsoftware.de/realbasic/plugins.shtml [email protected]
