New topic: 

SQL query

<http://forums.realsoftware.com/viewtopic.php?t=31390>

       Page 1 of 1
   [ 11 posts ]                 Previous topic | Next topic         Author  
Message       brisance           Post subject: SQL queryPosted: Thu Dec 03, 
2009 1:11 am                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              Need some help writing a SQL query.

There's a table with columns of employee id, hours worked and date of entry. 

I wish to get the top 5 employees that worked the most hours between 2 dates.

e.g. let's say Employee A worked a total of 50 hours between a 40 day period, 
and I wish to return all rows that make up these 50 hours for Employee A and 
the other 4 employees who worked the most hours. In other words, the first 
entry should be the employee (group by employee id) who worked the most hours 
followed by a bunch of rows in descending order (order by hours worked). Repeat 
this for the other 4 employees. How would I go about this?   
                            Top                dglass           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 1:51 am                        
Joined: Fri Sep 30, 2005 9:29 am
Posts: 675
Location: California              Code:SELECT employeeID, hoursWorked, 
dateOfEntry
  FROM employeeActivity
  WHERE employeeID IN
  (SELECT employeeID, SUM(hoursWorked)
  FROM employeeActivity
  WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate
  ORDER BY SUM(hoursWorked) DESC LIMIT 5)
  ORDER BY employeeID ASC, hoursWorked DESC
  


Just off the top of my head, so it may not be exactly right.

However, if this is for SQLite/REALSQLDatabase then you'll have to break it 
into two queries (one to get the top 5 workers (employeeID, most likely), and 
the second to get the hoursWorked records) because SQLite doesn't accept LIMIT 
in a compound query unless it is in the last SELECT.

Not sure if other DB engines have similar restrictions on the use of LIMIT.   
                            Top               brisance           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 4:45 am                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              Hi there,

Thanks for replying. The subquery can only return one column... at least on MS 
SQL Server (what I'm stuck with).   
                            Top                dglass           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 11:48 am                        
Joined: Fri Sep 30, 2005 9:29 am
Posts: 675
Location: California              brisance wrote:Hi there,

Thanks for replying. The subquery can only return one column... at least on MS 
SQL Server (what I'm stuck with).

In that case you are probably going to have to do it in two steps; the first to 
get the top five workers, and the second to get their hours.

Although I suppose you could try something like the following as your subquery:

SELECT employeeID
FROM employeeActivity
WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate
ORDER BY SUM(hoursWorked) DESC LIMIT 5   
                            Top               brisance           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 9:01 pm                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              This will only return the top 5 records, not the top 5 
employees. This is what I need to return:

Code:Employee ID  Hours worked  Date
0023     14      <some_date>
0023     12      <some_date>
0023     11      <some_date>
0023     10      <some_date>
0023     9        <some_date>
0023     8.5      <some_date>
0034     17       <some_date>
0034     5       <some_date>

.and 3 other employee IDs

So far I got the following pseudocode going.

1. First, compute SUM(hours_worked) for Employee ID between dates. (HAVING Date 
BETWEEN xx AND yy)
2. Based on SUM(hours_worked), return the top 5 employees.
3. Now that we have the top 5 employees, return only the records between these 
dates, and group them by Employee ID, order by Hours worked. Notice that row 7 
has higher hours worked than row 1, but is lower in the list because the 
SUM(hours_worked) is less than Employee 0023.This is the part I'm having 
trouble with.


dglass wrote:In that case you are probably going to have to do it in two steps; 
the first to get the top five workers, and the second to get their hours.

Although I suppose you could try something like the following as your subquery:

SELECT employeeID
FROM employeeActivity
WHERE dateOfEntry >= StartDate AND dateOfEntry <= EndDate
ORDER BY SUM(hoursWorked) DESC LIMIT 5   
                            Top                dglass           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 9:34 pm                        
Joined: Fri Sep 30, 2005 9:29 am
Posts: 675
Location: California              I don't have your dataset, so I can't really 
test any of this SQL....

SELECT employeeID, SUM(hoursWorked) FROM employeeActivity WHERE dateWorked >= 
StartDate AND dateWorked <= StopDate GROUP BY employeeID ORDER BY 
SUM(hoursWorked) DESC LIMIT 5

That should get you five records (employeeID and a total hours) that have the 
highest hoursWorked total for the time period specified.

Then....

SELECT employeeID, hoursWorked, dateWorked FROM employeeActivity WHERE 
employeeID IN (aVariableHoldingEmployeeIDs) AND (dateWorked >= StartDate AND 
dateWorked <= StopDate) ORDER BY employeeID ASC, hoursWorked DESC

should get you the records you want in the format you want them.

I would have thought, but again, can't test, that if you changed the first 
query to return only the employeeIDs, you could use it as a subquery rather 
than having to get the IDs, put them in a variable, and use that in the second 
query.   
                            Top               dglass           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 9:55 pm                        
Joined: Fri Sep 30, 2005 9:29 am
Posts: 675
Location: California              I created a small test DB (SQLite) and ran 
the following SQL through it:

SELECT employeeID, hoursWorked, dateWorked 
FROM employeeActivity 
WHERE employeeID IN (
   SELECT employeeID 
   FROM employeeActivity 
   WHERE dateWorked >= '2009-01-01' AND dateWorked <= '2009-01-04' 
   GROUP BY employeeID 
   ORDER BY SUM(hoursWorked) DESC LIMIT 2
   ) 
AND (dateWorked >= '2009-01-01' AND dateWorked <= '2009-01-04') 
ORDER BY employeeID ASC, hoursWorked DESC

it returned the data expected, in the format expected

Note the subquery only returns a single field (employeeID) so it should meet 
the MSSQL requirement.

I also limited the subquery to 2 records a) to verify it was returning what was 
expected, and b) because I didn't put a lot of data into my test DB.

I think the only thing it doesn't do, which may be one of your requirements, is 
order the final output in employeeID order BY total hours worked, i.e. 004 
worked more total hours than 002, so 004's entries should be listed first.  It 
just puts them in employeeID order, so 002 is before 004.   
                            Top               brisance           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 11:40 pm                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              dglass, thank you for taking time to reply. Greatly 
appreciated.

Your SQL is eerily similar to what I have , and you're right: I need to output 
it in the specified format.   
                            Top                brisance           Post subject: 
Re: SQL queryPosted: Thu Dec 03, 2009 11:45 pm                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              One of the problems is that there is no primary key in 
the table, and I cannot execute any ALTER statements.

Should I just retrieve the top 5, pack them off into their own SELECT 
statements and then UNION the results? Am I on the right path?   
                            Top                dglass           Post subject: 
Re: SQL queryPosted: Fri Dec 04, 2009 12:13 am                        
Joined: Fri Sep 30, 2005 9:29 am
Posts: 675
Location: California              brisance wrote:Should I just retrieve the top 
5, pack them off into their own SELECT statements and then UNION the results? 
Am I on the right path?

Probably.  It sounds pretty ugly, but since you have to have an order that 
isn't 'holdable' across the different SELECTs it's probably the only option.   
                            Top               brisance           Post subject: 
Re: SQL queryPosted: Fri Dec 04, 2009 12:56 am                        
Joined: Tue Oct 06, 2009 2:38 am
Posts: 25              Yeah that's what I feared. Thought it might be something 
obvious I missed like "ROLLUP" or "CUBE" cos I'm terrible at SQL.   
                            Top            Display posts from previous: All 
posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost 
timeSubject AscendingDescending          Page 1 of 1
   [ 11 posts ]     
-- 
Over 1500 classes with 29000 functions in one REALbasic plug-in collection. 
The Monkeybread Software Realbasic Plugin v9.3. 
http://www.monkeybreadsoftware.de/realbasic/plugins.shtml

[email protected]

Reply via email to