Re: Another SQL question

Brian Yennie Thu, 05 Feb 2009 11:18:48 -0800

Hey Bob,

Hope these notes help - sounds like a fun project.

It looks like your query above is going the opposite direction, looking up customers that match the deptid in departments. I am looking for department records whose deptid only exist in the customer cursor. More on paging later.

Other than returning fields from both tables, my query should return identical results. It may just be a matter of style, but using IN with a sub-select seemed like overkill if you are just trying to match two tables based on the "deptid" field. Add a third table and the join grows elegantly, but you will probably not want to nest another SELECT for each table. If you just want fields from the departments table, you could say:

SELECT DISTINCT departments.* FROM customers,departments WHERE customers.deptid = departments.deptid
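
To check the "identical results" claim, here is a minimal sketch using Python's sqlite3 module in place of Revolution and the real database. The table contents are invented, and the IN form is my guess at the shape of the sub-select query, which is not quoted in this message.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (deptid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, deptid INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support'), (3, 'Unused');
    INSERT INTO customers VALUES (10, 'Ann', 1), (11, 'Bo', 1), (12, 'Cy', 2);
""")

# Join form: DISTINCT collapses the duplicate 'Sales' row produced by
# two customers sharing deptid 1.
join_rows = conn.execute(
    "SELECT DISTINCT departments.* FROM customers, departments "
    "WHERE customers.deptid = departments.deptid "
    "ORDER BY departments.deptid"
).fetchall()

# Sub-select form (assumed shape of the other query).
subselect_rows = conn.execute(
    "SELECT * FROM departments "
    "WHERE deptid IN (SELECT deptid FROM customers) "
    "ORDER BY deptid"
).fetchall()

print(join_rows == subselect_rows)
```

Both forms skip the department that no customer references, so for this data they return the same department rows.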


If you had a short list of customers, you could say something like:

SELECT customers.id, departments.id, departments.name FROM customers,departments WHERE customers.id IN (1,2,3,4,5,6...) AND customers.deptid = departments.deptid


Now you've got all of your departments listed by customer.

Aye, I could return all the data with one query. The problem though is that my app is going to contain some fairly complex queries with multiple relations, and I have to construct these queries via conditional coding. Also, I do not WANT to return data from 2 tables in one cursor. First of all, I have like data in multiple tables (deptid in customers and also in departments for example) as well as a signature field, a deleted flag field, and a unique ID field in EVERY table. I need these values for every record in every table. That would mean I would have to use column aliases, so now my SQL construction code would have to be orders of magnitude more complicated, and much more difficult to troubleshoot should the queries not return what I expect. Not impossible, just complicated.

You know your app best, but I would raise a red flag here. Avoiding multiple table queries could really set you back if you want to scale to large databases. When you've got 1,000,000 records in a table, your database engine won't blink joining it to another table provided you have indexed fields and a reasonable query. However, if you try to dump 100,000 of those IDs into an IN() portion of a subsequent query, prepare to wait. (NOTE: a sub-select can still work here, but may be harder to generate on the fly, especially with 3+ tables). If these subsequent queries are always going to use just a small "page" of records to join against, then it's probably fine either way.

On column aliases... What API are you using to fetch your field values? Could you just use a fully qualified naming convention: SELECT department.name AS department_name, customer.name AS customer_name FROM ...

Alternatively, you could reference fields by number, and track the field names outside of SQL. In that case, it's valid SQL to just use the same field names:

SELECT department.name, customer.name FROM ...

(fetch department.name as field #1, customer.name as field #2).
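
Both options can be sketched like this, again assuming Python's sqlite3 as a stand-in for whatever API you are using. The table names, sample rows, and the field_names tuple are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (deptid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, deptid INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO customers VALUES (10, 'Ann', 1), (11, 'Bo', 2);
""")

# Option 1: fully qualified aliases, so the cursor reports distinct names.
cur = conn.execute(
    "SELECT departments.name AS department_name, customers.name AS customer_name "
    "FROM customers, departments "
    "WHERE customers.deptid = departments.deptid ORDER BY customers.id"
)
column_names = [col[0] for col in cur.description]
aliased_rows = [dict(zip(column_names, row)) for row in cur.fetchall()]

# Option 2: no aliases; fields are read by position and the names are
# tracked outside of SQL (field_names is a hypothetical lookup table).
field_names = ("department.name", "customer.name")
cur = conn.execute(
    "SELECT departments.name, customers.name "
    "FROM customers, departments "
    "WHERE customers.deptid = departments.deptid ORDER BY customers.id"
)
positional_rows = [dict(zip(field_names, row)) for row in cur.fetchall()]

print(aliased_rows[0], positional_rows[0])
```

With the second option the query builder never has to generate aliases; it only needs to remember the order in which it listed the fields.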

The method I am using instead is to return each table as its own cursor, resulting in MUCH simpler code to construct the queries, and simpler queries themselves. Additionally, the signature, delete and uniqueid fields can never be confused because each cursor for each table retains its own identity. So how is it relational? Well I wrote my own relational system into my application, so that when I navigate to a record in a cursor, I also look up the records in any child subservient to that cursor. (IMHO that is what relational really means). Now all I have to do is get values from the current records in each cursor.

This is perfectly fine if it works for you. Just keep in mind that it really doesn't scale if you have large cursors. Performing 100 or 1000 queries versus 1 multi-table query will be no contest. But if your parent cursor is always trimmed to a reasonable size, you'll be fine. No doubt simpler code and simpler queries are a big win when you are trying to write a tool for the general case. I would just be careful if you are worried about performance not to do this (pseudo-code):


## N + 1 queries for N customers

put query("SELECT id,deptid FROM customers WHERE state = 'CA'") into customerData

repeat for each line customerDetails in customerData
   put item 1 of customerDetails into customerID
   put item 2 of customerDetails into deptid
   put query("SELECT * FROM departments WHERE deptid = " & deptid) into departments[customerID]
end repeat

## 1 query

put query("SELECT customers.id,departments.id,departments.name FROM customers,departments WHERE customers.deptid = departments.deptid") into customerData

Again, it depends. If N = 25, then maybe you stick with the cleaner looking code. But if N = 10,000 ...
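
To make the query counts concrete, here is a rough runnable version of the same comparison. It assumes Python's sqlite3 in place of Revolution; the query() helper, the counter, and the sample rows are hypothetical, and the single query adds the state filter so both approaches cover the same customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (deptid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, state TEXT, deptid INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO customers VALUES
        (10, 'CA', 1), (11, 'CA', 2), (12, 'NY', 1), (13, 'CA', 1);
""")

counter = {"queries": 0}

def query(sql, params=()):
    """Run one statement and count it as one round trip to the database."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# N + 1 queries: one for the parent rows, then one per parent row.
departments_by_customer = {}
for customer_id, deptid in query(
        "SELECT id, deptid FROM customers WHERE state = 'CA'"):
    departments_by_customer[customer_id] = query(
        "SELECT name FROM departments WHERE deptid = ?", (deptid,))[0][0]
n_plus_one_count = counter["queries"]

# 1 query: let the database engine do the matching.
counter["queries"] = 0
joined = dict(query(
    "SELECT customers.id, departments.name FROM customers, departments "
    "WHERE customers.deptid = departments.deptid AND customers.state = 'CA'"))
single_count = counter["queries"]

print(n_plus_one_count, single_count, departments_by_customer == joined)
```

With three matching customers the loop issues four queries against one for the join, and the gap grows linearly with N.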

I also have the advantage of being able to present the entire child cursor to the user in a one-to-many environment, as in an invoice where there is a master record and many detail records. My table data is ready made for my invoice detail. It's all in its own cursor.

Sounds good, although I'm not sure I see how multi-table queries would stop you from doing this. If anything, it will just get you all of the master/detail information in one "combined" table with exactly the fields you want instead of having to grab a parent record and then assemble all of the child data with separate queries.
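
As a sketch of what I mean, assuming Python's sqlite3 and a made-up invoices / invoice_lines schema (none of these names come from your app), one combined query can be regrouped into master records with their detail rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (invoiceid INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE invoice_lines (
        lineid INTEGER PRIMARY KEY, invoiceid INTEGER, item TEXT, qty INTEGER);
    INSERT INTO invoices VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO invoice_lines VALUES
        (1, 1, 'widget', 2), (2, 1, 'gadget', 1), (3, 2, 'widget', 5);
""")

# One combined master/detail result: each row carries the master fields
# plus one detail line.
rows = conn.execute(
    "SELECT invoices.invoiceid, invoices.customer, "
    "invoice_lines.item, invoice_lines.qty "
    "FROM invoices, invoice_lines "
    "WHERE invoices.invoiceid = invoice_lines.invoiceid "
    "ORDER BY invoices.invoiceid, invoice_lines.lineid"
).fetchall()

# Regroup the flat rows so each master record owns its list of details.
details_by_invoice = {}
for invoiceid, customer, item, qty in rows:
    entry = details_by_invoice.setdefault(
        invoiceid, {"customer": customer, "lines": []})
    entry["lines"].append((item, qty))

print(details_by_invoice)
```

The detail list for each invoice is still available on its own for display, it just arrives in the same round trip as the master record.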

The LIMIT is necessary because I am working towards a paging system that will handle HUGE databases of unlimited size. This will get around Revolution's limits on how much data can be returned in a cursor (the limits of which I still do not have a definitive answer on). This is why I need to use the SELECT for the master table as a lookup in the child table.

Amen, LIMIT is your friend. Keep in mind, it works with multi-table queries as well. You can also use DISTINCT to remove duplicates. I'm not sure I follow the second comment. A multiple table query won't return any larger data sets unless you include more fields in your query.
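
For example, paging over a two-table query might look like the sketch below. It assumes Python's sqlite3 and the LIMIT ... OFFSET syntax that SQLite and MySQL accept (other engines spell this differently); fetch_page and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (deptid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, deptid INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
""")
# Seven made-up customers alternating between the two departments.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(i, "customer %d" % i, 1 + i % 2) for i in range(1, 8)],
)

def fetch_page(page, page_size):
    """Return one page of the joined result; pages are numbered from 0."""
    return conn.execute(
        "SELECT customers.id, departments.name "
        "FROM customers, departments "
        "WHERE customers.deptid = departments.deptid "
        "ORDER BY customers.id "  # a stable order keeps pages from overlapping
        "LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()

first_page = fetch_page(0, 3)
last_page = fetch_page(2, 3)
print(first_page, last_page)
```

Each page stays at most page_size rows no matter how many tables are joined, so the cursor size limit is respected either way.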

I do not use Revolution's built in queries because what I need to do is more than just read data from a table and let the user edit it. I need to do complex validations. For instance using the example above, if someone edits a deptid in the department table, I need to go find every other table that uses deptid, look up the old value and change it to the new value. Otherwise I break the relational link between those records.

Keep in mind that most database engines have built-in abilities to manage relations. That's not to say you can't do it yourself, but the use-case you describe above is handled automatically by MySQL, Valentina, SQL Server, etc. You just need to set up the constraints in your database schema and you can do cascading deletes, set fields to NULL, or throw an exception depending on your needs.
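
Here is a small sketch of that idea. It uses SQLite through Python's sqlite3 module only because it is easy to run; it is not one of the engines named above, and the exact constraint syntax and defaults vary by engine. The schema and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE departments (deptid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        deptid INTEGER REFERENCES departments(deptid)
            ON UPDATE CASCADE   -- follow a changed deptid automatically
            ON DELETE SET NULL  -- clear the link if the department goes away
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Support');
    INSERT INTO customers VALUES (10, 'Ann', 1), (11, 'Bo', 1), (12, 'Cy', 2);
""")

# Someone edits a deptid in the departments table...
conn.execute("UPDATE departments SET deptid = 100 WHERE deptid = 1")
after_update = conn.execute(
    "SELECT id, deptid FROM customers ORDER BY id").fetchall()

# ...and someone else deletes a department outright.
conn.execute("DELETE FROM departments WHERE deptid = 2")
after_delete = conn.execute(
    "SELECT id, deptid FROM customers ORDER BY id").fetchall()

print(after_update, after_delete)
```

The customers rows follow the renumbered department without any application code touching them, which is the case described in the quoted paragraph.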

Generally speaking, your approach may be fine for your application. It sounds like you are doing things from a 'FileMaker' like view where you only have a small number of records visible at a time. Thus there never will be a case where you have 100,000 parent records to deal with at once. However, if later you want more advanced reporting or querying capabilities, you may need the scalability.


HTH
_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
