On 2014/10/16 00:05, Michael Falconer wrote:
Hi all,

first off I must start with an apology. I know I'm sort of doing the wrong//...

No need to apologise, this flies quite close to the central theme. Whether you are using SQLite or any other SQL RDBMS, this is horrible DB design and it is so for precisely the reasons you are asking the question. Tables should not be dynamic: The simple rule of thumb being - if you cannot tell (or at least accurately predict) before-hand exactly how many tables will be in the final system, then you are doing it wrong. Why not have one set of tables and in stead of preceding each of them with a <prefix>, simply add a column that can host the <prefix> as a simple value, which will immediately make your life very much easier and get the SQL engine to do the work you are now trying to compensate for in a lop-sided manual kind of way.

I can only imagine the amount of code you invested into your system to track and deal with these dynamic sets of tables, so I know changing it will seem like a rather large undertaking, but really it will mostly involve removing loads of code to end up with a small set of simple queries that does all the work for you.

The only reason I can imagine this sort of breakdown useful is if your tree tables are all really really huge, like Gigabytes, and so there might be some efficiency to be gained from splitting it up, but I doubt this is the case.

Just imagine the ease of those search queries you demonstrated... one query, no UNIONs, a single set of joins and an extra where-clause check... Add to that increased efficiency at finding results (run one query in stead of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one set of tables as per your list, add column(s) for prefixing as needed, import all the tables from some large set of them (specifying additionally the column for the prefix) and ten run search queries on them. I am willing to bet on the fact it will be much quicker in addition to being much simpler.

If you cannot change the schemata to be more suitable, then your demonstrated queries are very near as good as it gets. I can think of other ways to achieve the same, but nothing that would be more efficient to a tangible level. Maybe someone else here can come up with something a bit more tight.

thing here as this question is NOT related to sqlite. It is a general SQL
question but I ask it here because I have great respect for the answers and
discussions I have seen on this forum over many years. I rarely post myself
as there are always several contributors who beat me to the answer and
often their response is far better than mine would have been. I'm not a
code leper, I don't need the actual SQL just the method really, though a
short code example would be well received for illustration.

It's about a search performed on multiple tables. However the structure of
this database is somewhat unorthodox. It contains genealogical data and
this is clustered into trees. Each tree has 8 tables, there can be as many
as 100 trees but most installations of the associated application software
contain between 5 - 50 trees. These 8 tables contain a family trees data
and are named:
<prefix>addresses
<prefix>connections
<prefix>events
<prefix>family
<prefix>person
<prefix>repositories
<prefix>sources
<prefix>texts

The <prefix> changes and is unique for each tree.

There are extensive search options offered to users as well as simple name
searching and it is one of the best features of the app. It works pretty
well, so it ain't broke in any sense, we just wonder if there is a better
way to perform this search in SQL. Is there a general technique which is
superior either in speed, efficiency or load bearing contexts?

I am sure you can see one of the pitfalls here is the exponential growth of
such a search query as both total number of trees and indeed user search
criteria increase. For each criteria component, and there are quite a few,
the appropriate tables must be queried and results joined.

Searches return records of individuals meeting the entered search criteria
so the query focuses on the person table as it's anchor, performs searches
on required other tables in the tree and joins the results to the person
data. The results from each tree search are then UNION'ed to provide the
result set. Here is a contrived example of the SQL query code which should
make things clearer.

The user enters simple search criteria - any person with a last name
containing 'mac' and a first name containing the character 'a'. This is the
resulting query (generated by php code), which searches a small 4 family
tree installation.

(SELECT humo1_person.*, event_kind, event_event, address_place, address_zip
     FROM humo1_person
     LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id
     LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id
         WHERE pers_lastname LIKE '%mac%'
           AND (pers_firstname LIKE '%a%'
               OR (event_kind='name' AND event_event LIKE '%a%')
           )
     GROUP BY pers_gedcomnumber
)
UNION
     (SELECT humo2_person.*, event_kind, event_event, address_place,
address_zip
         FROM humo2_person LEFT JOIN humo2_events ON
pers_gedcomnumber=event_person_id
         LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_person_id
             WHERE pers_lastname LIKE '%mac%'
               AND (pers_firstname LIKE '%a%'
                   OR (event_kind='name' AND event_event LIKE '%a%') )
         GROUP BY pers_gedcomnumber
     )
UNION
     (SELECT humo3_person.*, event_kind, event_event, address_place,
address_zip
         FROM humo3_person LEFT JOIN humo3_events ON
pers_gedcomnumber=event_person_id
         LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_person_id
             WHERE pers_lastname LIKE '%mac%'
               AND (pers_firstname LIKE '%a%'
                   OR (event_kind='name' AND event_event LIKE '%a%')
               )
         GROUP BY pers_gedcomnumber
     )
UNION
     (SELECT humo4_person.*, event_kind, event_event, address_place,
address_zip
         FROM humo4_person
         LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id
         LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_person_id
             WHERE pers_lastname LIKE '%mac%'
               AND (pers_firstname LIKE '%a%'
                   OR (event_kind='name' AND event_event LIKE '%a%')
               )
         GROUP BY pers_gedcomnumber
     )
     ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30

Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
completely off the grid? I do use sqlite quite a bit, but not on this
particular project.



_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to