Re: [sqlite] Search query alternatives.
I am glad I posted the question. Yes James, there is little I can disagree with in your excellent summary. Even the critique of my perhaps poorly framed question is indeed valid. I take your point regarding spec vs implementation, and in my experience across different rdbms's I have frequently seen evidence which supports your assertions regarding db/os system influence on implementation approaches. Application code too has it's part to play. How do we plan to access the data? There's a strong case too, IMHO, to have flexibility in the design, perhaps leading to differing approaches with the variable types of data we may be storing. The original design decisions were made by someone who was, and still is, essentially a hobby programmer. I don't think he'd ever heard of of Chris Date or Mr Codd at that time and like all novice application programmers, he had little understanding about the effect db design could have on his application and it's source code. His design choices were initially made on the basis of what he could easily understand and what was (as it appeared to him then) easy to program with. It is some time ago, and we who have lived with rdbms's for years get to say, 'that is a horrible design!'. I think R. Smith hit on a point above, regarding code overhead. Yep, plenty of that. And so the lesson is learned the hard way for someone who until recently had viewed normalisation as a way to make coding harder and to slow down the execution of queries. So with that perspective you can perhaps come some way to understanding the why component. On analysis, I agree with suggested design changes at the higher level. i.e. Dynamic tables are at the root of issues going forward. They are requiring tedious application code gymnastics, more difficult query analysis and poorer query performance. I suppose the upside is that it will be a challenge to see what improvements can be made, and that is always fun and games. I kind of like Mr Smith's other suggestion about an SQLITE testbed or prototype. So easy to work with SQLITE, and probably perfect for this task. Thanks all for your contributions. Just FYI James, the application is coded in php and connects to a mysql database. It can be installed either as a browser based, stand alone or client server app. It's common implementation is on low end shared hosts, even free hosting services. So this limits us somewhat to what is commonly allowed on such platforms. Things like Stored Procedures are unfortunately outside our scope when it come to design considerations. Thanks all. On 18 October 2014 02:24, James K. Lowdenwrote: > On Thu, 16 Oct 2014 09:05:51 +1100 > Michael Falconer wrote: > > > we just wonder if there is a better way to perform this search in > > SQL. Is there a general technique which is superior either in speed, > > efficiency or load bearing contexts? > > The simple answer is No, because SQL is a specification, not an > implementation. Different systems implement it differently and > therefore perform differently. Any "general technique" affecting > performance belongs to the implementation per se, not the SQL, which is > a logical construction. SQLite itself has changed its performance > characteristics over the course of its development. > > For that reason, any question of performance has to be answered in > terms of a particular implementation, even its specific version, and > the OS and hardware it's running on. > > That said, there is reason to suppose that a single-table design would > be more efficient. If the queries can be expressed with recursion and > the indexes lead to efficient searches, the query optimizer has less > work to do. It has fewer permutations to consider, and the search is > apt to touch fewer pages. The analysis tools of the system you're > using should be able to confirm or deny that supposition. > > I would remind your fellows, though, that efficiency is not all. The > utility of a model (that is, the database design) is measured by how > well, to its purpose, it describes the real world. Any model that must > be changed as that reality changes in predictable ways isn't really > much of a model; it turns the designer into a component of the model. > By recognizing all trees as one, you generalize your model and make it > do work you are now doing yourself (manually, or in application > logic). By any measure, that makes it a better model. > > HTH. > > --jkl > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Regards, Michael.j.Falconer. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Search query alternatives.
On Thu, 16 Oct 2014 09:05:51 +1100 Michael Falconerwrote: > we just wonder if there is a better way to perform this search in > SQL. Is there a general technique which is superior either in speed, > efficiency or load bearing contexts? The simple answer is No, because SQL is a specification, not an implementation. Different systems implement it differently and therefore perform differently. Any "general technique" affecting performance belongs to the implementation per se, not the SQL, which is a logical construction. SQLite itself has changed its performance characteristics over the course of its development. For that reason, any question of performance has to be answered in terms of a particular implementation, even its specific version, and the OS and hardware it's running on. That said, there is reason to suppose that a single-table design would be more efficient. If the queries can be expressed with recursion and the indexes lead to efficient searches, the query optimizer has less work to do. It has fewer permutations to consider, and the search is apt to touch fewer pages. The analysis tools of the system you're using should be able to confirm or deny that supposition. I would remind your fellows, though, that efficiency is not all. The utility of a model (that is, the database design) is measured by how well, to its purpose, it describes the real world. Any model that must be changed as that reality changes in predictable ways isn't really much of a model; it turns the designer into a component of the model. By recognizing all trees as one, you generalize your model and make it do work you are now doing yourself (manually, or in application logic). By any measure, that makes it a better model. HTH. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Search query alternatives.
Michael, a Guy named Joe Celko elaborated about trees and hierarchies in SQL a few years ago. Have a look here: http://www.amazon.com/Hierarchies-Smarties-Edition-Kaufmann-Management/dp/0123877334/ref=dp_ob_title_bk regards gerd ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Search query alternatives.
Thanks Eduardo, a most interesting link. On 17 October 2014 05:41, Eduardo Morraswrote: > On Thu, 16 Oct 2014 09:05:51 +1100 > Michael Falconer wrote: > > > Hi all, > > > > first off I must start with an apology. I know I'm sort of doing the > > wrong thing here as this question is NOT related to sqlite. It is a > > general SQL question but I ask it here because I have great respect > > for the answers and discussions I have seen on this forum over many > > years. I rarely post myself as there are always several contributors > > who beat me to the answer and often their response is far better than > > mine would have been. I'm not a code leper, I don't need the actual > > SQL just the method really, though a short code example would be well > > received for illustration. > > > > Any thoughts? Suggestions? Missiles? Good approach, bad approach, or > > completely off the grid? I do use sqlite quite a bit, but not on this > > particular project. > > I point you to sqlite closure extension. It may shows you some ideas for > tree implementation and parent/child relations under sql/sqlite. > > http://www.sqlite.org/src/artifact/636024302cde41b2bf0c542f81c40c624cfb7012 > http://www.sqlite.org/src/finfo?name=ext/misc/closure.c > > HTH > > > > > -- > > Regards, > > Michael.j.Falconer. > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > --- --- > Eduardo Morras > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Regards, Michael.j.Falconer. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Search query alternatives.
On Thu, 16 Oct 2014 09:05:51 +1100 Michael Falconerwrote: > Hi all, > > first off I must start with an apology. I know I'm sort of doing the > wrong thing here as this question is NOT related to sqlite. It is a > general SQL question but I ask it here because I have great respect > for the answers and discussions I have seen on this forum over many > years. I rarely post myself as there are always several contributors > who beat me to the answer and often their response is far better than > mine would have been. I'm not a code leper, I don't need the actual > SQL just the method really, though a short code example would be well > received for illustration. > > Any thoughts? Suggestions? Missiles? Good approach, bad approach, or > completely off the grid? I do use sqlite quite a bit, but not on this > particular project. I point you to sqlite closure extension. It may shows you some ideas for tree implementation and parent/child relations under sql/sqlite. http://www.sqlite.org/src/artifact/636024302cde41b2bf0c542f81c40c624cfb7012 http://www.sqlite.org/src/finfo?name=ext/misc/closure.c HTH > > -- > Regards, > Michael.j.Falconer. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users --- --- Eduardo Morras ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Search query alternatives.
Thanks Stephen, good point, I was just after some general results, I do take your point about caching etc. However it is logical to apply UNION ALL if appropriate in preference to UNION which in this context is a bit lazy. I'm a bit annoyed I didn't pick up on it myself, but thankful to Igor for reminding me of the difference in the two statements. On 16 October 2014 11:14, Stephen Chrzanowskiwrote: > Careful with the timing. You may be looking at OS memory caching the > result set instead of pulling from the drive. For best bets, either re-run > both queries several times, ditch the longest and shortest times, then take > the mean or average times and do the comparison that way. > > On Wed, Oct 15, 2014 at 7:54 PM, Michael Falconer < > michael.j.falco...@gmail.com> wrote: > > > Igor, > > > > Nice one. A quick test using 10 lookup trees and the same search > criteria: > > *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)* > > > > Now with UNION ALL replacing UNION: > > *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)* > > > > You weren't kidding about cheaper were you? LOL > > > > > > On 16 October 2014 10:30, Michael Falconer > > > wrote: > > > > > Yes, I agree with the general sentiment. It is not exactly the design I > > > would have chosen either, but it is what exists. Design change is > > occurring > > > but for the moment we are stuck with the current schema. > > > > > > If you cannot change the schemata to be more suitable, then your > > >> demonstrated queries are very near as good as it gets. I can think of > > other > > >> ways to achieve the same, but nothing that would be more efficient to > a > > >> tangible level. Maybe someone else here can come up with something a > bit > > >> more tight. > > >> > > > You are preaching to the converted I'm afraid, but I appreciate the > > points > > > you made. The quoted paragraph probably answers most of my questions > > > really, except for this from Igor: > > > > > > Unless you do expect duplicates and need to eliminate them, use UNION > ALL > > >> - it's much cheaper (this is assuming you insist on keeping multiple > > >> tables). > > >> > > > Cheaper because it drops the operation to remove the duplicate records > > > from the result set? I'm assuming. I'll give that a try. > > > > > > On 16 October 2014 10:05, RSmith wrote: > > > > > >> > > >> On 2014/10/16 00:05, Michael Falconer wrote: > > >> > > >>> Hi all, > > >>> > > >>> first off I must start with an apology. I know I'm sort of doing the > > >>> wrong//... > > >>> > > >> > > >> No need to apologise, this flies quite close to the central theme. > > >> Whether you are using SQLite or any other SQL RDBMS, this is horrible > DB > > >> design and it is so for precisely the reasons you are asking the > > question. > > >> Tables should not be dynamic: The simple rule of thumb being - if you > > >> cannot tell (or at least accurately predict) before-hand exactly how > > many > > >> tables will be in the final system, then you are doing it wrong. Why > > not > > >> have one set of tables and in stead of preceding each of them with a > > >> , simply add a column that can host the as a simple > > value, > > >> which will immediately make your life very much easier and get the SQL > > >> engine to do the work you are now trying to compensate for in a > > lop-sided > > >> manual kind of way. > > >> > > >> I can only imagine the amount of code you invested into your system to > > >> track and deal with these dynamic sets of tables, so I know changing > it > > >> will seem like a rather large undertaking, but really it will mostly > > >> involve removing loads of code to end up with a small set of simple > > queries > > >> that does all the work for you. > > >> > > >> The only reason I can imagine this sort of breakdown useful is if your > > >> tree tables are all really really huge, like Gigabytes, and so there > > might > > >> be some efficiency to be gained from splitting it up, but I doubt this > > is > > >> the case. > > >> > > >> Just imagine the ease of those search queries you demonstrated... one > > >> query, no UNIONs, a single set of joins and an extra where-clause > > check... > > >> Add to that increased efficiency at finding results (run one query in > > stead > > >> of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one > > set > > >> of tables as per your list, add column(s) for prefixing as needed, > > import > > >> all the tables from some large set of them (specifying additionally > the > > >> column for the prefix) and ten run search queries on them. I am > willing > > to > > >> bet on the fact it will be much quicker in addition to being much > > simpler. > > >> > > >> If you cannot change the schemata to be more suitable, then your > > >> demonstrated queries are very near as good as it gets. I can think of > > other > > >> ways to achieve the same, but nothing that would be
Re: [sqlite] Search query alternatives.
Careful with the timing. You may be looking at OS memory caching the result set instead of pulling from the drive. For best bets, either re-run both queries several times, ditch the longest and shortest times, then take the mean or average times and do the comparison that way. On Wed, Oct 15, 2014 at 7:54 PM, Michael Falconer < michael.j.falco...@gmail.com> wrote: > Igor, > > Nice one. A quick test using 10 lookup trees and the same search criteria: > *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)* > > Now with UNION ALL replacing UNION: > *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)* > > You weren't kidding about cheaper were you? LOL > > > On 16 October 2014 10:30, Michael Falconer> wrote: > > > Yes, I agree with the general sentiment. It is not exactly the design I > > would have chosen either, but it is what exists. Design change is > occurring > > but for the moment we are stuck with the current schema. > > > > If you cannot change the schemata to be more suitable, then your > >> demonstrated queries are very near as good as it gets. I can think of > other > >> ways to achieve the same, but nothing that would be more efficient to a > >> tangible level. Maybe someone else here can come up with something a bit > >> more tight. > >> > > You are preaching to the converted I'm afraid, but I appreciate the > points > > you made. The quoted paragraph probably answers most of my questions > > really, except for this from Igor: > > > > Unless you do expect duplicates and need to eliminate them, use UNION ALL > >> - it's much cheaper (this is assuming you insist on keeping multiple > >> tables). > >> > > Cheaper because it drops the operation to remove the duplicate records > > from the result set? I'm assuming. I'll give that a try. > > > > On 16 October 2014 10:05, RSmith wrote: > > > >> > >> On 2014/10/16 00:05, Michael Falconer wrote: > >> > >>> Hi all, > >>> > >>> first off I must start with an apology. I know I'm sort of doing the > >>> wrong//... > >>> > >> > >> No need to apologise, this flies quite close to the central theme. > >> Whether you are using SQLite or any other SQL RDBMS, this is horrible DB > >> design and it is so for precisely the reasons you are asking the > question. > >> Tables should not be dynamic: The simple rule of thumb being - if you > >> cannot tell (or at least accurately predict) before-hand exactly how > many > >> tables will be in the final system, then you are doing it wrong. Why > not > >> have one set of tables and in stead of preceding each of them with a > >> , simply add a column that can host the as a simple > value, > >> which will immediately make your life very much easier and get the SQL > >> engine to do the work you are now trying to compensate for in a > lop-sided > >> manual kind of way. > >> > >> I can only imagine the amount of code you invested into your system to > >> track and deal with these dynamic sets of tables, so I know changing it > >> will seem like a rather large undertaking, but really it will mostly > >> involve removing loads of code to end up with a small set of simple > queries > >> that does all the work for you. > >> > >> The only reason I can imagine this sort of breakdown useful is if your > >> tree tables are all really really huge, like Gigabytes, and so there > might > >> be some efficiency to be gained from splitting it up, but I doubt this > is > >> the case. > >> > >> Just imagine the ease of those search queries you demonstrated... one > >> query, no UNIONs, a single set of joins and an extra where-clause > check... > >> Add to that increased efficiency at finding results (run one query in > stead > >> of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one > set > >> of tables as per your list, add column(s) for prefixing as needed, > import > >> all the tables from some large set of them (specifying additionally the > >> column for the prefix) and ten run search queries on them. I am willing > to > >> bet on the fact it will be much quicker in addition to being much > simpler. > >> > >> If you cannot change the schemata to be more suitable, then your > >> demonstrated queries are very near as good as it gets. I can think of > other > >> ways to achieve the same, but nothing that would be more efficient to a > >> tangible level. Maybe someone else here can come up with something a bit > >> more tight. > >> > >> > >> thing here as this question is NOT related to sqlite. It is a general > SQL > >>> question but I ask it here because I have great respect for the answers > >>> and > >>> discussions I have seen on this forum over many years. I rarely post > >>> myself > >>> as there are always several contributors who beat me to the answer and > >>> often their response is far better than mine would have been. I'm not a > >>> code leper, I don't need the actual SQL just the method really, though > a > >>> short code example would be well
Re: [sqlite] Search query alternatives.
Igor, Nice one. A quick test using 10 lookup trees and the same search criteria: *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)* Now with UNION ALL replacing UNION: *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)* You weren't kidding about cheaper were you? LOL On 16 October 2014 10:30, Michael Falconerwrote: > Yes, I agree with the general sentiment. It is not exactly the design I > would have chosen either, but it is what exists. Design change is occurring > but for the moment we are stuck with the current schema. > > If you cannot change the schemata to be more suitable, then your >> demonstrated queries are very near as good as it gets. I can think of other >> ways to achieve the same, but nothing that would be more efficient to a >> tangible level. Maybe someone else here can come up with something a bit >> more tight. >> > You are preaching to the converted I'm afraid, but I appreciate the points > you made. The quoted paragraph probably answers most of my questions > really, except for this from Igor: > > Unless you do expect duplicates and need to eliminate them, use UNION ALL >> - it's much cheaper (this is assuming you insist on keeping multiple >> tables). >> > Cheaper because it drops the operation to remove the duplicate records > from the result set? I'm assuming. I'll give that a try. > > On 16 October 2014 10:05, RSmith wrote: > >> >> On 2014/10/16 00:05, Michael Falconer wrote: >> >>> Hi all, >>> >>> first off I must start with an apology. I know I'm sort of doing the >>> wrong//... >>> >> >> No need to apologise, this flies quite close to the central theme. >> Whether you are using SQLite or any other SQL RDBMS, this is horrible DB >> design and it is so for precisely the reasons you are asking the question. >> Tables should not be dynamic: The simple rule of thumb being - if you >> cannot tell (or at least accurately predict) before-hand exactly how many >> tables will be in the final system, then you are doing it wrong. Why not >> have one set of tables and in stead of preceding each of them with a >> , simply add a column that can host the as a simple value, >> which will immediately make your life very much easier and get the SQL >> engine to do the work you are now trying to compensate for in a lop-sided >> manual kind of way. >> >> I can only imagine the amount of code you invested into your system to >> track and deal with these dynamic sets of tables, so I know changing it >> will seem like a rather large undertaking, but really it will mostly >> involve removing loads of code to end up with a small set of simple queries >> that does all the work for you. >> >> The only reason I can imagine this sort of breakdown useful is if your >> tree tables are all really really huge, like Gigabytes, and so there might >> be some efficiency to be gained from splitting it up, but I doubt this is >> the case. >> >> Just imagine the ease of those search queries you demonstrated... one >> query, no UNIONs, a single set of joins and an extra where-clause check... >> Add to that increased efficiency at finding results (run one query in stead >> of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one set >> of tables as per your list, add column(s) for prefixing as needed, import >> all the tables from some large set of them (specifying additionally the >> column for the prefix) and ten run search queries on them. I am willing to >> bet on the fact it will be much quicker in addition to being much simpler. >> >> If you cannot change the schemata to be more suitable, then your >> demonstrated queries are very near as good as it gets. I can think of other >> ways to achieve the same, but nothing that would be more efficient to a >> tangible level. Maybe someone else here can come up with something a bit >> more tight. >> >> >> thing here as this question is NOT related to sqlite. It is a general SQL >>> question but I ask it here because I have great respect for the answers >>> and >>> discussions I have seen on this forum over many years. I rarely post >>> myself >>> as there are always several contributors who beat me to the answer and >>> often their response is far better than mine would have been. I'm not a >>> code leper, I don't need the actual SQL just the method really, though a >>> short code example would be well received for illustration. >>> >>> It's about a search performed on multiple tables. However the structure >>> of >>> this database is somewhat unorthodox. It contains genealogical data and >>> this is clustered into trees. Each tree has 8 tables, there can be as >>> many >>> as 100 trees but most installations of the associated application >>> software >>> contain between 5 - 50 trees. These 8 tables contain a family trees data >>> and are named: >>> addresses >>> connections >>> events >>> family >>> person >>> repositories >>> sources >>> texts >>> >>> The changes and is unique for each
Re: [sqlite] Search query alternatives.
Yes, I agree with the general sentiment. It is not exactly the design I would have chosen either, but it is what exists. Design change is occurring but for the moment we are stuck with the current schema. If you cannot change the schemata to be more suitable, then your > demonstrated queries are very near as good as it gets. I can think of other > ways to achieve the same, but nothing that would be more efficient to a > tangible level. Maybe someone else here can come up with something a bit > more tight. > You are preaching to the converted I'm afraid, but I appreciate the points you made. The quoted paragraph probably answers most of my questions really, except for this from Igor: Unless you do expect duplicates and need to eliminate them, use UNION ALL - > it's much cheaper (this is assuming you insist on keeping multiple tables). > Cheaper because it drops the operation to remove the duplicate records from the result set? I'm assuming. I'll give that a try. On 16 October 2014 10:05, RSmithwrote: > > On 2014/10/16 00:05, Michael Falconer wrote: > >> Hi all, >> >> first off I must start with an apology. I know I'm sort of doing the >> wrong//... >> > > No need to apologise, this flies quite close to the central theme. Whether > you are using SQLite or any other SQL RDBMS, this is horrible DB design and > it is so for precisely the reasons you are asking the question. Tables > should not be dynamic: The simple rule of thumb being - if you cannot tell > (or at least accurately predict) before-hand exactly how many tables will > be in the final system, then you are doing it wrong. Why not have one set > of tables and in stead of preceding each of them with a , simply > add a column that can host the as a simple value, which will > immediately make your life very much easier and get the SQL engine to do > the work you are now trying to compensate for in a lop-sided manual kind of > way. > > I can only imagine the amount of code you invested into your system to > track and deal with these dynamic sets of tables, so I know changing it > will seem like a rather large undertaking, but really it will mostly > involve removing loads of code to end up with a small set of simple queries > that does all the work for you. > > The only reason I can imagine this sort of breakdown useful is if your > tree tables are all really really huge, like Gigabytes, and so there might > be some efficiency to be gained from splitting it up, but I doubt this is > the case. > > Just imagine the ease of those search queries you demonstrated... one > query, no UNIONs, a single set of joins and an extra where-clause check... > Add to that increased efficiency at finding results (run one query in stead > of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one set > of tables as per your list, add column(s) for prefixing as needed, import > all the tables from some large set of them (specifying additionally the > column for the prefix) and ten run search queries on them. I am willing to > bet on the fact it will be much quicker in addition to being much simpler. > > If you cannot change the schemata to be more suitable, then your > demonstrated queries are very near as good as it gets. I can think of other > ways to achieve the same, but nothing that would be more efficient to a > tangible level. Maybe someone else here can come up with something a bit > more tight. > > > thing here as this question is NOT related to sqlite. It is a general SQL >> question but I ask it here because I have great respect for the answers >> and >> discussions I have seen on this forum over many years. I rarely post >> myself >> as there are always several contributors who beat me to the answer and >> often their response is far better than mine would have been. I'm not a >> code leper, I don't need the actual SQL just the method really, though a >> short code example would be well received for illustration. >> >> It's about a search performed on multiple tables. However the structure of >> this database is somewhat unorthodox. It contains genealogical data and >> this is clustered into trees. Each tree has 8 tables, there can be as many >> as 100 trees but most installations of the associated application software >> contain between 5 - 50 trees. These 8 tables contain a family trees data >> and are named: >> addresses >> connections >> events >> family >> person >> repositories >> sources >> texts >> >> The changes and is unique for each tree. >> >> There are extensive search options offered to users as well as simple name >> searching and it is one of the best features of the app. It works pretty >> well, so it ain't broke in any sense, we just wonder if there is a better >> way to perform this search in SQL. Is there a general technique which is >> superior either in speed, efficiency or load bearing contexts? >> >> I am sure you can see one of the pitfalls here is the exponential growth >> of >> such a search
Re: [sqlite] Search query alternatives.
On 2014/10/16 00:05, Michael Falconer wrote: Hi all, first off I must start with an apology. I know I'm sort of doing the wrong//... No need to apologise, this flies quite close to the central theme. Whether you are using SQLite or any other SQL RDBMS, this is horrible DB design and it is so for precisely the reasons you are asking the question. Tables should not be dynamic: The simple rule of thumb being - if you cannot tell (or at least accurately predict) before-hand exactly how many tables will be in the final system, then you are doing it wrong. Why not have one set of tables and in stead of preceding each of them with a , simply add a column that can host the as a simple value, which will immediately make your life very much easier and get the SQL engine to do the work you are now trying to compensate for in a lop-sided manual kind of way. I can only imagine the amount of code you invested into your system to track and deal with these dynamic sets of tables, so I know changing it will seem like a rather large undertaking, but really it will mostly involve removing loads of code to end up with a small set of simple queries that does all the work for you. The only reason I can imagine this sort of breakdown useful is if your tree tables are all really really huge, like Gigabytes, and so there might be some efficiency to be gained from splitting it up, but I doubt this is the case. Just imagine the ease of those search queries you demonstrated... one query, no UNIONs, a single set of joins and an extra where-clause check... Add to that increased efficiency at finding results (run one query in stead of 50). Easy-mode. As a proof-of-concept, just make an SQLite DB, one set of tables as per your list, add column(s) for prefixing as needed, import all the tables from some large set of them (specifying additionally the column for the prefix) and ten run search queries on them. I am willing to bet on the fact it will be much quicker in addition to being much simpler. If you cannot change the schemata to be more suitable, then your demonstrated queries are very near as good as it gets. I can think of other ways to achieve the same, but nothing that would be more efficient to a tangible level. Maybe someone else here can come up with something a bit more tight. thing here as this question is NOT related to sqlite. It is a general SQL question but I ask it here because I have great respect for the answers and discussions I have seen on this forum over many years. I rarely post myself as there are always several contributors who beat me to the answer and often their response is far better than mine would have been. I'm not a code leper, I don't need the actual SQL just the method really, though a short code example would be well received for illustration. It's about a search performed on multiple tables. However the structure of this database is somewhat unorthodox. It contains genealogical data and this is clustered into trees. Each tree has 8 tables, there can be as many as 100 trees but most installations of the associated application software contain between 5 - 50 trees. These 8 tables contain a family trees data and are named: addresses connections events family person repositories sources texts The changes and is unique for each tree. There are extensive search options offered to users as well as simple name searching and it is one of the best features of the app. It works pretty well, so it ain't broke in any sense, we just wonder if there is a better way to perform this search in SQL. Is there a general technique which is superior either in speed, efficiency or load bearing contexts? I am sure you can see one of the pitfalls here is the exponential growth of such a search query as both total number of trees and indeed user search criteria increase. For each criteria component, and there are quite a few, the appropriate tables must be queried and results joined. Searches return records of individuals meeting the entered search criteria so the query focuses on the person table as it's anchor, performs searches on required other tables in the tree and joins the results to the person data. The results from each tree search are then UNION'ed to provide the result set. Here is a contrived example of the SQL query code which should make things clearer. The user enters simple search criteria - any person with a last name containing 'mac' and a first name containing the character 'a'. This is the resulting query (generated by php code), which searches a small 4 family tree installation. (SELECT humo1_person.*, event_kind, event_event, address_place, address_zip FROM humo1_person LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id WHERE pers_lastname LIKE '%mac%' AND (pers_firstname LIKE '%a%' OR (event_kind='name' AND event_event LIKE '%a%')
Re: [sqlite] Search query alternatives.
On 10/15/2014 6:05 PM, Michael Falconer wrote: addresses connections events family person repositories sources texts Personally, I'd have one set of tables, each with an extra column containing . Why do you need a separate set of tables, only to UNION them on every request? Instead of sharding by table, shard by row, using "prefix" column as a discriminator. (SELECT humo1_person.*, event_kind, event_event, address_place, address_zip UNION (SELECT humo2_person.*, event_kind, event_event, address_place, Unless you do expect duplicates and need to eliminate them, use UNION ALL - it's much cheaper (this is assuming you insist on keeping multiple tables). -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Search query alternatives.
Hi all, first off I must start with an apology. I know I'm sort of doing the wrong thing here as this question is NOT related to sqlite. It is a general SQL question but I ask it here because I have great respect for the answers and discussions I have seen on this forum over many years. I rarely post myself as there are always several contributors who beat me to the answer and often their response is far better than mine would have been. I'm not a code leper, I don't need the actual SQL just the method really, though a short code example would be well received for illustration. It's about a search performed on multiple tables. However the structure of this database is somewhat unorthodox. It contains genealogical data and this is clustered into trees. Each tree has 8 tables, there can be as many as 100 trees but most installations of the associated application software contain between 5 - 50 trees. These 8 tables contain a family trees data and are named: addresses connections events family person repositories sources texts The changes and is unique for each tree. There are extensive search options offered to users as well as simple name searching and it is one of the best features of the app. It works pretty well, so it ain't broke in any sense, we just wonder if there is a better way to perform this search in SQL. Is there a general technique which is superior either in speed, efficiency or load bearing contexts? I am sure you can see one of the pitfalls here is the exponential growth of such a search query as both total number of trees and indeed user search criteria increase. For each criteria component, and there are quite a few, the appropriate tables must be queried and results joined. Searches return records of individuals meeting the entered search criteria so the query focuses on the person table as it's anchor, performs searches on required other tables in the tree and joins the results to the person data. The results from each tree search are then UNION'ed to provide the result set. Here is a contrived example of the SQL query code which should make things clearer. The user enters simple search criteria - any person with a last name containing 'mac' and a first name containing the character 'a'. This is the resulting query (generated by php code), which searches a small 4 family tree installation. (SELECT humo1_person.*, event_kind, event_event, address_place, address_zip FROM humo1_person LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id WHERE pers_lastname LIKE '%mac%' AND (pers_firstname LIKE '%a%' OR (event_kind='name' AND event_event LIKE '%a%') ) GROUP BY pers_gedcomnumber ) UNION (SELECT humo2_person.*, event_kind, event_event, address_place, address_zip FROM humo2_person LEFT JOIN humo2_events ON pers_gedcomnumber=event_person_id LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_person_id WHERE pers_lastname LIKE '%mac%' AND (pers_firstname LIKE '%a%' OR (event_kind='name' AND event_event LIKE '%a%') ) GROUP BY pers_gedcomnumber ) UNION (SELECT humo3_person.*, event_kind, event_event, address_place, address_zip FROM humo3_person LEFT JOIN humo3_events ON pers_gedcomnumber=event_person_id LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_person_id WHERE pers_lastname LIKE '%mac%' AND (pers_firstname LIKE '%a%' OR (event_kind='name' AND event_event LIKE '%a%') ) GROUP BY pers_gedcomnumber ) UNION (SELECT humo4_person.*, event_kind, event_event, address_place, address_zip FROM humo4_person LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_person_id WHERE pers_lastname LIKE '%mac%' AND (pers_firstname LIKE '%a%' OR (event_kind='name' AND event_event LIKE '%a%') ) GROUP BY pers_gedcomnumber ) ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30 Any thoughts? Suggestions? Missiles? Good approach, bad approach, or completely off the grid? I do use sqlite quite a bit, but not on this particular project. -- Regards, Michael.j.Falconer. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users