> On Mar 15, 2018, at 12:33 PM, Jens Alfke <[email protected]> wrote: > > I'm wondering what the best way is to efficiently search for data values that > can appear multiple times in a table row. SQLite indexes, even expression > indexes, don't directly work for this because they obviously only index one > value per row. Traditional relational-database design says to normalize the > schema by storing the multiple values in separate rows, but what if the data > to be indexed is JSON and you need to leave it in that form? > > For example, let's say I have a table like > patient_id: 12345 > temps: "[98.6, 99.1, 101.3, 100.0, 98.9]" > and I want to run queries on temperature data, like 'patients who've had a > temperature above 101'. And I need better than O(n) performance.
Recognize the fact that if you’re storing data in a JSON string, to the database that is just one single value: a string. The database has no knowledge and understanding of that value beyond the fact it is a string. Asking the database to index or do something with the arbitrarily constructed sub-values you’ve created is outside the scope of what the database can do, because you’re storing data in a format outside the scope of the database. If you do want to use database functions and queries to deal with this kind of thing, store the data in a way the database understands it and can use it… for example, a table that includes “patient_id, timestamp, temp” with “temp” being a SINGLE numeric value. The patient_id column can be a foreign key back the full patient record. Such a design is easier to insert, update, and just about everything else. > In the past my project used map/reduce to support this, essentially > implementing its own index system on top of SQLite tables. In this case it > would create a table (patient_id integer primary key, temp number) and > populate it by scanning the patient table. This can obviously be indexed > easily, but updating the table before a query when the source table has > changed is a pain in the butt. Then get rid of the array and just store the values that way for everything. Don’t have a “source table.” It’s a very non-relational way to store data anyways. -j _______________________________________________ sqlite-users mailing list [email protected] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

