> On Mar 15, 2018, at 12:33 PM, Jens Alfke <[email protected]> wrote:
> 
> I'm wondering what the best way is to efficiently search for data values that 
> can appear multiple times in a table row. SQLite indexes, even expression 
> indexes, don't directly work for this because they obviously only index one 
> value per row. Traditional relational-database design says to normalize the 
> schema by storing the multiple values in separate rows, but what if the data 
> to be indexed is JSON and you need to leave it in that form?
> 
> For example, let's say I have a table like
>       patient_id: 12345
>       temps: "[98.6, 99.1, 101.3, 100.0, 98.9]"
> and I want to run queries on temperature data, like 'patients who've had a 
> temperature above 101'. And I need better than O(n) performance.

Recognize the fact that if you’re storing data in a JSON string, to the 
database that is just one single value: a string.  The database has no 
knowledge and understanding of that value beyond the fact it is a string.  
Asking the database to index or do something with the arbitrarily constructed 
sub-values you’ve created is outside the scope of what the database can do, 
because you’re storing data in a format outside the scope of the database.

If you do want to use database functions and queries to deal with this kind of 
thing, store the data in a way the database understands it and can use it… for 
example, a table that includes “patient_id, timestamp, temp” with “temp” being 
a SINGLE numeric value.  The patient_id column can be a foreign key back the 
full patient record.  Such a design is easier to insert, update, and just about 
everything else.


> In the past my project used map/reduce to support this, essentially 
> implementing its own index system on top of SQLite tables. In this case it 
> would create a table (patient_id integer primary key, temp number) and 
> populate it by scanning the patient table. This can obviously be indexed 
> easily, but updating the table before a query when the source table has 
> changed is a pain in the butt.

Then get rid of the array and just store the values that way for everything.  
Don’t have a “source table.”  It’s a very non-relational way to store data 
anyways.

  -j


_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to