On Monday, 12 December, 2016 18:24, Vince Scafaria <vi...@dotalign.com> wrote:
 
> Our application does a lot of SQLite read/write in a background process.
> On some environments this hammers the disk I/O (e.g. Task Manager ->
> Disk). Do you have suggestions on how we might minimize this?
 
> Perhaps related, one operation that seems to use a lot of the Disk I/O is
> when we use INSERT OR REPLACE INTO. We use this as a workaround for
> performing an UPDATE together with JOIN, which is not allowed by SQLite. I
> feel as though one way we could minimize disk usage would be if we could
> use the MSSQL syntax of:
 
> UPDATE x
> SET ValueA = y.ValueA
> FROM TableOne x
> JOIN TableTwo y ON y.Id = x.TableTwoId
> WHERE COALESCE(x.ValueA,'') != COALESCE(y.ValueA,'');
 
> Instead we do:
 
> INSERT OR REPLACE INTO TableOne ( Id, TableTwoId, ValueA )
> SELECT x.Id, x.TableTwoId, y.ValueA
> FROM TableOne x
> JOIN TableTwo y ON y.Id = x.TableTwoId
> WHERE COALESCE(x.ValueA,'') != COALESCE(y.ValueA,'');
 
> Our tables all have a primary key on Id. By including Id in the statement,
> it does a REPLACE. As I understand it, replace includes a behind-the-
> scenes DELETE which triggers a re-indexing. Maybe that is part of the
> problem, but I don't see a workaround. Ideas?

I do not see any need for INSERT, you are doing a simple update.  Since your 
select is constrained by existing rows in TableOne (which you are updating), 
you only need to update, not delete and insert (which you are doing for every 
row).  How about something like:

UPDATE TableOne
   SET ValueA = (SELECT ValueA
                   FROM TableTwo
                  WHERE TableTwo.Id = TableOne.TableTwoID)
 WHERE COALESCE(TableOne.ValueA, '') != COALESCE((SELECT ValueA
                                                    FROM TableTwo
                                                   WHERE TableTwo.Id = 
TableOne.TableTwoID). '')

Note that if your database were properly normalized, you would not have to do 
this (that is, ValueA is entirely dependant on TableTwoID, and TableTwoID is a 
foreign key into TableTwo).  If you were not duplicating data then this issue 
would not arise.  This is part of the problem with having multiple datastores 
-- you never know which one is accurate.  And if TableOne.ValueA is merely 
duplicated data (and therefore always to be assumed to be incorrect), then you 
may as well not have it in TableOne just do a lookup on the fly.  If you think 
that makes your queries look too complicated, then you can define a view which 
does it for you and only select through the view.

If you really must de-normalize your data for some reason, then you should put 
a trigger on TableOne (and perhaps TableTwo) that keep them in sync when 
updated (or inserted into) so that you do not have to run updates such as the 
above.




_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to