On Monday, 12 December, 2016 18:24, Vince Scafaria <vi...@dotalign.com> wrote: > Our application does a lot of SQLite read/write in a background process. > On some environments this hammers the disk I/O (e.g. Task Manager -> > Disk). Do you have suggestions on how we might minimize this? > Perhaps related, one operation that seems to use a lot of the Disk I/O is > when we use INSERT OR REPLACE INTO. We use this as a workaround for > performing an UPDATE together with JOIN, which is not allowed by SQLite. I > feel as though one way we could minimize disk usage would be if we could > use the MSSQL syntax of: > UPDATE x > SET ValueA = y.ValueA > FROM TableOne x > JOIN TableTwo y ON y.Id = x.TableTwoId > WHERE COALESCE(x.ValueA,'') != COALESCE(y.ValueA,''); > Instead we do: > INSERT OR REPLACE INTO TableOne ( Id, TableTwoId, ValueA ) > SELECT x.Id, x.TableTwoId, y.ValueA > FROM TableOne x > JOIN TableTwo y ON y.Id = x.TableTwoId > WHERE COALESCE(x.ValueA,'') != COALESCE(y.ValueA,''); > Our tables all have a primary key on Id. By including Id in the statement, > it does a REPLACE. As I understand it, replace includes a behind-the- > scenes DELETE which triggers a re-indexing. Maybe that is part of the > problem, but I don't see a workaround. Ideas?
I do not see any need for INSERT, you are doing a simple update. Since your select is constrained by existing rows in TableOne (which you are updating), you only need to update, not delete and insert (which you are doing for every row). How about something like: UPDATE TableOne SET ValueA = (SELECT ValueA FROM TableTwo WHERE TableTwo.Id = TableOne.TableTwoID) WHERE COALESCE(TableOne.ValueA, '') != COALESCE((SELECT ValueA FROM TableTwo WHERE TableTwo.Id = TableOne.TableTwoID). '') Note that if your database were properly normalized, you would not have to do this (that is, ValueA is entirely dependant on TableTwoID, and TableTwoID is a foreign key into TableTwo). If you were not duplicating data then this issue would not arise. This is part of the problem with having multiple datastores -- you never know which one is accurate. And if TableOne.ValueA is merely duplicated data (and therefore always to be assumed to be incorrect), then you may as well not have it in TableOne just do a lookup on the fly. If you think that makes your queries look too complicated, then you can define a view which does it for you and only select through the view. If you really must de-normalize your data for some reason, then you should put a trigger on TableOne (and perhaps TableTwo) that keep them in sync when updated (or inserted into) so that you do not have to run updates such as the above. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users