Are your updates sorted by DS? If your queries are sorted then sequential
queries are more likely to hit the same db pages while searching the index,
resulting in higher cache usage and fewer decompression operations. This would
have less benefit if your 100k DS values of the updates are randomly
distributed through the 30m available, and more of an effect if they're tightly
clustered in some range.
If you mainly query by DS, is that the same as saying it is your primary key?
If so, have you tried to declare it so and try the table WITHOUT ROWID, and
don't bother with the index? It may help since you wouldn't have to decompress
both the index pages and the table pages.
> On 8 Sep 2017, at 12:33 pm, Dominique Pellé <dominique.pe...@gmail.com> wrote:
> Yue Wu <yue...@datascan.com> wrote:
>> As mentioned in the subject, our goal is to improve performance regarding
>> to batch sql updates.
>> The update sql as follow,
>>> UPDATE ITEM SET FIELD4 =? WHERE DS=?
>> We run 100,000 updates in a single transaction. The zipvfs version takes
>> about 20 min while uncompressed version takes about 7 min.
> Which compression algorithm do you use with zipvfs?
> Try LZ4, it's is times faster than zlib at compression and
> decompression, at the cost of compressing a bit less:
> Or try zstd, which can compress more than zlib and faster (especially
> on 64-bit architectures), but it's not as fast as LZ4:
> What is also your SQLite page size?
> Reducing the SQLite page size probably helps to speed up
> updates, since zipvfs compresses by pages.
> sqlite-users mailing list
sqlite-users mailing list