Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-12 Thread Yue
Hello,
Thanks for your advice!
It works pretty well when I dropped the rowid and set the other field as
primary key.
I also enable wal mode at the cost of a little bit read performance.
Overall, it works really well now.

Really appreciate your help.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-12 Thread Yue
Thanks for the advice!

I could never thought about this. Really learnt a lot.

It's a customized Android phone type device.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-11 Thread Eric Grange
> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE
DS = 15;".

Not zipvfs specific, but that kind of update can be quite inefficient if
the record is large (in terms of bytes, or records per database page) as
your table declaration hints to.

This will be especially prominent with compression, as a single byte
changed early in the uncompressed stream can lead to changing pretty much
every byte of the compressed data, and so is bound to defeat any
optimizations the storage could have to handle unmodified pages and data.

Depending on your update patterns and frequencies, it could thus be
beneficial to split your table into two tables (or more) with a common
primary key:
- one table for the rarely changed fields (even better if they are
immutable)
- one table (or more) for the fields that change often

This will increase the database size and reduce insert performance, and it
can complicate querying when you need everything (extra inner joins), but
it can help reduce the amount of raw data that is affected by updates quite
drastically.

I have been able to use the above strategy sucessfully, however it was
under Windows compressed folders, so YMMV.


On Thu, Sep 7, 2017 at 9:13 PM, Yue Wu  wrote:

> Hello,
> As mentioned in the subject, our goal is to improve performance regarding
> to batch sql updates.
> The update sql as follow,
>
> > UPDATE ITEM SET FIELD4 =? WHERE DS=?
>
> We run 100,000 updates in a single transaction. The zipvfs version takes
> about 20 min while uncompressed version takes about 7 min.
> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS
> = 15;".
>
> Any suggestions is very helpful.
>
> Some background:
> We have an app running on Android 4.2 using zipvfs.
> The page size is 4096
> The cache size is - 2000
>
> Table schema for ITEM
>
> > CREATE TABLE ITEM (
> > FIELD0 NUMERIC,
> > FIELD1 NUMERIC,
> > DS TEXT,
> > FIELD2 TEXT,
> > FIELD3 TEXT,
> > FIELD4 NUMERIC,
> > FIELD5 NUMERIC,
> > FIELD6 NUMERIC,
> > FIELD7 NUMERIC,
> > FIELD8 NUMERIC,
> > FIELD9 NUMERIC,
> > FIELD10 NUMERIC,
> > FIELD11 TEXT);
>
>
> The third column: "DS" is what we query by almost all the time. We also
> created index:
>
> > CREATE INDEX DS_INDEX ON ITEM(DS);
>
>
> There are 30 million records. Zipvfs size of 471mb vs uncompressed db (the
> one shipped with Android 4.2) size of 1.39gb.
> Zipvfs db using zlib and aes128, which are default.
>
> Thanks
> --
>
> Yue Wu |  Android Developer
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-09 Thread Simon Slavin


On 9 Sep 2017, at 9:11pm, Teg  wrote:

> Is  it a phone or tablet?  I'd wonder if the CPU is going into thermal
> throttle  mode.  As I understand it most phones and tablets can't keep
> the CPU's running full speed without eventually throttling the CPU.

I would never have thought of that unless I had the device in my hands.  Well 
done.

Yes.  You’re trying to process 100,000 updates, and your device is working flat 
out for 7 minutes, or for 20 minutes.  How hot is it ?  If you do it unplugged, 
how much does that drain the battery ?

I don’t have experience with Android but I can tell you that iOS will have to 
throttle any process that attempts to do process data continuously for 7 
minutes.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-09 Thread Teg
Hello Yue,

Is  it a phone or tablet?  I'd wonder if the CPU is going into thermal
throttle  mode.  As I understand it most phones and tablets can't keep
the CPU's running full speed without eventually throttling the CPU.

To me the fact uncompressed, while moving much more data, is more than 2
times faster suggests the bottleneck isn't the fileIO.

Thursday, September 7, 2017, 3:13:04 PM, you wrote:

YW> Hello,
YW> As mentioned in the subject, our goal is to improve performance regarding
YW> to batch sql updates.
YW> The update sql as follow,

>> UPDATE ITEM SET FIELD4 =? WHERE DS=?

YW> We run 100,000 updates in a single transaction. The zipvfs version takes
YW> about 20 min while uncompressed version takes about 7 min.
YW> I've also attached results of "EXPLAIN UPDATE ITEM SET FIELD4 = 10 WHERE DS
YW> = 15;".

YW> Any suggestions is very helpful.

YW> Some background:
YW> We have an app running on Android 4.2 using zipvfs.
YW> The page size is 4096
YW> The cache size is - 2000

YW> Table schema for ITEM

>> CREATE TABLE ITEM (
>> FIELD0 NUMERIC,
>> FIELD1 NUMERIC,
>> DS TEXT,
>> FIELD2 TEXT,
>> FIELD3 TEXT,
>> FIELD4 NUMERIC,
>> FIELD5 NUMERIC,
>> FIELD6 NUMERIC,
>> FIELD7 NUMERIC,
>> FIELD8 NUMERIC,
>> FIELD9 NUMERIC,
>> FIELD10 NUMERIC,
>> FIELD11 TEXT);


YW> The third column: "DS" is what we query by almost all the time. We also
YW> created index:

>> CREATE INDEX DS_INDEX ON ITEM(DS);


YW> There are 30 million records. Zipvfs size of 471mb vs uncompressed db (the
YW> one shipped with Android 4.2) size of 1.39gb.
YW> Zipvfs db using zlib and aes128, which are default.

YW> Thanks



-- 
 Tegmailto:t...@djii.com

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-09 Thread Yue
Thanks for the advice. I'll try them now.



--
Sent from: http://sqlite.1065341.n5.nabble.com/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-08 Thread Barry Smith
Are your updates sorted by DS? If your queries are sorted then sequential 
queries are more likely to hit the same db pages while searching the index, 
resulting in higher cache usage and fewer decompression operations. This would 
have less benefit if your 100k DS values of the updates are randomly 
distributed through the 30m available, and more of an effect if they're tightly 
clustered in some range.

If you mainly query by DS, is that the same as saying it is your primary key? 
If so, have you tried to declare it so and try the table WITHOUT ROWID, and 
don't bother with the index? It may help since you wouldn't have to decompress 
both the index pages and the table pages.

> On 8 Sep 2017, at 12:33 pm, Dominique Pellé  wrote:
> 
> Yue Wu  wrote:
> 
>> Hello,
>> As mentioned in the subject, our goal is to improve performance regarding
>> to batch sql updates.
>> The update sql as follow,
>> 
>>> UPDATE ITEM SET FIELD4 =? WHERE DS=?
>> 
>> We run 100,000 updates in a single transaction. The zipvfs version takes
>> about 20 min while uncompressed version takes about 7 min.
> 
> Which compression algorithm do you use with zipvfs?
> Try LZ4, it's is times faster than zlib at compression and
> decompression, at the cost of compressing a bit less:
> 
> https://github.com/lz4/lz4
> 
> Or try zstd, which can compress more than zlib and faster (especially
> on 64-bit architectures), but it's not as fast as LZ4:
> 
> http://facebook.github.io/zstd/
> 
> What is also your SQLite page size?
> Reducing the SQLite page size probably helps to speed up
> updates, since zipvfs compresses by pages.
> 
> Dominique
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-08 Thread Dominique Pellé
Yue Wu  wrote:

> Hello,
> As mentioned in the subject, our goal is to improve performance regarding
> to batch sql updates.
> The update sql as follow,
>
>> UPDATE ITEM SET FIELD4 =? WHERE DS=?
>
> We run 100,000 updates in a single transaction. The zipvfs version takes
> about 20 min while uncompressed version takes about 7 min.

Which compression algorithm do you use with zipvfs?
Try LZ4, it's is times faster than zlib at compression and
decompression, at the cost of compressing a bit less:

https://github.com/lz4/lz4

Or try zstd, which can compress more than zlib and faster (especially
on 64-bit architectures), but it's not as fast as LZ4:

http://facebook.github.io/zstd/

What is also your SQLite page size?
Reducing the SQLite page size probably helps to speed up
updates, since zipvfs compresses by pages.

Dominique
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users