Re: [sqlite] Most efficient way to detect on-disk change

2017-11-08 Thread Wout Mertens
Thank you all! I really have to do a thorough read of all the pragmas,
there are so many useful things in there! The user version sounds exactly
like what I should be using for storing the db version, and presumably the
data_version is a little faster still than reading the user version.

@Keith, the reason I'm caching is because the data itself is reasonably
small, and while I can query + parse the JSON in <2ms, using cached data
from memory takes microseconds…

Besides, memory is relatively cheap these days, and in general (safe)
caching is beneficial. I like reading about
https://en.wikipedia.org/wiki/Cache-oblivious_algorithm - any amount of
cache can improve performance with these…

On Wed, Nov 8, 2017 at 12:22 PM Dan Kennedy  wrote:

> On 11/08/2017 03:55 PM, Dominique Devienne wrote:
> > On Wed, Nov 8, 2017 at 7:45 AM, Dan Kennedy 
> wrote:
> >
> >> On 7 Nov 2017, at 6:53pm, David Raymond 
> wrote:
> >>> I think pragma data_version is what you're looking for.
>  http://www.sqlite.org/pragma.html#pragma_data_version
> 
> >> I think it's the opposite. For connection A, the value of "PRAGMA
> >> data_version" does not change as a result of commits by connection A. It
> >> changes if the db is modified by any other connection, regardless of
> >> whether or not that other connection resides in a different process or
> not.
> >>
> >> "The integer values returned by two invocations of "PRAGMA data_version"
> >> from the same connection will be different if changes were committed to
> the
> >> database by any other connection in the interim. The "PRAGMA
> data_version"
> >> value is unchanged for commits made on the same database connection."
> >
> > Hi Dan. So you confirm David's answer, provided OP also tracks change
> made
> > by the local connection, in addition to tracking pragma data_version?
>
> That's right.
>
> The original use case was an application-level cache of objects
> associated with a single database connection. The cache should be
> invalidated whenever the database is written. So the app would:
>
>a) invalidate the cache whenever it wrote to the db, and
>b) checked that "PRAGMA data_version" has not changed before using an
> object from the cache (and invalidating the entire cache it if it had).
>
> I guess the logic was that the app could implement more fine-grained
> cache invalidation in (a) if required.
>
> Dan.
>
>
>
>
>
>
>
> > I just want to make sure I understand your answer correctly. Thanks, --DD
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-08 Thread Dan Kennedy

On 11/08/2017 03:55 PM, Dominique Devienne wrote:

On Wed, Nov 8, 2017 at 7:45 AM, Dan Kennedy  wrote:


On 7 Nov 2017, at 6:53pm, David Raymond  wrote:

I think pragma data_version is what you're looking for.

http://www.sqlite.org/pragma.html#pragma_data_version


I think it's the opposite. For connection A, the value of "PRAGMA
data_version" does not change as a result of commits by connection A. It
changes if the db is modified by any other connection, regardless of
whether or not that other connection resides in a different process or not.

"The integer values returned by two invocations of "PRAGMA data_version"
from the same connection will be different if changes were committed to the
database by any other connection in the interim. The "PRAGMA data_version"
value is unchanged for commits made on the same database connection."


Hi Dan. So you confirm David's answer, provided OP also tracks change made
by the local connection, in addition to tracking pragma data_version?


That's right.

The original use case was an application-level cache of objects 
associated with a single database connection. The cache should be 
invalidated whenever the database is written. So the app would:


  a) invalidate the cache whenever it wrote to the db, and
  b) checked that "PRAGMA data_version" has not changed before using an 
object from the cache (and invalidating the entire cache it if it had).


I guess the logic was that the app could implement more fine-grained 
cache invalidation in (a) if required.


Dan.








I just want to make sure I understand your answer correctly. Thanks, --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-08 Thread Dominique Devienne
On Wed, Nov 8, 2017 at 7:45 AM, Dan Kennedy  wrote:

> On 7 Nov 2017, at 6:53pm, David Raymond  wrote:
>>
>> I think pragma data_version is what you're looking for.
>>> http://www.sqlite.org/pragma.html#pragma_data_version
>>>
>>
> I think it's the opposite. For connection A, the value of "PRAGMA
> data_version" does not change as a result of commits by connection A. It
> changes if the db is modified by any other connection, regardless of
> whether or not that other connection resides in a different process or not.
>
> "The integer values returned by two invocations of "PRAGMA data_version"
> from the same connection will be different if changes were committed to the
> database by any other connection in the interim. The "PRAGMA data_version"
> value is unchanged for commits made on the same database connection."


Hi Dan. So you confirm David's answer, provided OP also tracks change made
by the local connection, in addition to tracking pragma data_version?
I just want to make sure I understand your answer correctly. Thanks, --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Dan Kennedy

On 11/08/2017 02:41 AM, Simon Slavin wrote:


On 7 Nov 2017, at 6:53pm, David Raymond  wrote:


I think pragma data_version is what you're looking for.
http://www.sqlite.org/pragma.html#pragma_data_version

An excellent solution, though there’s a /caveat/.  From the original post:


Right now everything's a single process, so it's really easy, just clear
the cache on every write. However, I want to be prepared for the near
future where I will have multiple processes using this db file.

Documentation for the PRAGMA says

" The "PRAGMA data_version" value is a local property of each database connection and so 
values returned by two concurrent invocations of "PRAGMA data_version" on separate database 
connections are often different even though the underlying database is identical."

So when you convert your code to use multiple processes, they must all use the 
same connection for this to work properly.  That means you will have to invent 
a method to ensure that only one of them makes changes at a time.  That might 
not be what you wanted to do.



I think it's the opposite. For connection A, the value of "PRAGMA 
data_version" does not change as a result of commits by connection A. It 
changes if the db is modified by any other connection, regardless of 
whether or not that other connection resides in a different process or not.


"The integer values returned by two invocations of "PRAGMA data_version" 
from the same connection will be different if changes were committed to 
the database by any other connection in the interim. The "PRAGMA 
data_version" value is unchanged for commits made on the same database 
connection."


Dan.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Simon Slavin


On 7 Nov 2017, at 11:12pm, Jens Alfke  wrote:

> On Nov 7, 2017, at 2:39 PM, Keith Medcalf  wrote:
> 
>> What advantage does your third-level (application) cache provide that is not 
>> provided by the two lower level caches?
> 
> You’re being presumptuous here. It’s pretty common for presentation-level 
> data to be very expensive to recompute; a common example is partial or full 
> web pages. Rendering image, like graphs, can also be expensive. Or hey, how 
> about the results of a complex SQL query?

It’s caching at two different levels.  The level Keith is talking about happens 
when you submit a SQL query and cache the data retrieved.  The level Jens is 
talking about happens when you submit a SQL query, process the data, then cache 
the results of the processing.

And it’s all highly dependent on the hardware, the amount of data and the type 
of processing.  And we don’t know whether the OP did premature optimisation.  
So let’s forget this threadlet until and unless we have relevant data.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Jens Alfke


> On Nov 7, 2017, at 2:39 PM, Keith Medcalf  wrote:
> 
> What advantage does your third-level (application) cache provide that is not 
> provided by the two lower level caches?

You’re being presumptuous here. It’s pretty common for presentation-level data 
to be very expensive to recompute; a common example is partial or full web 
pages. Rendering image, like graphs, can also be expensive. Or hey, how about 
the results of a complex SQL query?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Eduardo Morras
On Tue, 07 Nov 2017 18:07:42 +
Wout Mertens  wrote:

> I'm working with a db that's only written to in transations, and each
> transaction increases a db-global version counter.
> 
> This means that I can cache all reads, unless the version changed.
> 
> What would be the most efficient way to make sure I *never* serve
> stale data?
> 
> Right now everything's a single process, so it's really easy, just
> clear the cache on every write. However, I want to be prepared for
> the near future where I will have multiple processes using this db
> file.
> 
> I'm thinking that to detect writes, this might be a safe approach:
> 
> Before serving any cached read, check the timestamp on the wal file.
> If it changed, read the global version. If it changed, clear the
> cache. Otherwise, serve the cached read.
> 
> Is it safe to assume that all writes would mean change of the wal file
> timestamp?
> More importantly, is it faster to check the timestamp or would a
> prepared query for the version actually be faster (and safer)?
> 
> Also, I'm using WAL right now, but I wonder if that's really useful
> given the single-writer-at-a-time?

You can define triggers on insert, update and delete that fires a user defined 
function that warns your other threads or an external process (I use a similar 
setup on a AS400/DB2). 
Something like this:

CREATE TRIGGER tg_night_watcher_insert AFTER INSERT ON table_name_to_watch_up
 BEGIN
  SELECT your_nigth_watcher();
 END

HTH

> 
> Thank you for your insights,
> 
> Wout.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


---   ---
Eduardo Morras 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread J Decker
in linux inotify - http://man7.org/linux/man-pages/man7/inotify.7.html
in windows  FindFirstChangeNotification/FindNextChangeNotification
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365261(v=vs.85).aspx

if you wait for an actual change before checking to see if there really was
a change
and can prevent arbitrary polling

On Tue, Nov 7, 2017 at 10:07 AM, Wout Mertens 
wrote:

> I'm working with a db that's only written to in transations, and each
> transaction increases a db-global version counter.
>
> This means that I can cache all reads, unless the version changed.
>
> What would be the most efficient way to make sure I *never* serve stale
> data?
>
> Right now everything's a single process, so it's really easy, just clear
> the cache on every write. However, I want to be prepared for the near
> future where I will have multiple processes using this db file.
>
> I'm thinking that to detect writes, this might be a safe approach:
>
> Before serving any cached read, check the timestamp on the wal file. If it
> changed, read the global version. If it changed, clear the cache.
> Otherwise, serve the cached read.
>
> Is it safe to assume that all writes would mean change of the wal file
> timestamp?
> More importantly, is it faster to check the timestamp or would a prepared
> query for the version actually be faster (and safer)?
>
> Also, I'm using WAL right now, but I wonder if that's really useful given
> the single-writer-at-a-time?
>
> Thank you for your insights,
>
> Wout.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Keith Medcalf

So you are caching data at the application level that is cached at the database 
page cache level which is cached in the Operating System file cache that lives 
in a file residing on disk -- effectively storing three copies of the data in 
memory.

What advantage does your third-level (application) cache provide that is not 
provided by the two lower level caches?  In other words if the data is already 
stored in the page cache and (that which isn't) is already in the OS file cache 
(which it must be since you read the data once already), what benefit does the 
third-level cache provide other than add the overhead and complication of its 
management?

Likely the most "efficient" way to ensure you do not serve stale data is to get 
rid of the application level caching and simply re-retrieve the data when you 
need it.  Unless of course you are seriously memory constrained such at running 
this on a model 5150.  Or your query's are very long and complicated.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Wout Mertens
>Sent: Tuesday, 7 November, 2017 11:08
>To: SQLite mailing list
>Subject: [sqlite] Most efficient way to detect on-disk change
>
>I'm working with a db that's only written to in transations, and each
>transaction increases a db-global version counter.
>
>This means that I can cache all reads, unless the version changed.
>
>What would be the most efficient way to make sure I *never* serve
>stale
>data?
>
>Right now everything's a single process, so it's really easy, just
>clear
>the cache on every write. However, I want to be prepared for the near
>future where I will have multiple processes using this db file.
>
>I'm thinking that to detect writes, this might be a safe approach:
>
>Before serving any cached read, check the timestamp on the wal file.
>If it
>changed, read the global version. If it changed, clear the cache.
>Otherwise, serve the cached read.
>
>Is it safe to assume that all writes would mean change of the wal
>file
>timestamp?
>More importantly, is it faster to check the timestamp or would a
>prepared
>query for the version actually be faster (and safer)?
>
>Also, I'm using WAL right now, but I wonder if that's really useful
>given
>the single-writer-at-a-time?
>
>Thank you for your insights,
>
>Wout.
>___
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Jens Alfke


> On Nov 7, 2017, at 11:41 AM, Simon Slavin  wrote:
> 
> So when you convert your code to use multiple processes, they must all use 
> the same connection for this to work properly. 

No; it just means each process will track its own data-version value based on 
its own connection. It doesn’t matter if those values are different in 
different processes.

(In any case, it’s impossible for multiple processes to [directly] use the same 
SQLite connection, since a connection is an in-memory object.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread Simon Slavin


On 7 Nov 2017, at 6:53pm, David Raymond  wrote:

> I think pragma data_version is what you're looking for.
> http://www.sqlite.org/pragma.html#pragma_data_version

An excellent solution, though there’s a /caveat/.  From the original post:

> Right now everything's a single process, so it's really easy, just clear
> the cache on every write. However, I want to be prepared for the near
> future where I will have multiple processes using this db file.

Documentation for the PRAGMA says

" The "PRAGMA data_version" value is a local property of each database 
connection and so values returned by two concurrent invocations of "PRAGMA 
data_version" on separate database connections are often different even though 
the underlying database is identical. "

So when you convert your code to use multiple processes, they must all use the 
same connection for this to work properly.  That means you will have to invent 
a method to ensure that only one of them makes changes at a time.  That might 
not be what you wanted to do.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Most efficient way to detect on-disk change

2017-11-07 Thread David Raymond
I think pragma data_version is what you're looking for.
http://www.sqlite.org/pragma.html#pragma_data_version


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Wout Mertens
Sent: Tuesday, November 07, 2017 1:08 PM
To: SQLite mailing list
Subject: [sqlite] Most efficient way to detect on-disk change

I'm working with a db that's only written to in transations, and each
transaction increases a db-global version counter.

This means that I can cache all reads, unless the version changed.

What would be the most efficient way to make sure I *never* serve stale
data?

Right now everything's a single process, so it's really easy, just clear
the cache on every write. However, I want to be prepared for the near
future where I will have multiple processes using this db file.

I'm thinking that to detect writes, this might be a safe approach:

Before serving any cached read, check the timestamp on the wal file. If it
changed, read the global version. If it changed, clear the cache.
Otherwise, serve the cached read.

Is it safe to assume that all writes would mean change of the wal file
timestamp?
More importantly, is it faster to check the timestamp or would a prepared
query for the version actually be faster (and safer)?

Also, I'm using WAL right now, but I wonder if that's really useful given
the single-writer-at-a-time?

Thank you for your insights,

Wout.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users