Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Keith Medcalf

The standard computer answer applies:  The fastest way to perform I/O is not to 
do it (and contrary to recent press, this discovery was not made by Microsoft 
Research a couplke of weeks ago but was "discovered" in the 1950's when, on 
average, a good secretary could find a file in the filing cabinets faster than 
the computer could find the same record in several miles of magnetic tape).

You achieve this by using a disk cache -- preferably a block-cache, however 
this technology has gone by the wayside and periodically "discovered" to be how 
to run a good cache every few years.  It dates back to (once again) the 60's 
and 70's.  Modern crap uses filesystem based caching which is ill-conceived and 
usually more-or-less totally brain-dead.

In any case, your cache size should be tuned so that you average a 90% hit rate 
or better (which with a properly designed block cache is not that hard to 
achieve or very large).  Or, in these days of humongous amounts of medium speed 
(dynamic) RAM, the cache should use all space not otherwise being used for code 
and data working set (if you bought memory and it is "free" as in unused, you 
flushed your money down the toilet).

You do not want to use shared cache.  Shared cache is designed for use in 
really itty bitty bitty bitty bitty boxes where memory is measured in bytes.  
(phones, watches, TVs, hand-held TV remote controls, etc).  If you are using 
something that qualifies as a "computer" then you do not want a shared cache.

Of course bear in mind process memory limits (you may not be able to use more 
than 256MB or 512MB total cache per process on a 32-bit computer that only 
allocates 2 GB of virtual address space per process) and also the fact that 
just because you "said" to use 4 TB of RAM as cache does not mean that 4 TB 
will be used.  A 1 GB file will use a maximum of 1 or perhaps 2 GB of cache 
(depending on your operations) even if you tell to use 4 TB of cache.

So, the optimal answer is often "as much as possible without impacting the 
multiprogramming ratio, especially on Operating Systems of brain-dead design 
from the get-go which favour bad uses of memory over good ones).

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Mark Hamburg
> Sent: Tuesday, 29 November, 2016 11:39
> To: SQLite mailing list
> Subject: Re: [sqlite] Read-only access which does not block writers
> 
> One other question about WAL mode and simultaneous readers and writers:
> How are people setting their page caches? My read is that shared cache is
> probably not what's wanted. I was setting my reader caches to be bigger
> than my writer cache under the assumption that writers write and then move
> on whereas readers would benefit from having more data cached, but I'm now
> thinking that the disk cache should be getting me the latter effect and
> increasing the size of the write cache should allow the writer to run
> longer without having to flush from memory to disk. Is there any standard
> advice in this regard or is this the sort of question where the answer is
> "experiment".
> 
> Mark
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread James K. Lowden
On Thu, 24 Nov 2016 22:59:32 +0100
Florian Weimer  wrote:

> Does LMDB perform lock-free optimistic reads and 
> retroactively verifies that the entire read operation was consistent? 

In LMDB there are readers and reader-writers.  A reader never writes; a
reader-writer may read, but that read counts as a "write" for purposes
of isolation.  

Each actor -- reader or reader-writer -- is isolated from all others.
There are no write-after-read errors or other SQL isolation anomalies
because there is only ever at most 1 transaction in existence capable
of writing.  Any call to begin a second writable transaction blocks
until the first one completes.  

LMDB uses MVCC to provide each reader with a database snapshot.  If a
writer modifies a row that a reader sees, that modification
happens "elsewhere" to a version that the writer sees and the reader
does not.  After the writer commits and no transactions remain
that refer to the superseded version, it ceases to exist.  New readers
thenceforward see the modified version.  

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Simon Slavin

On 29 Nov 2016, at 6:10pm, Mark Hamburg  wrote:

> In the single WAL scenario, what I probably really want is a way for the 
> checkpoint operation on the write connection to do its work then wait for 
> exclusive access — standard reader/writer lock pattern — to do the WAL reset. 
> This would presumably limit the time that the readers were blocked since I 
> would expect the WAL reset to be reasonably quick if the checkpoint copying 
> work were already complete.

Remember that SQLite has no server or background processes.  All your processes 
are messing with the database file in real time while the SQLite API calls are 
executing.

> Furthermore, the write operation would only be blocked for the length of the 
> longest outstanding read so as long I favor big transactions for writes and 
> short queries for reads, the writer shouldn't be blocked for too long either. 
> Are there checkpoint settings that achieve this or do I need to build that 
> logic into my code?

You can use the SQL command "BEGIN IMMEDIATE".  That’s quite standard and is no 
problem.  You can also use "sqlite3_db_mutex()".  This is non-standard and you 
need to understand exactly what you’re doing.

But given how good SQLite is at this stuff itself (really really good) you’re 
unlikely to be able to improve on SQLite’s default behaviour.  You can spend 
100 hours programming and end up with something no better than if you’d done 
nothing.  So I strongly recommend you leave stuff alone and let SQLite do it’s 
thing.  But if you really want to mess with it, that’s how.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Jens Alfke

> On Nov 29, 2016, at 9:09 AM, Simon Slavin  wrote:
> 
> You cannot design a system which (A) provides up-to-date data to readers (B) 
> allows writers to get rid of their data immediately without ever locking up 
> and (C) guarantees that earlier changes to the data are ’saved' before later 
> changes, thus preserving uncorrupted data in the case of power-cuts, etc..

In the general case of multiple writers I would agree, but I think Mark said 
he’s only got one writer. (Or at least the problem he described is independent 
of the number of writers.) And the problem he’s describing is not one of 
correctness but of performance, i.e. optimizing file size by checkpointing the 
WAL.

> You might like to read more about Brewer’s Theorem:
> 

The CAP theorem applies to distributed systems; I don’t see how it’s relevant 
here?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Scott Hess
On Tue, Nov 29, 2016 at 10:10 AM, Mark Hamburg  wrote:
> On Nov 29, 2016, at 9:09 AM, Simon Slavin  wrote:
>>> On 29 Nov 2016, at 4:18pm, Mark Hamburg  wrote:
>>>
>>> Does this make sense? Does it seem useful? (It seems useful to me when I 
>>> see multi-megabyte WAL files.)
>>
>> Sorry, but I cannot spare the time right now to analyze the system you laid 
>> out.  It usually takes half an hour to diagram out the read and write 
>> procedures and point out where multiple simultaneous ones don’t fit together.
>>
>> I can tell you that entire books are written about the difficulties of 
>> simultaneous access to a database, and that I’ve read too many of them.  And 
>> that it has been proven many times that there’s no solution.  You cannot 
>> design a system which (A) provides up-to-date data to readers (B) allows 
>> writers to get rid of their data immediately without ever locking up and (C) 
>> guarantees that earlier changes to the data are ’saved' before later 
>> changes, thus preserving uncorrupted data in the case of power-cuts, etc..
>>
>> It is possible to implement a version if you drop one of the requirements.  
>> For example, you can have many simultaneous writers as long as you don’t 
>> need any readers.  Or you can have many readers as long as you have only one 
>> writer and you don’t need readers to be completely up-to-date.
>>
>> You might like to read more about Brewer’s Theorem:
>>
>> 
>>
>> In the meantime, I’m glad that WAL mode seems to be useful for you, if you 
>> can cope with big journal files until all connections are closed, it’s a 
>> good solution.
>
> What I probably haven't accounted for is what it would take to do an 
> atomic/safe swap of the WAL files in my double WAL scenario. I need to give 
> that some more thought.

Don't do swaps.  The issue is that WAL needs everyone to agree to
reset to the front of the WAL file.  The _A and _B files could have a
serial number to differentiate which comes first.  Then when the first
file has no readers blocking, that file can be checkpointed to the
main database.  Then that file becomes available for future use.

-scott
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Mark Hamburg
On Nov 29, 2016, at 9:09 AM, Simon Slavin  wrote:
> 
> 
>> On 29 Nov 2016, at 4:18pm, Mark Hamburg  wrote:
>> 
>> Does this make sense? Does it seem useful? (It seems useful to me when I see 
>> multi-megabyte WAL files.)
> 
> Sorry, but I cannot spare the time right now to analyze the system you laid 
> out.  It usually takes half an hour to diagram out the read and write 
> procedures and point out where multiple simultaneous ones don’t fit together.
> 
> I can tell you that entire books are written about the difficulties of 
> simultaneous access to a database, and that I’ve read too many of them.  And 
> that it has been proven many times that there’s no solution.  You cannot 
> design a system which (A) provides up-to-date data to readers (B) allows 
> writers to get rid of their data immediately without ever locking up and (C) 
> guarantees that earlier changes to the data are ’saved' before later changes, 
> thus preserving uncorrupted data in the case of power-cuts, etc..
> 
> It is possible to implement a version if you drop one of the requirements.  
> For example, you can have many simultaneous writers as long as you don’t need 
> any readers.  Or you can have many readers as long as you have only one 
> writer and you don’t need readers to be completely up-to-date.
> 
> You might like to read more about Brewer’s Theorem:
> 
> 
> 
> In the meantime, I’m glad that WAL mode seems to be useful for you, if you 
> can cope with big journal files until all connections are closed, it’s a good 
> solution.

What I probably haven't accounted for is what it would take to do an 
atomic/safe swap of the WAL files in my double WAL scenario. I need to give 
that some more thought.

In the single WAL scenario, what I probably really want is a way for the 
checkpoint operation on the write connection to do its work then wait for 
exclusive access — standard reader/writer lock pattern — to do the WAL reset. 
This would presumably limit the time that the readers were blocked since I 
would expect the WAL reset to be reasonably quick if the checkpoint copying 
work were already complete. Furthermore, the write operation would only be 
blocked for the length of the longest outstanding read so as long I favor big 
transactions for writes and short queries for reads, the writer shouldn't be 
blocked for too long either. Are there checkpoint settings that achieve this or 
do I need to build that logic into my code?

Mark
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Simon Slavin

On 29 Nov 2016, at 4:18pm, Mark Hamburg  wrote:

> Does this make sense? Does it seem useful? (It seems useful to me when I see 
> multi-megabyte WAL files.)

Sorry, but I cannot spare the time right now to analyze the system you laid 
out.  It usually takes half an hour to diagram out the read and write 
procedures and point out where multiple simultaneous ones don’t fit together.

I can tell you that entire books are written about the difficulties of 
simultaneous access to a database, and that I’ve read too many of them.  And 
that it has been proven many times that there’s no solution.  You cannot design 
a system which (A) provides up-to-date data to readers (B) allows writers to 
get rid of their data immediately without ever locking up and (C) guarantees 
that earlier changes to the data are ’saved' before later changes, thus 
preserving uncorrupted data in the case of power-cuts, etc..

It is possible to implement a version if you drop one of the requirements.  For 
example, you can have many simultaneous writers as long as you don’t need any 
readers.  Or you can have many readers as long as you have only one writer and 
you don’t need readers to be completely up-to-date.

You might like to read more about Brewer’s Theorem:



In the meantime, I’m glad that WAL mode seems to be useful for you, if you can 
cope with big journal files until all connections are closed, it’s a good 
solution.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Mark Hamburg
One other question about WAL mode and simultaneous readers and writers: How are 
people setting their page caches? My read is that shared cache is probably not 
what's wanted. I was setting my reader caches to be bigger than my writer cache 
under the assumption that writers write and then move on whereas readers would 
benefit from having more data cached, but I'm now thinking that the disk cache 
should be getting me the latter effect and increasing the size of the write 
cache should allow the writer to run longer without having to flush from memory 
to disk. Is there any standard advice in this regard or is this the sort of 
question where the answer is "experiment".

Mark

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-29 Thread Mark Hamburg
Once I figured out how to reliably get the reader and writer connections open 
on a database — key point, when creating the database let the writer get 
through all of the creation work before opening any readers — I've been getting 
great concurrency behavior for simultaneous reads and writes in WAL mode.

What's less great is that if you have enough read activity, the checkpoint 
logic may never be able to actually reset the WAL and the WAL can get very 
large. Basically, if there is a read in process that uses the WAL, it can't get 
reset.

This would obviously be a change to the WAL implementation, but I've been 
thinking that it ought to be possible to use essentially two WALs as 
essentially successive appendages to the main database and to swap their roles 
as the earlier WAL gets written into the main database. In other words 
something like the following:

Logical view: DB ++ WAL_A ++ WAL_B

where ++ is essentially the WAL overlay logic. Writes always go into the second 
WAL. Checkpoints copy from the first WAL into the database. When a checkpoint 
is done, if there are no reads with holds on WAL_A, we can swap the roles of 
the WALs, reset WAL_A which is now the second WAL and hence the target for 
writes and start copying WAL_B which is now the first WAL into the database.

The trick comes in what we do when we start a read. If the first WAL has not 
been fully checkpointed, then we need to grab both WALs. (We could ignore the 
second WAL if it is empty, but that's immaterial to this discussion.) If, 
however, the first WAL has been fully checkpointed, then a read need not grab 
it which then leaves us free to do the WAL swap.

Such a scheme won't help with long reads blocking WAL reset, but it seems like 
it would eliminate issues with having a steady stream of small read operations 
blocking WAL reset.

Does this make sense? Does it seem useful? (It seems useful to me when I see 
multi-megabyte WAL files.)

Mark

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-28 Thread Nico Williams
On Mon, Nov 28, 2016 at 07:45:16PM +, Howard Chu wrote:
> Nico Williams wrote:
> >One of the problems with LMDB's locks is that readers need write
> >permission to the lock file, IIRC :( but at least it's a separate file.
> 
> We developed an alternate locking protocol for Postfix. It just uses
> standard fcntl() locks.

Yeah, I'm aware.

> http://www.postfix.org/lmdb_table.5.html
> 
> Of course nothing comes for free - with this approach, writers don't block
> readers, but readers block writers. That's the best you're going to get
> without custom lock protocols like LMDB uses natively.

TANSTAAFL, I know.  A proper COW DB with coalesce-and-rename-into-place
would need no locking of any kind for readers, at the price of needing
more storage and more I/O to do the vacuuming.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-28 Thread Howard Chu

Nico Williams wrote:

On Fri, Nov 25, 2016 at 09:08:37AM +0100, Florian Weimer wrote:

I think you misunderstood what I was asking (see the start of the thread).
I need to support Byzantine readers which do not follow the locking
protocol.  Based on the documentation, LMDB uses locks to implement MVCC and
prevent premature page reuse.



There's a good use case for an LMDB-like COW DB such that readers need
never lock.  That would require vacuuming by writing a new file and
renaming it into place, which is a trade-off.

One of the problems with LMDB's locks is that readers need write
permission to the lock file, IIRC :( but at least it's a separate file.


We developed an alternate locking protocol for Postfix. It just uses standard 
fcntl() locks.


http://www.postfix.org/lmdb_table.5.html

Of course nothing comes for free - with this approach, writers don't block 
readers, but readers block writers. That's the best you're going to get 
without custom lock protocols like LMDB uses natively.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-28 Thread Nico Williams
On Fri, Nov 25, 2016 at 09:08:37AM +0100, Florian Weimer wrote:
> I think you misunderstood what I was asking (see the start of the thread).
> I need to support Byzantine readers which do not follow the locking
> protocol.  Based on the documentation, LMDB uses locks to implement MVCC and
> prevent premature page reuse.

There's a good use case for an LMDB-like COW DB such that readers need
never lock.  That would require vacuuming by writing a new file and
renaming it into place, which is a trade-off.

One of the problems with LMDB's locks is that readers need write
permission to the lock file, IIRC :( but at least it's a separate file.

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-25 Thread Richard Hipp
On 11/24/16, Paul Sanderson  wrote:
> Could you use
>
> PRAGMA data_version
>
> before and after each read to see whether there have been any changes
> to the DB - not surehow this works in WAL mode?

I think that would work as long as the writer never does ROLLBACK.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-25 Thread Richard Hipp
On 11/24/16, Florian Weimer  wrote:
>
> An incorrect answer or SQLITE_CORRUPT is borderline acceptable.  Endless
> loops or crashes would be bad.
>

We do a lot of testing to ensure that SQLite does not crash or get
into endless loops or leak memory when confronted with a corrupt
database file.  (Such testing includes but is not limited to fuzzing
the database file using AFL.)  So you should be ok.  If you do crash,
or loop, or leak memory on a corrupt database file, that is definitely
considered a bug.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-25 Thread Florian Weimer

On 11/25/2016 02:07 AM, Howard Chu wrote:

Florian Weimer wrote:

On 11/24/2016 10:41 PM, Howard Chu wrote:

As a
compromise you could use SQLightning, which replaces SQLite's Btree
layer with LMDB. Since LMDB *does* allow readers that don't block
writers.


How does it do that?  Does LMDB perform lock-free optimistic reads and
retroactively verifies that the entire read operation was consistent?
The web
page currently says that “readers need write access to locks and lock
file”:

  


Readers are lock-free/wait-free. Since LMDB uses MVCC readers get their
own fully isolated snapshot of the DB so no retroactive verification is
needed.


I think you misunderstood what I was asking (see the start of the 
thread).  I need to support Byzantine readers which do not follow the 
locking protocol.  Based on the documentation, LMDB uses locks to 
implement MVCC and prevent premature page reuse.


Thanks,
Florian

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Eduardo Morras
On Thu, 24 Nov 2016 08:54:47 -0500
Richard Hipp  wrote:

> On 11/24/16, Florian Weimer  wrote:
> > I'd like to replace the use of Berkeley DB in RPM with SQLite.
> >
> > The scenario is special in the follow way.  There is no database
> > server, all access goes directly to the database.  Unprivileged
> > users without write access to the RPM database are expected to run
> > read-only queries against the database.  Privileged users
> > (basically, root) is expected to use locking to exclude concurrent
> > writers.  But read-only users should not be able to stop
> > acquisition of a write lock.
> >
> > Is there a way to do this with SQLite?
> 
> The readers can open the database using URI filenames
> (https://www.sqlite.org/uri.html) with query parameters "mode=ro" and
> "locking=0".  That will prevent the readers from blocking the writer.
> But, if a write happens in the middle of a read, the reader might see
> inconsistent data and report SQLITE_CORRUPT.  This is harmless in the
> sense that the database file is not really corrupt (the reader is
> merely seeing parts of the files from two different points in time)
> and subsequent reads should still work.  If you are unlucky, a write
> that happens at the same time as a read might cause the reader to
> return incorrect results, so the reader can never be 100% sure that
> the answer it gets back is correct.

If writer safely forces a schema change, that's no real change to
schema, only an increase  to schema_version pragma; the readers can
check that condition at sqlite3_step.

I don't know how to fire it, and changing directly pragma value is
discouraged.


> 
> How important is it to you that the reader always get a correct
> answer?
> -- 
> D. Richard Hipp
> d...@sqlite.org

---   ---
Eduardo Morras 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Howard Chu

Florian Weimer wrote:

On 11/24/2016 10:41 PM, Howard Chu wrote:

As a
compromise you could use SQLightning, which replaces SQLite's Btree
layer with LMDB. Since LMDB *does* allow readers that don't block
writers.


How does it do that?  Does LMDB perform lock-free optimistic reads and
retroactively verifies that the entire read operation was consistent? The web
page currently says that “readers need write access to locks and lock file”:

  


Readers are lock-free/wait-free. Since LMDB uses MVCC readers get their own 
fully isolated snapshot of the DB so no retroactive verification is needed.


The restriction on opening the database twice within the same process is
something which we would have to work around, too.

I see you are working on eliminating the key size limit, which is nice.


Yeah, should be available in a month or so in 1.0 release candidates.

--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Paul Sanderson
Could you use

PRAGMA data_version

before and after each read to see whether there have been any changes
to the DB - not surehow this works in WAL mode?


Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence


On 24 November 2016 at 21:59, Florian Weimer  wrote:
> On 11/24/2016 10:41 PM, Howard Chu wrote:
>>
>> Florian Weimer wrote:
>>>
>>> On 11/24/2016 02:54 PM, Richard Hipp wrote:

 On 11/24/16, Florian Weimer  wrote:
>
> I'd like to replace the use of Berkeley DB in RPM with SQLite.
>
> The scenario is special in the follow way.  There is no database
> server,
> all access goes directly to the database.  Unprivileged users without
> write access to the RPM database are expected to run read-only queries
> against the database.  Privileged users (basically, root) is
> expected to
> use locking to exclude concurrent writers.  But read-only users should
> not be able to stop acquisition of a write lock.
>
> Is there a way to do this with SQLite?
>>
>>
>> Seems like quite a lot of burden to go this route. SQLite has a much
>> larger footprint than BDB, and much worse performance overall.
>
>
> SQLite has zero footprint for us because central parts of the system use it
> as well, and it is not likely to go away.  We also use the full Transaction
> Data Store, so the Berkeley DB shared object is about twice as large as the
> SQLite object.
>
>> As a
>> compromise you could use SQLightning, which replaces SQLite's Btree
>> layer with LMDB. Since LMDB *does* allow readers that don't block
>> writers.
>
>
> How does it do that?  Does LMDB perform lock-free optimistic reads and
> retroactively verifies that the entire read operation was consistent? The
> web page currently says that “readers need write access to locks and lock
> file”:
>
>   
>
> The restriction on opening the database twice within the same process is
> something which we would have to work around, too.
>
> I see you are working on eliminating the key size limit, which is nice.
>
> Thanks,
> Florian
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Florian Weimer

On 11/24/2016 10:41 PM, Howard Chu wrote:

Florian Weimer wrote:

On 11/24/2016 02:54 PM, Richard Hipp wrote:

On 11/24/16, Florian Weimer  wrote:

I'd like to replace the use of Berkeley DB in RPM with SQLite.

The scenario is special in the follow way.  There is no database
server,
all access goes directly to the database.  Unprivileged users without
write access to the RPM database are expected to run read-only queries
against the database.  Privileged users (basically, root) is
expected to
use locking to exclude concurrent writers.  But read-only users should
not be able to stop acquisition of a write lock.

Is there a way to do this with SQLite?


Seems like quite a lot of burden to go this route. SQLite has a much
larger footprint than BDB, and much worse performance overall.


SQLite has zero footprint for us because central parts of the system use 
it as well, and it is not likely to go away.  We also use the full 
Transaction Data Store, so the Berkeley DB shared object is about twice 
as large as the SQLite object.



As a
compromise you could use SQLightning, which replaces SQLite's Btree
layer with LMDB. Since LMDB *does* allow readers that don't block
writers.


How does it do that?  Does LMDB perform lock-free optimistic reads and 
retroactively verifies that the entire read operation was consistent? 
The web page currently says that “readers need write access to locks and 
lock file”:


  

The restriction on opening the database twice within the same process is 
something which we would have to work around, too.


I see you are working on eliminating the key size limit, which is nice.

Thanks,
Florian

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Howard Chu

Florian Weimer wrote:

On 11/24/2016 02:54 PM, Richard Hipp wrote:

On 11/24/16, Florian Weimer  wrote:

I'd like to replace the use of Berkeley DB in RPM with SQLite.

The scenario is special in the follow way.  There is no database server,
all access goes directly to the database.  Unprivileged users without
write access to the RPM database are expected to run read-only queries
against the database.  Privileged users (basically, root) is expected to
use locking to exclude concurrent writers.  But read-only users should
not be able to stop acquisition of a write lock.

Is there a way to do this with SQLite?


Seems like quite a lot of burden to go this route. SQLite has a much larger 
footprint than BDB, and much worse performance overall. As a compromise you 
could use SQLightning, which replaces SQLite's Btree layer with LMDB. Since 
LMDB *does* allow readers that don't block writers. But it would be more 
efficient to just use LMDB directly, and not incur the overhead of the SQL 
translation layer.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Florian Weimer

On 11/24/2016 02:54 PM, Richard Hipp wrote:

On 11/24/16, Florian Weimer  wrote:

I'd like to replace the use of Berkeley DB in RPM with SQLite.

The scenario is special in the follow way.  There is no database server,
all access goes directly to the database.  Unprivileged users without
write access to the RPM database are expected to run read-only queries
against the database.  Privileged users (basically, root) is expected to
use locking to exclude concurrent writers.  But read-only users should
not be able to stop acquisition of a write lock.

Is there a way to do this with SQLite?


The readers can open the database using URI filenames
(https://www.sqlite.org/uri.html) with query parameters "mode=ro" and
"locking=0".  That will prevent the readers from blocking the writer.


Thanks, this looks promising.

For the writers, can I use an external lock file (accessible only by 
root), and specify “locking=0” as well?  I need to avoid stalling on a 
read lock on the database file because any process which can open the 
file could create such a lock, thus blocking regular SQLite operation.



But, if a write happens in the middle of a read, the reader might see
inconsistent data and report SQLITE_CORRUPT.  This is harmless in the
sense that the database file is not really corrupt (the reader is
merely seeing parts of the files from two different points in time)
and subsequent reads should still work.  If you are unlucky, a write
that happens at the same time as a read might cause the reader to
return incorrect results, so the reader can never be 100% sure that
the answer it gets back is correct.


Could I use PRAGMA data_version to detect the case where there answer 
might be silently incorrect?



How important is it to you that the reader always get a correct answer?


An incorrect answer or SQLITE_CORRUPT is borderline acceptable.  Endless 
loops or crashes would be bad.


I need to run a test how often the existing code gets a wrong answer.  I 
can't imagine that it always gets this right because Berkeley DB does 
not really support this use case, either.  Either readers block writers, 
or there will be wrong data occasionally.


Thanks,
Florian
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Simon Slavin

On 24 Nov 2016, at 1:41pm, Florian Weimer  wrote:

> Item 5 says:
> 
> “
> It is not possible to open read-only WAL databases. The opening process must 
> have write privileges for "-shm" wal-index shared memory file associated with 
> the database, if that file exists, or else write access on the directory 
> containing the database file if the "-shm" file does not exist.
> ”
> 
> So WAL mode does not work in this scenario.

Good point.  However, I believe it works fine, in that as long as Unprivileged 
users without write access don’t actually try to write to the database file the 
system works.  However, you may be correct and you should test it under your 
operating system.

One other thing to try would be to have a read/write client open the database 
before any read-only clients need to open it. That would create the .shm file 
meaning that the Unprivileged users don’t have to.  However, I can understand 
if this solution is not suitable for your situation.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Richard Hipp
On 11/24/16, Florian Weimer  wrote:
> I'd like to replace the use of Berkeley DB in RPM with SQLite.
>
> The scenario is special in the follow way.  There is no database server,
> all access goes directly to the database.  Unprivileged users without
> write access to the RPM database are expected to run read-only queries
> against the database.  Privileged users (basically, root) is expected to
> use locking to exclude concurrent writers.  But read-only users should
> not be able to stop acquisition of a write lock.
>
> Is there a way to do this with SQLite?

The readers can open the database using URI filenames
(https://www.sqlite.org/uri.html) with query parameters "mode=ro" and
"locking=0".  That will prevent the readers from blocking the writer.
But, if a write happens in the middle of a read, the reader might see
inconsistent data and report SQLITE_CORRUPT.  This is harmless in the
sense that the database file is not really corrupt (the reader is
merely seeing parts of the files from two different points in time)
and subsequent reads should still work.  If you are unlucky, a write
that happens at the same time as a read might cause the reader to
return incorrect results, so the reader can never be 100% sure that
the answer it gets back is correct.

How important is it to you that the reader always get a correct answer?
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Florian Weimer

On 11/24/2016 01:10 PM, Simon Slavin wrote:


On 24 Nov 2016, at 11:02am, Florian Weimer  wrote:


The scenario is special in the follow way.  There is no database server, all 
access goes directly to the database.  Unprivileged users without write access 
to the RPM database are expected to run read-only queries against the database. 
 Privileged users (basically, root) is expected to use locking to exclude 
concurrent writers.  But read-only users should not be able to stop acquisition 
of a write lock.

Is there a way to do this with SQLite?


From the above you would want to use WAL mode.  You can read about it here:




Item 5 says:

“
It is not possible to open read-only WAL databases. The opening process 
must have write privileges for "-shm" wal-index shared memory file 
associated with the database, if that file exists, or else write access 
on the directory containing the database file if the "-shm" file does 
not exist.

”

So WAL mode does not work in this scenario.

Florian
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Paul Sanderson
Steps 2 and 3 can be swapped

Also you can convert an existing database from one mode to another
(although not, I suspect (I have not tried), in the middle of a
transaction and if in the DB is wal mode and you are changing to
journal then this would force a checkpoint).
Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence


On 24 November 2016 at 12:10, Simon Slavin  wrote:
>
> On 24 Nov 2016, at 11:02am, Florian Weimer  wrote:
>
>> The scenario is special in the follow way.  There is no database server, all 
>> access goes directly to the database.  Unprivileged users without write 
>> access to the RPM database are expected to run read-only queries against the 
>> database.  Privileged users (basically, root) is expected to use locking to 
>> exclude concurrent writers.  But read-only users should not be able to stop 
>> acquisition of a write lock.
>>
>> Is there a way to do this with SQLite?
>
> From the above you would want to use WAL mode.  You can read about it here:
>
> 
>
> To put a database into WAL mode,
>
> 1) Open the database file.
> 2) If the database file is newly created (i.e. blank) Put at least one schema 
> element into the file (e.g. create a TABLE).
> 3) Issue the command "PRAGMA journal_mode=WAL" in any program.
> 4) Close the database file normally.
>
> You can use the SQLite command-line tool to do the above if you wish, rather 
> than having to write your own software to do it.  Once the database is set to 
> this mode, that information is saved in the database file.  All connections 
> which open it will automatically know it must be handled in WAL mode.
>
>> One way that would work is to copy the database file after each modification 
>> to a read-only public view.  But the database can be fairly large, so this 
>> doesn't look feasible until we have reflink support at the file system level.
>
> You should not have to worry about this level of things yourself.  The SQLite 
> library handles this problem for you.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Read-only access which does not block writers

2016-11-24 Thread Simon Slavin

On 24 Nov 2016, at 11:02am, Florian Weimer  wrote:

> The scenario is special in the follow way.  There is no database server, all 
> access goes directly to the database.  Unprivileged users without write 
> access to the RPM database are expected to run read-only queries against the 
> database.  Privileged users (basically, root) is expected to use locking to 
> exclude concurrent writers.  But read-only users should not be able to stop 
> acquisition of a write lock.
> 
> Is there a way to do this with SQLite?

From the above you would want to use WAL mode.  You can read about it here:



To put a database into WAL mode,

1) Open the database file.
2) If the database file is newly created (i.e. blank) Put at least one schema 
element into the file (e.g. create a TABLE).
3) Issue the command "PRAGMA journal_mode=WAL" in any program.  
4) Close the database file normally.

You can use the SQLite command-line tool to do the above if you wish, rather 
than having to write your own software to do it.  Once the database is set to 
this mode, that information is saved in the database file.  All connections 
which open it will automatically know it must be handled in WAL mode.

> One way that would work is to copy the database file after each modification 
> to a read-only public view.  But the database can be fairly large, so this 
> doesn't look feasible until we have reflink support at the file system level.

You should not have to worry about this level of things yourself.  The SQLite 
library handles this problem for you.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Read-only access which does not block writers

2016-11-24 Thread Florian Weimer

I'd like to replace the use of Berkeley DB in RPM with SQLite.

The scenario is special in the follow way.  There is no database server, 
all access goes directly to the database.  Unprivileged users without 
write access to the RPM database are expected to run read-only queries 
against the database.  Privileged users (basically, root) is expected to 
use locking to exclude concurrent writers.  But read-only users should 
not be able to stop acquisition of a write lock.


Is there a way to do this with SQLite?

One way that would work is to copy the database file after each 
modification to a read-only public view.  But the database can be fairly 
large, so this doesn't look feasible until we have reflink support at 
the file system level.


Thanks,
Florian
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users