Re: [sqlite] Concurrent read performance

2013-01-14 Thread Howard Chu

Wayne Bradney wrote:

Howard, Thanks for the links - mdb/sqlightning looks interesting. Before I
dive in, however:

1. As I understand it, sqlightning is a drop-in replacement
for SQLite.Interop.dll, and I can still use System.Data.SQLite as my
high-level interface?

Right.

2. Is there a compiled release available?

No.

3. How often is sqlightning updated against SQLite? We're at SQLite 3.7.15.2 /
System.Data.SQLite 1.0.84.0 right now.

There is no set schedule, basically as interest and demand dictate.

4. Does sqlightning allow lock-free
reads, even against a SQLite memory-backed, shared cache database?

No. But you can get the same effect simply by putting the MDB database onto a 
RAMdisk.
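The RAMdisk approach can be sketched with Python's stdlib sqlite3 (illustrative only, not MDB or System.Data.SQLite; the /dev/shm path is an assumption that holds on most Linux tmpfs setups, with a plain temp dir as fallback):

```python
import os
import sqlite3
import tempfile

# Assumption: /dev/shm is a RAM-backed tmpfs mount (true on most Linux
# systems); fall back to an ordinary temp dir so the sketch runs anywhere.
ramdir = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
path = os.path.join(ramdir, "ramdisk_demo.db")

con = sqlite3.connect(path)
con.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, v TEXT)")
con.execute("DELETE FROM t")  # keep the demo re-runnable
con.execute("INSERT INTO t (v) VALUES ('hello')")
con.commit()

# A second, fully independent connection (no shared cache) reads the same
# file: each connection keeps a private page cache, so reads take no
# shared-cache table locks, and the "file" I/O never leaves RAM.
con2 = sqlite3.connect(path)
row = con2.execute("SELECT v FROM t LIMIT 1").fetchone()

con.close()
con2.close()
os.remove(path)
```

This gives file-database concurrency semantics with in-memory speed, which is the effect described above.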



You are putting programming effort into making your code fast, and this is 
costing you (or your employer) programmer time.


For any reasonably useful piece of software, every moment of programmer time
invested in proper coding saves eons of user time. Putting programmer effort
into making correct code fast is always The Right Thing to Do. Software that
delivers the correct answer, late, is still wrong.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Concurrent read performance

2013-01-14 Thread Wayne Bradney
Howard, Thanks for the links - mdb/sqlightning looks interesting. Before I dive
in, however:

1. As I understand it, sqlightning is a drop-in replacement for
SQLite.Interop.dll, and I can still use System.Data.SQLite as my high-level
interface?
2. Is there a compiled release available?
3. How often is sqlightning updated against SQLite? We're at SQLite 3.7.15.2 /
System.Data.SQLite 1.0.84.0 right now.
4. Does sqlightning allow lock-free reads, even against a SQLite
memory-backed, shared cache database?
> Date: Sun, 13 Jan 2013 10:25:06 -0800
> From: h...@symas.com
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Concurrent read performance
> 
> Wayne Bradney wrote:
> >>> All access in SQLite is serialized. Apologies if I'm missing something 
> >>> fundamental here, but that's not what I'm seeing with a file-backed 
> >>> database when shared cache is OFF.My test has a single table with 1M 
> >>> rows, and four queries that each yield 100K different rows. I run them 
> >>> two ways: 1. All queries in a loop on the same thread in the same 
> >>> connection.2. Each query in parallel on separate threads, each with its 
> >>> own connection. If all access were serialized, I would expect these two 
> >>> tests to take about the same amount of time overall, wouldn't I?In fact, 
> >>> with a file-backed database and shared cache OFF, the second run takes 
> >>> about 70% less time.With shared cache ON, they're the same. As to your 
> >>> second point, I probably should have made it clear that this isn't an 
> >>> internal project, it's a software product, and we don't control where it 
> >>> runs. I understand what an SSD is and why it's better than a spindle 
> >>> drive, but my question wasn't really meant to solicit suggestions for 
> >>> performance 
> > improvements outside the proposal at hand, which was to retire our 
> > existing home-grown in-memory cache implementation (which is very fast for 
> > concurrent reads, but is extremely limited in how it can be queried), and 
> > replace it with a SQL-capable, relational store and still get roughly the 
> > same performance. Our expectation was that we could achieve this with 
> > SQLite, but were surprised by the apparent lack of read-concurrency, and 
> > wanted to get some input on what our options might be in terms of SQLite 
> > configuration of memory-backed databases.
> >> From: slav...@bigfraud.org
> 
> You should look into MDB, which does no locking for read operations. Reads 
> scale perfectly across arbitrarily many CPUs. More info here
> 
> http://symas.com/mdb/
> 
> and SQLite ported to use MDB as its backend is available here
> 
> https://gitorious.org/mdb/sqlightning
> 
> >> Date: Sat, 12 Jan 2013 17:48:56 +
> >> To: sqlite-users@sqlite.org
> >> Subject: Re: [sqlite] Concurrent read performance
> >>
> >>
> >> On 12 Jan 2013, at 5:38pm, Wayne Bradney <wayne_brad...@yahoo.com> wrote:
> >>
> >>> "mode=memory&cache=shared"
> >>
> >>
> >>> 1. when shared cache is enabled, all reads are serialized, and
> >>
> >> All access in SQLite is serialised.  All transactions require locking the 
> >> entire database.  SQLite is very simple -- 'lite' -- so queries run 
> >> extremely quickly, so you don't normally realise that any locking has 
> >> taken place.
> >>
> >>> 2. there doesn't seem to be any way to have a memory-backed database that 
> >>> can be accessed by multiple connections without using a shared cache,  
> >>> then I guess I MUST use a file-backed database to get concurrent reads, 
> >>> even though I don't need the persistence and don't want to take the I/O 
> >>> hit. Am I making any sense? Anything I'm missing?
> >>
> >> You are putting programming effort into making your code fast, and this is 
> >> costing you (or your employer) programmer time.
> 
> For any reasonably useful piece of software, every moment of programmer time 
> invested in proper coding saves eons of user time. Putting programmer effort 
> into making correct code fast is always The Right Thing to Do. Software that 
> delivers the correct answer, late, is still wrong.
> 
> -- 
>-- Howard Chu
>CTO, Symas Corp.   http://www.symas.com
>Director, Highland Sun http://highlandsun.com/hyc/
>Chief Architect, OpenLDAP  http://www.openldap.org/project/
  


Re: [sqlite] Concurrent read performance

2013-01-13 Thread Howard Chu

Wayne Bradney wrote:

All access in SQLite is serialized. Apologies if I'm missing something 
fundamental here, but that's not what I'm seeing with a file-backed database 
when shared cache is OFF. My test has a single table with 1M rows, and four 
queries that each yield 100K different rows. I run them two ways: 

1. All queries in a loop on the same thread in the same connection.
2. Each query in parallel on separate threads, each with its own connection.

If all access were serialized, I would expect these two tests to take about 
the same amount of time overall, wouldn't I? In fact, with a file-backed 
database and shared cache OFF, the second run takes about 70% less time. With 
shared cache ON, they're the same. As to your second point, I probably should 
have made it clear that this isn't an internal project, it's a software 
product, and we don't control where it runs. I understand what an SSD is and 
why it's better than a spindle drive, but my question wasn't really meant to 
solicit suggestions for performance improvements outside the proposal at hand, 
which was to retire our existing home-grown in-memory cache implementation 
(which is very fast for concurrent reads, but is extremely limited in how it 
can be queried), and replace it with a SQL-capable, relational store and still 
get roughly the same performance. Our expectation was that we could achieve 
this with SQLite, but were surprised by the apparent lack of read-concurrency, 
and wanted to get some input on what our options might be in terms of SQLite 
configuration of memory-backed databases.

> From: slav...@bigfraud.org


You should look into MDB, which does no locking for read operations. Reads 
scale perfectly across arbitrarily many CPUs. More info here


http://symas.com/mdb/

and SQLite ported to use MDB as its backend is available here

https://gitorious.org/mdb/sqlightning


Date: Sat, 12 Jan 2013 17:48:56 +
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Concurrent read performance


On 12 Jan 2013, at 5:38pm, Wayne Bradney <wayne_brad...@yahoo.com> wrote:


"mode=memory&cache=shared"




1. when shared cache is enabled, all reads are serialized, and


All access in SQLite is serialised.  All transactions require locking the 
entire database.  SQLite is very simple -- 'lite' -- so queries run extremely 
quickly, so you don't normally realise that any locking has taken place.


2. there doesn't seem to be any way to have a memory-backed database that can 
be accessed by multiple connections without using a shared cache,  then I guess 
I MUST use a file-backed database to get concurrent reads, even though I don't 
need the persistence and don't want to take the I/O hit. Am I making any sense? 
Anything I'm missing?


You are putting programming effort into making your code fast, and this is 
costing you (or your employer) programmer time.


For any reasonably useful piece of software, every moment of programmer time 
invested in proper coding saves eons of user time. Putting programmer effort 
into making correct code fast is always The Right Thing to Do. Software that 
delivers the correct answer, late, is still wrong.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Wayne Bradney
Keith, I started Experiment 5 soon after the original post, which was to do as 
you suggest and run concurrent queries on a single connection shared across 
multiple threads. For both shared cache databases (file- and memory-backed) it 
didn't seem to make any difference in my test case -- queries still appear to 
be serialized. Would this approach require SQLITE_CONFIG_MULTITHREAD or 
SQLITE_OPEN_NOMUTEX, I wonder?
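Experiment 5 can be sketched with Python's stdlib sqlite3 (illustrative, not the original .NET test). With a default SQLite build ("serialized" threading mode, SQLITE_THREADSAFE=1), each connection is guarded by a single mutex, so statements on one shared connection run one at a time; Python does not expose SQLITE_CONFIG_MULTITHREAD, so that setting cannot change the outcome at this level:

```python
import sqlite3
import threading

# One connection shared across threads. check_same_thread=False only
# disables Python's ownership check; the underlying SQLite connection
# mutex (default "serialized" build) still serializes the queries,
# matching the serialization observed in Experiment 5.
con = sqlite3.connect(":memory:", check_same_thread=False)
con.execute("CREATE TABLE t (n INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
con.commit()

results = []
lock = threading.Lock()

def query():
    # each call gets its own cursor; the shared connection serializes them
    count = con.execute("SELECT count(*) FROM t").fetchone()[0]
    with lock:
        results.append(count)

threads = [threading.Thread(target=query) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
con.close()
```

All four queries complete correctly, but they cannot overlap inside the engine, so this pattern would not have helped the test case.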
 > Date: Sat, 12 Jan 2013 13:39:35 -0700
> From: kmedc...@dessus.com
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Concurrent read performance
> 
> 
> If your application can reasonably interleave "in-engine" operations -vs- 
> out-of-engine operations, then you might want to try multiple threads against 
> a single connection.  In any case, make sure that you are not opening and 
> closing connections.  Open the connections when your application starts, and 
> close them when it is done.  The page cache will have almost no effect if you 
> delete it between each use (think of the page cache as loading the database 
> into RAM -- when you open a connection, the entire database must be copied 
> into RAM -- when you close the connection, you are discarding that copy.  If 
> you then open another connection, you have to load all your data from disk 
> into that connection too.  As you can see, opening and closing connections 
> will incur a significant overhead to repetitively read data which should 
> already be in the cache, were it not for you discarding it from memory.)  As 
> long as you are properly resetting and disposing of your prepared statements 
> and make sure that you do not keep "unnecessary crap" (like forgetting to commit 
> transactions or reset statements), keeping a connection open for 
> minutes/hours/days/years/decades/centuries does not have any ill effect 
> (though it will make your programming errors jump out at you, often with odd 
> behaviour and crashes).
> 
> ---
> ()  ascii ribbon campaign against html e-mail
> /\  www.asciiribbon.org
> 
> 
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> > boun...@sqlite.org] On Behalf Of Wayne Bradney
> > Sent: Saturday, 12 January, 2013 12:36
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] Concurrent read performance
> > 
> > Increasing cache_size didn't seem to have an effect. I think I'm going to
> > lean towards taking the 15% or so hit and use a file-backed db without 
> > shared
> > cache for now -- I'll have to see what the impact of that will be on write
> > performance. If that's OK, then maybe a ramdisk will get back some of that
> > performance. I guess I was hoping that a memory-backed db would simply 
> > behave
> > exactly like a ramdisk in the first place, but the shared cache requirement
> > kills us.
> >  > From: mdblac...@yahoo.com
> > > To: sqlite-users@sqlite.org
> > > Date: Sat, 12 Jan 2013 12:02:25 -0600
> > > Subject: Re: [sqlite] Concurrent read performance
> > >
> > > Also...does increasing cache_size help?
> > > Are you able to use a RAM disk?
> > >
> > >
> > > -Original Message-
> > > From: sqlite-users-boun...@sqlite.org
> > > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Wayne Bradney
> > > Sent: Saturday, January 12, 2013 11:39 AM
> > > To: sqlite-users@sqlite.org
> > > Subject: [sqlite] Concurrent read performance
> > >
> > > I have a requirement for which I'm proposing SQLite as a solution, and
> > would
> > > appreciate some feedback to clarify my thinking. The application is a
> > shared
> > > service (.NET/WCF) that caches relational data in-memory and serves it up
> > > when (read-only) requests come in (via operation contracts) from multiple
> > > clients. Such client requests are high-level, not relational, and could
> > > result in several (potentially many) individual queries to the cache
> > itself.
> > > These individual cache queries are dynamically generated and could be
> > > arbitrarily complex, but are all essentially relational in nature. The
> > > service itself will periodically load data from an external data source,
> > > transform it and update the cache. There's no requirement currently for 
> > > the
> > > cache to ever be persisted - it can be reloaded from the external source 
> > > if
> > > necessary, but performance (especially read performance) is critical. The
> > > amount of data/indexes to cache potentially could be quite large (of the
> > > order of severa

Re: [sqlite] Concurrent read performance

2013-01-12 Thread Keith Medcalf

If your application can reasonably interleave "in-engine" operations -vs- 
out-of-engine operations, then you might want to try multiple threads against a 
single connection.  In any case, make sure that you are not opening and closing 
connections.  Open the connections when your application starts, and close them 
when it is done.  The page cache will have almost no effect if you delete it 
between each use (think of the page cache as loading the database into RAM -- 
when you open a connection, the entire database must be copied into RAM -- when 
you close the connection, you are discarding that copy.  If you then open 
another connection, you have to load all your data from disk into that 
connection too.  As you can see, opening and closing connections will incur a 
significant overhead to repetitively read data which should already be in the 
cache, were it not for you discarding it from memory.)  As long as you are 
properly resetting and disposing of your prepared statements and make sure 
that you do not keep "unnecessary crap" (like forgetting to commit 
transactions or reset statements), keeping a connection open for 
minutes/hours/days/years/decades/centuries does not have any ill effect (though 
it will make your programming errors jump out at you, often with odd behaviour 
and crashes).

---
()  ascii ribbon campaign against html e-mail
/\  www.asciiribbon.org


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Wayne Bradney
> Sent: Saturday, 12 January, 2013 12:36
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Concurrent read performance
> 
> Increasing cache_size didn't seem to have an effect. I think I'm going to
> lean towards taking the 15% or so hit and use a file-backed db without shared
> cache for now -- I'll have to see what the impact of that will be on write
> performance. If that's OK, then maybe a ramdisk will get back some of that
> performance. I guess I was hoping that a memory-backed db would simply behave
> exactly like a ramdisk in the first place, but the shared cache requirement
> kills us.
>  > From: mdblac...@yahoo.com
> > To: sqlite-users@sqlite.org
> > Date: Sat, 12 Jan 2013 12:02:25 -0600
> > Subject: Re: [sqlite] Concurrent read performance
> >
> > Also...does increasing cache_size help?
> > Are you able to use a RAM disk?
> >
> >
> > -Original Message-
> > From: sqlite-users-boun...@sqlite.org
> > [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Wayne Bradney
> > Sent: Saturday, January 12, 2013 11:39 AM
> > To: sqlite-users@sqlite.org
> > Subject: [sqlite] Concurrent read performance
> >
> > I have a requirement for which I'm proposing SQLite as a solution, and
> would
> > appreciate some feedback to clarify my thinking. The application is a
> shared
> > service (.NET/WCF) that caches relational data in-memory and serves it up
> > when (read-only) requests come in (via operation contracts) from multiple
> > clients. Such client requests are high-level, not relational, and could
> > result in several (potentially many) individual queries to the cache
> itself.
> > These individual cache queries are dynamically generated and could be
> > arbitrarily complex, but are all essentially relational in nature. The
> > service itself will periodically load data from an external data source,
> > transform it and update the cache. There's no requirement currently for the
> > cache to ever be persisted - it can be reloaded from the external source if
> > necessary, but performance (especially read performance) is critical. The
> > amount of data/indexes to cache potentially could be quite large (of the
> > order of several gigabytes, let's
> >  say). I've already worked up an initial implementation that uses an in-memory
> > SQLite database via System.Data.SQLite (1.0.82.0). The service maintains a
> > "mode=memory&cache=shared" database, and each SQL query happens on its own
> > connection, and in its own thread. Some observations of our read
> performance
> > (when the cache is fully populated and there are no writers): [Experiment
> 1:
> > Memory-backed, single query]
> > For simple client requests that only result in a single (albeit complex)
> SQL
> > query to the database, performance is excellent, and I'm very happy to get
> > the maintenance benefits of using a flexible query language against the
> > cache. [Experiment 2: Memory-backed, concurrent queries]
> > For any client request that results in multiple simultaneous SQL queries to
> > the database, those queries seem to be serialized rather than concurrent,
> > an

Re: [sqlite] Concurrent read performance

2013-01-12 Thread Michael Black
What about using 2 or more databases?

Wayne Bradney <wayne_brad...@yahoo.com> wrote:

>>>All access in SQLite is serialized. Apologies if I'm missing something 
>>>fundamental here, but that's not what I'm seeing with a file-backed database 
>>>when shared cache is OFF.My test has a single table with 1M rows, and four 
>>>queries that each yield 100K different rows. I run them two ways: 1. All 
>>>queries in a loop on the same thread in the same connection.2. Each query in 
>>>parallel on separate threads, each with its own connection. If all access 
>>>were serialized, I would expect these two tests to take about the same 
>>>amount of time overall, wouldn't I?In fact, with a file-backed database and 
>>>shared cache OFF, the second run takes about 70% less time.With shared cache 
>>>ON, they're the same. As to your second point, I probably should have made 
>>>it clear that this isn't an internal project, it's a software product, and 
>>>we don't control where it runs. I understand what an SSD is and why it's 
>>>better than a spindle drive, but my question wasn't really meant to solicit 
>>>suggestions for performance improvements outside the proposal at hand, which was to retire our existing 
> home-grown in-memory cache implementation (which is very fast for concurrent 
> reads, but is extremely limited in how it can be queried), and replace it 
> with a SQL-capable, relational store and still get roughly the same 
> performance. Our expectation was that we could achieve this with SQLite, but 
> were surprised by the apparent lack of read-concurrency, and wanted to get 
> some input on what our options might be in terms of SQLite configuration of 
> memory-backed databases.
>> From: slav...@bigfraud.org
>> Date: Sat, 12 Jan 2013 17:48:56 +
>> To: sqlite-users@sqlite.org
>> Subject: Re: [sqlite] Concurrent read performance
>> 
>> 
>> On 12 Jan 2013, at 5:38pm, Wayne Bradney <wayne_brad...@yahoo.com> wrote:
>> 
>> > "mode=memory&cache=shared"
>> 
>> 
>> > 1. when shared cache is enabled, all reads are serialized, and
>> 
>> All access in SQLite is serialised.  All transactions require locking the 
>> entire database.  SQLite is very simple -- 'lite' -- so queries run 
>> extremely quickly, so you don't normally realise that any locking has taken 
>> place.
>> 
>> > 2. there doesn't seem to be any way to have a memory-backed database that 
>> > can be accessed by multiple connections without using a shared cache,  
>> > then I guess I MUST use a file-backed database to get concurrent reads, 
>> > even though I don't need the persistence and don't want to take the I/O 
>> > hit. Am I making any sense? Anything I'm missing? 
>> 
>> You are putting programming effort into making your code fast, and this is 
>> costing you (or your employer) programmer time.  Have you tried doing this 
>> using an SSD instead of a spinning disk ?  A great deal of the time taken 
>> for on-disk SQLite is waiting for the disk to spin to the right place.  With 
>> an SSD all this time vanishes and access is nearly as fast as for in-memory 
>> databases.  The advantage is that you don't spend your time on clever 
>> optimal programming or have to do any of the things required for 
>> 'mode=memory'.  In fact it works very quickly without any special modes or 
>> PRAGMAs at all.  Though I don't know your setup in detail and it may not be 
>> of such a great advantage to you.
>> 
>> Simon.
> 


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Wayne Bradney
Increasing cache_size didn't seem to have an effect. I think I'm going to lean 
towards taking the 15% or so hit and use a file-backed db without shared cache 
for now -- I'll have to see what the impact of that will be on write 
performance. If that's OK, then maybe a ramdisk will get back some of that 
performance. I guess I was hoping that a memory-backed db would simply behave 
exactly like a ramdisk in the first place, but the shared cache requirement 
kills us.
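The constraint described above is easy to see with Python's stdlib sqlite3 (illustrative; "memdb1" is an arbitrary database name): a named in-memory database is only reachable from a second connection via cache=shared, the very mode whose table locks serialize readers.

```python
import sqlite3

# The only way two connections can see one in-memory database is a *named*
# in-memory database opened with cache=shared -- i.e. shared-cache mode.
uri = "file:memdb1?mode=memory&cache=shared"
writer = sqlite3.connect(uri, uri=True)
reader = sqlite3.connect(uri, uri=True)  # same shared cache, same data

writer.execute("CREATE TABLE t (v TEXT)")
writer.execute("INSERT INTO t VALUES ('shared')")
writer.commit()

# The second connection reads data the first one wrote -- something a plain
# ":memory:" database can never do, since each connection gets its own copy.
row = reader.execute("SELECT v FROM t").fetchone()

writer.close()
reader.close()
```

Drop cache=shared and each connection gets a private, empty database; keep it and all readers contend on the shared cache's locks.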
 > From: mdblac...@yahoo.com
> To: sqlite-users@sqlite.org
> Date: Sat, 12 Jan 2013 12:02:25 -0600
> Subject: Re: [sqlite] Concurrent read performance
> 
> Also...does increasing cache_size help?
> Are you able to use a RAM disk?
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Wayne Bradney
> Sent: Saturday, January 12, 2013 11:39 AM
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Concurrent read performance
> 
> I have a requirement for which I'm proposing SQLite as a solution, and would
> appreciate some feedback to clarify my thinking. The application is a shared
> service (.NET/WCF) that caches relational data in-memory and serves it up
> when (read-only) requests come in (via operation contracts) from multiple
> clients. Such client requests are high-level, not relational, and could
> result in several (potentially many) individual queries to the cache itself.
> These individual cache queries are dynamically generated and could be
> arbitrarily complex, but are all essentially relational in nature. The
> service itself will periodically load data from an external data source,
> transform it and update the cache. There's no requirement currently for the
> cache to ever be persisted - it can be reloaded from the external source if
> necessary, but performance (especially read performance) is critical. The
> amount of data/indexes to cache potentially could be quite large (of the
> order of several gigabytes, let's 
>  say). I've already worked up an initial implementation that uses an in-memory
> SQLite database via System.Data.SQLite (1.0.82.0). The service maintains a
> "mode=memory&cache=shared" database, and each SQL query happens on its own
> connection, and in its own thread. Some observations of our read performance
> (when the cache is fully populated and there are no writers): [Experiment 1:
> Memory-backed, single query]
> For simple client requests that only result in a single (albeit complex) SQL
> query to the database, performance is excellent, and I'm very happy to get
> the maintenance benefits of using a flexible query language against the
> cache. [Experiment 2: Memory-backed, concurrent queries]
> For any client request that results in multiple simultaneous SQL queries to
> the database, those queries seem to be serialized rather than concurrent,
> and the whole request actually performs much worse than the old
> home-grown-but-horribly-limited caching/querying mechanism that was in place
> beforehand, and I'm sad. [Experiment 3: File-backed, concurrent queries,
> with shared cache]
> I switched to a file-backed database (but still "cache=shared") and it
> appears that the queries are still being serialized, and is overall about
> 15% slower than Experiment 2. [Experiment 4: File-backed, concurrent
> queries, without shared cache]
> I switched to a file-backed database without a shared cache, and performance
> improved dramatically (about 70% faster than Experiment 3). It appears that
> the queries are now truly happening concurrently. So it appears that, since:
> 1. when shared cache is enabled, all reads are serialized, and
> 2. there doesn't seem to be any way to have a memory-backed database that
> can be accessed by multiple connections without using a shared cache,  then
> I guess I MUST use a file-backed database to get concurrent reads, even
> though I don't need the persistence and don't want to take the I/O hit. Am I
> making any sense? Anything I'm missing?
> 
  
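The four experiments above can be sketched with Python's stdlib sqlite3 (sizes scaled down; names illustrative). Note that CPython's GIL means this sketch demonstrates the locking behaviour, not the wall-clock speedup measured in the original .NET test:

```python
import os
import sqlite3
import tempfile
import threading

# File-backed database, no shared cache (Experiment 4's configuration).
fd, path = tempfile.mkstemp(suffix=".db")
os.close(fd)
con = sqlite3.connect(path)
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, grp INTEGER)")
con.executemany("INSERT INTO t (grp) VALUES (?)",
                [(i % 4,) for i in range(100000)])
con.commit()
con.close()

queries = ["SELECT count(*) FROM t WHERE grp = %d" % g for g in range(4)]

def run_serial():
    # Experiment 1 style: all queries in a loop on one connection.
    c = sqlite3.connect(path)
    counts = [c.execute(q).fetchone()[0] for q in queries]
    c.close()
    return counts

def run_concurrent():
    # Experiment 4 style: each query on its own thread with its own
    # private connection, so no shared-cache lock serializes the reads.
    counts = [None] * 4
    def worker(i):
        c = sqlite3.connect(path)
        counts[i] = c.execute(queries[i]).fetchone()[0]
        c.close()
    ts = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counts

serial_counts = run_serial()
concurrent_counts = run_concurrent()
os.remove(path)
```

Both runs return identical counts; the difference Wayne measured is purely in how much the four reads can overlap.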


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Wayne Bradney
>>All access in SQLite is serialized. Apologies if I'm missing something 
>>fundamental here, but that's not what I'm seeing with a file-backed database 
>>when shared cache is OFF.My test has a single table with 1M rows, and four 
>>queries that each yield 100K different rows. I run them two ways: 1. All 
>>queries in a loop on the same thread in the same connection.2. Each query in 
>>parallel on separate threads, each with its own connection. If all access 
>>were serialized, I would expect these two tests to take about the same amount 
>>of time overall, wouldn't I?In fact, with a file-backed database and shared 
>>cache OFF, the second run takes about 70% less time.With shared cache ON, 
>>they're the same. As to your second point, I probably should have made it 
>>clear that this isn't an internal project, it's a software product, and we 
>>don't control where it runs. I understand what an SSD is and why it's better 
>>than a spindle drive, but my question wasn't really meant to solicit 
>>suggestions for performance 
improvements outside the proposal at hand, which was to retire our existing 
home-grown in-memory cache implementation (which is very fast for concurrent 
reads, but is extremely limited in how it can be queried), and replace it with 
a SQL-capable, relational store and still get roughly the same performance. Our 
expectation was that we could achieve this with SQLite, but were surprised by 
the apparent lack of read-concurrency, and wanted to get some input on what our 
options might be in terms of SQLite configuration of memory-backed databases.
> From: slav...@bigfraud.org
> Date: Sat, 12 Jan 2013 17:48:56 +
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Concurrent read performance
> 
> 
> On 12 Jan 2013, at 5:38pm, Wayne Bradney <wayne_brad...@yahoo.com> wrote:
> 
> > "mode=memory&cache=shared"
> 
> 
> > 1. when shared cache is enabled, all reads are serialized, and
> 
> All access in SQLite is serialised.  All transactions require locking the 
> entire database.  SQLite is very simple -- 'lite' -- so queries run extremely 
> quickly, so you don't normally realise that any locking has taken place.
> 
> > 2. there doesn't seem to be any way to have a memory-backed database that 
> > can be accessed by multiple connections without using a shared cache,  then 
> > I guess I MUST use a file-backed database to get concurrent reads, even 
> > though I don't need the persistence and don't want to take the I/O hit. Am 
> > I making any sense? Anything I'm missing? 
> 
> You are putting programming effort into making your code fast, and this is 
> costing you (or your employer) programmer time.  Have you tried doing this 
> using an SSD instead of a spinning disk ?  A great deal of the time taken for 
> on-disk SQLite is waiting for the disk to spin to the right place.  With an 
> SSD all this time vanishes and access is nearly as fast as for in-memory 
> databases.  The advantage is that you don't spend your time on clever optimal 
> programming or have to do any of the things required for 'mode=memory'.  In 
> fact it works very quickly without any special modes or PRAGMAs at all.  
> Though I don't know your setup in detail and it may not be of such a great 
> advantage to you.
> 
> Simon.
  


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Wayne Bradney
read_uncommitted didn't seem to have any effect.
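For what it's worth, the PRAGMA quoted below is observable in a small sketch (Python stdlib sqlite3; "rudemo" is an arbitrary name). read_uncommitted relaxes reader-versus-writer blocking inside a shared cache; it does not remove the serialization of concurrent readers, which would explain it having no effect on read throughput here:

```python
import sqlite3

# Shared-cache in-memory database: read_uncommitted only has meaning
# between connections that share a cache.
uri = "file:rudemo?mode=memory&cache=shared"
writer = sqlite3.connect(uri, uri=True, isolation_level=None)
reader = sqlite3.connect(uri, uri=True)
reader.execute("PRAGMA read_uncommitted = 1")

writer.execute("CREATE TABLE t (v TEXT)")
writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES ('dirty')")  # still uncommitted

# Without read_uncommitted the reader would be blocked by the writer's
# shared-cache table lock; with it, the reader proceeds and can even see
# the uncommitted row. Reader/reader serialization is unchanged.
row = reader.execute("SELECT v FROM t").fetchone()

writer.execute("ROLLBACK")
writer.close()
reader.close()
```

So the PRAGMA helps read-while-write scenarios, not the concurrent-read test in this thread.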
 > From: mdblac...@yahoo.com
> To: sqlite-users@sqlite.org
> Date: Sat, 12 Jan 2013 11:47:55 -0600
> Subject: Re: [sqlite] Concurrent read performance
> 
> Did you try read-uncommitted?
> Sounds promising...
> 
> 2.2.1
> http://www.sqlite.org/sharedcache.html
> 
> PRAGMA read_uncommitted = <boolean>;
> 
> 
  


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Clemens Ladisch
Wayne Bradney wrote:
> 1. when shared cache is enabled, all reads are serialized,

Yes.

> I guess I MUST use a file-backed database to get concurrent reads,
> even though I don't need the persistence and don't want to take the
> I/O hit.

If the in-memory database works, you do have enough memory for the data,
so you could rely on the OS file cache to keep the data in memory.

If you don't trust the OS, you could put the DB on a RAM disk (like
ImDisk).


Regards,
Clemens


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Michael Black
Also...does increasing cache_size help?
Are you able to use a RAM disk?




Re: [sqlite] Concurrent read performance

2013-01-12 Thread Simon Slavin

On 12 Jan 2013, at 5:38pm, Wayne Bradney  wrote:

> "mode=memory&cache=shared"


> 1. when shared cache is enabled, all reads are serialized, and

All access in SQLite is serialised.  All transactions require locking the 
entire database.  SQLite is very simple -- 'lite' -- so queries run extremely 
quickly, and you don't normally realise that any locking has taken place.

> 2. there doesn't seem to be any way to have a memory-backed database that can 
> be accessed by multiple connections without using a shared cache,  then I 
> guess I MUST use a file-backed database to get concurrent reads, even though 
> I don't need the persistence and don't want to take the I/O hit. Am I making 
> any sense? Anything I'm missing? 

You are putting programming effort into making your code fast, and this is 
costing you (or your employer) programmer time.  Have you tried doing this 
using an SSD instead of a spinning disk ?  A great deal of the time taken for 
on-disk SQLite is waiting for the disk to spin to the right place.  With an SSD 
all this time vanishes and access is nearly as fast as for in-memory databases. 
 The advantage is that you don't spend your time on clever optimal programming 
or have to do any of the things required for 'mode=memory'.  In fact it works 
very quickly without any special modes or PRAGMAs at all.  Though I don't know 
your setup in detail and it may not be of such a great advantage to you.

Simon.


Re: [sqlite] Concurrent read performance

2013-01-12 Thread Michael Black
Did you try read-uncommitted?
Sounds promising...

2.2.1
http://www.sqlite.org/sharedcache.html

PRAGMA read_uncommitted = <boolean>;
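
A minimal sketch of that pragma in action, using Python's sqlite3 module rather than System.Data.SQLite (the shared-cache database name "cachedb2" is illustrative; read_uncommitted only has an effect in shared-cache mode):

```python
import sqlite3

URI = "file:cachedb2?mode=memory&cache=shared"

writer = sqlite3.connect(URI, uri=True)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("INSERT INTO t VALUES (1)")
writer.commit()

reader = sqlite3.connect(URI, uri=True)
# In shared-cache mode this lets the connection read without taking
# (or honoring) table-level read locks.
reader.execute("PRAGMA read_uncommitted = true")
rows = reader.execute("SELECT x FROM t").fetchall()
print(rows)  # [(1,)]
```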




[sqlite] Concurrent read performance

2013-01-12 Thread Wayne Bradney
I have a requirement for which I'm proposing SQLite as a solution, and would 
appreciate some feedback to clarify my thinking. The application is a shared 
service (.NET/WCF) that caches relational data in-memory and serves it up when 
(read-only) requests come in (via operation contracts) from multiple clients. 
Such client requests are high-level, not relational, and could result in 
several (potentially many) individual queries to the cache itself. These 
individual cache queries are dynamically generated and could be arbitrarily 
complex, but are all essentially relational in nature. The service itself will 
periodically load data from an external data source, transform it and update 
the cache. There's no requirement currently for the cache ever to be persisted 
- it can be reloaded from the external source if necessary, but performance 
(especially read performance) is critical. The amount of data/indexes to cache 
could potentially be quite large (of the order of several gigabytes, let's say).

I've already worked up an initial implementation that uses an in-memory 
SQLite database via System.Data.SQLite (1.0.82.0). The service maintains a 
"mode=memory&cache=shared" database, and each SQL query happens on its own 
connection, and in its own thread. Some observations of our read performance 
(when the cache is fully populated and there are no writers):

[Experiment 1: Memory-backed, single query]
For simple client requests that only result in a single (albeit complex) SQL 
query to the database, performance is excellent, and I'm very happy to get the 
maintenance benefits of using a flexible query language against the cache.

[Experiment 2: Memory-backed, concurrent queries]
For any client request that results in multiple simultaneous SQL queries to the 
database, those queries seem to be serialized rather than concurrent, and the 
whole request actually performs much worse than the old 
home-grown-but-horribly-limited caching/querying mechanism that was in place 
beforehand, and I'm sad.

[Experiment 3: File-backed, concurrent queries, with shared cache]
I switched to a file-backed database (but still "cache=shared"), and it appears 
that the queries are still being serialized; overall it is about 15% slower 
than Experiment 2.

[Experiment 4: File-backed, concurrent queries, without shared cache]
I switched to a file-backed database without a shared cache, and performance 
improved dramatically (about 70% faster than Experiment 3). It appears that the 
queries are now truly happening concurrently.

So it appears that, since:
1. when shared cache is enabled, all reads are serialized, and
2. there doesn't seem to be any way to have a memory-backed database that can 
be accessed by multiple connections without using a shared cache,
then I guess I MUST use a file-backed database to get concurrent reads, even 
though I don't need the persistence and don't want to take the I/O hit. Am I 
making any sense? Anything I'm missing?
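
The setup described above -- one shared-cache in-memory database, with each 
query on its own connection -- can be sketched with Python's sqlite3 module 
standing in for System.Data.SQLite (the database name "cachedb" and the schema 
are illustrative):

```python
import sqlite3

# A named, shared-cache, in-memory database: every connection opening this
# URI in the same process attaches to the same database.
URI = "file:cachedb?mode=memory&cache=shared"

# Keep one "anchor" connection open so the in-memory DB outlives the others.
anchor = sqlite3.connect(URI, uri=True)
anchor.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
anchor.execute("INSERT INTO items (name) VALUES ('widget')")
anchor.commit()

# A second, independent connection sees the same data.
reader = sqlite3.connect(URI, uri=True)
rows = reader.execute("SELECT name FROM items").fetchall()
print(rows)  # [('widget',)]
```

Note that all such connections share a single page cache and coordinate 
through table-level locks, which is consistent with the serialization the 
poster observed in Experiment 2.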